On the complexity of assimilation in urban communities

Abstract
Cities are microcosms representing a diversity of human experience. The complexity of urban systems arises from this diversity, where the services that cities offer to their inhabitants have to be tailored to their unique requirements. This paper studies the complexity of urban environments in terms of the assimilation of their communities. We examine urban assimilation complexity with respect to the foreignness between communities and formalize the level of complexity using information-theoretic measures. Our findings contribute to a sociological perspective on the relationship between urban complex systems and the diversity of communities that make up urban systems.

Introduction
Cities have long represented the hopes and aspirations of people. Migration from rural towns and villages to urban centers has been a centuries-long human endeavor, with the earliest records documented in historical and prehistorical studies of migration (Gugliotta 2008; Hollingsworth 1970). Migration picked up pace in the Middle Ages and later, during the Industrial Revolution, as more and more people flocked to urban areas in search of new modes of employment (Williamson 2002). The largest spike in migration happened during the twentieth century (Lall et al. 2006), with newer and more efficient modes of transportation, greater diversity of employment and increased opportunities for socialization. The trend continues today, with a recent UN report predicting that 68% of the world's population will live in cities by the year 2050 (UN DESA 2018). The task of providing infrastructure and services for the billions of people residing, working and seeking leisure in cities is gargantuan. Cities are faced with a constant influx of people arriving from various geographical and cultural backgrounds, with varying levels of affinity and similarity to the native and existing populations. For example, New York City is home to a little more than 8 million people concentrated in a space of roughly three hundred square miles. The city varies widely with regard to the distribution of income, education, ethnicities and language proficiencies of its people (Karpati et al. 2004; Shmool et al. 2014). The presence of distinct enclaves of communities such

of high-density urban centers in Asia called mega-conurbation, such as development in the Yangtze Delta region. Further work on understanding the city not as a top-down system, but instead as a bottom-up system characterized by a constant state of non-equilibrium is presented in Batty and Marshall (2012).
Literature in urban complexity has offered various theories to analyze specific aspects of complexity. In Ortman et al. (2020), the authors employ a framework called Settlement Scaling Theory, which considers cities as social networks embedded in physical space. This theory takes as its foundational construct the interactions between people that occur during the exchange of goods, services and information; mediated by the physical proximity found in cities, these interactions create social networks. Work in Edelenbos et al. (2018) leverages Actor Network Theory to study cities as complex, adaptive, self-organizing systems. By characterizing a city as a structure of zones, work in Batty et al. (2014) studies the entropy and information-theoretic complexity of cities.
In contrast to approaches that analyze complexity through the design of networks, our work in this paper studies the complexity of urban societies in terms of the assimilation of immigrant communities with the native community. We explore this complexity by examining the notion of foreignness of a community. Drawing on existing literature that seeks to identify the factors driving assimilation, we define foreignness as the divergence of the immigrant community from the native community along the lines of education, income and language proficiency. Immigrant communities pose important questions with regard to assimilation. Specifically, in this paper, we investigate the factors that contribute to the foreignness of an immigrant community. Further, we examine whether some factors are more important than others for assimilation, and study their impact on the rate of assimilation of communities. Although our study examines assimilation for immigrant and native communities, the findings of our work can be extended to online communities, where foreignness can be measured in terms of the themes, ideologies or actions that bind together virtual communities.
The rest of this paper is organized as follows. Section 2 describes related work in the areas of assimilation and complexity. In Sect. 3, we introduce our model and derive measures of information-theoretic complexity. Section 4 presents the findings of our simulation, and Sect. 5 offers a discussion of the implications of our findings, and presents the limitations of our model. Section 6 concludes our paper and presents directions for future work.

Related work
This section presents a brief overview of existing work in the rich field of assimilation as found in sociological literature. Park (1950) offers some of the earliest work in this field, studying the process of assimilation in terms of race relations as a sequence of stages: contact, competition, accommodation and eventual assimilation. In Gordon (1964), the author studied various forms of assimilation including acculturation, where the minority group adopts the culture of the majority group, and structural assimilation, which describes the development of affinity between minority and majority groups. Spatial assimilation theory (SAT), proposed in Massey and Denton (1985), studies assimilation between communities in terms of the geographical proximity between ethnic groups and majority groups. Factors involved in assimilation include generational distance from immigrant ancestors, level of education, income and language. This classical assimilation model built on the spatial assimilation theory was modified in South et al. (2005), where the authors investigated why the findings of the traditional SAT model could not fully explain how certain communities assimilated more completely than others. In Burgers and Lugt (2006), the authors study motives for suburbanization through a study of the integration of the Surinamese in Dutch communities. The work offers a place stratification model that complements the SAT model and explains assimilation barriers for certain groups.
Assimilation of immigrant communities has been further explored in terms of specific aspects of the assimilation experience, such as the effect of a partner's nationality on the residential location of immigrants (Ellis et al. 2006), the location choices of migrant nest-leavers (Zorlu and Mulder 2010) and poverty among the foreign-born population (Jargowsky 2009). Unlike related papers that focused on spatial patterns of assimilation of first-generation immigrants, additional work in Ellis and Wright (2006) focused on household types of first- and second-generation immigrants. This work found that while the second generation tends to move away from the first generation, the third generation stays closer to the second generation, resulting in less geographic dispersion.
These patterns indicate that assimilation occurs over time, with certain predisposing factors that cause some communities to assimilate more than others. However, assimilation is not uniform (Pamuk 2004; Myles and Hou 2004). The presence of Chinatowns and Little Italys in cities around the world denotes the ability of immigrant communities to maintain ties to aspects of their immigrant culture. By letting immigrants selectively choose the customs and traditions they adhere to, such enduring legacies of immigrant culture offer immigrants the option to assimilate and isolate simultaneously. The next section describes our model for studying the information-theoretic complexity of assimilation in urban communities.

Model
We consider two immigrant communities $M_1$ and $M_2$ and their assimilation in a native community $N$. All of these communities are situated in a broader community $S$ for reference. Figure 1 shows the nature of assimilation between the immigrant communities $M_j$ and the native community $N$. Figure 1a shows some assimilation between the communities, as evidenced by the overlap. Figure 1b denotes the two immigrant communities assimilating with each other, but not with the native community, whereas Fig. 1c shows the three communities with no assimilation between them. Finally, Fig. 1d shows assimilation of a single immigrant community with the native one, while the other immigrant community remains unassimilated. This aversion to assimilation is a form of resistance that can be exhibited by both immigrant and native communities. The immigrant communities $M_j$ and the native community may manifest resistance toward the assimilation effort. For simplicity, consider the upper ($R = \infty$) and lower ($R = 0$) bounds of this resistance. Figure 2 shows the bidirectional resistance between an immigrant community and the native community in terms of the upper and lower bounds.
We investigate the impact of three attributes, education ($e$), income ($i$) and language ($l$), of each of these three communities on the assimilation process (Massey and Denton 1985). Specifically, we study the relationship between an immigrant community $M_j$, $j \in \{1, 2\}$, and the native community $N$ in terms of the difference in education, income and language.

Education and income
Communities $M_j$ and $N$ each have a median educational attainment, $e$, which is lower than, equal to or greater than that of the larger community $S$. We assign values to these levels as −1 (lower than $S$), 0 (equal to $S$) or 1 (greater than $S$). Thus, the absolute difference between the educational attainment of communities $M_j$ and $N$ can be 0 (equal educational attainment), 1 (moderate difference in educational attainment) or 2 (high difference in educational attainment). Any difference other than zero denotes a divergence from the educational attainment norms of the broader community $S$ and is a factor impacting the foreignness of the immigrant community. Similarly, we assign values to the median income levels of the communities $M_j$ and $N$. Again, the absolute difference between the median income levels, $i$, of communities $M_j$ and $N$ can be 0 (equal median income level), 1 (moderate difference in income) or 2 (high difference in income).

Language
Next, we assign a binary value to the language proficiency, $l$, of community $M_j$. Community $M_j$ can have similar language proficiency (0) or lower language proficiency (−1) than the native community $N$. Thus, the absolute difference between the language proficiency of community $M_j$ and that of $N$ is 0 or 1.

Foreignness
We develop a metric of foreignness, $\mu$, that denotes the foreignness or the distance between communities $M_j$ and $N$. The observable construct of foreignness replaces the theoretical construct of resistance ($R$) introduced earlier in the section.
$$\mu = \alpha e + \beta i + \gamma l \tag{1}$$
The values $\alpha$, $\beta$ and $\gamma$ in Eq. (1) denote the importance of the attributes of education, income and language. We denote the tuple $(\alpha, \beta, \gamma)$ as the weights of the attribute tuple $(e, i, l)$. The values $\alpha$, $\beta$ and $\gamma$ are chosen such that $\alpha + \beta + \gamma = 1$.
From Eq. (1), we see that as $\mu \to 0$, the level of foreignness disappears, causing the immigrant communities $M_j$ to assimilate completely with the native community $N$. Conversely, a non-zero value of $\mu$ denotes a foreignness that impacts the assimilation of the migrant community $M_j$ with the native community $N$. The upper bound of foreignness is reached when the communities $M_j$ and $N$ differ the most in the attributes ($e = 2$, $i = 2$, $l = 1$), with the highest weights assigned through the tuple $(\alpha, \beta, \gamma)$, where $\alpha + \beta + \gamma = 1$. Conversely, the lower bound of foreignness is zero, with equal $e$, $i$, $l$ values and zero weight assigned through the tuple $(\alpha, \beta, \gamma)$. Thus, the lower and upper bounds of the theoretical construct of the resistance between communities $M_j$ and $N$, $R = 0$ and $R = \infty$, are replaced by the observable variable of foreignness with lower and upper bounds given by $\mu \in [0, 2]$.
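The coding of attributes and the foreignness metric of Eq. (1) can be illustrated with a short Python sketch. The community values below are hypothetical, the function names are ours, and the requirement that weights be non-negative is an assumption not stated in the text.

```python
def attribute_differences(immigrant, native):
    """Absolute differences (e, i, l) between two coded communities.

    Each community is a tuple (education, income, language), where education
    and income are coded -1, 0 or 1 relative to the broader community S and
    language is coded -1 or 0, as described in the text.
    """
    return tuple(abs(m - n) for m, n in zip(immigrant, native))


def foreignness(diffs, weights):
    """Foreignness mu = alpha*e + beta*i + gamma*l for weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights (alpha, beta, gamma) must sum to 1")
    if any(w < 0 for w in weights):
        # Assumption: weights are non-negative (not stated explicitly).
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, diffs))


# Hypothetical coded communities (education, income, language).
native = (1, 1, 0)
immigrant = (-1, 0, -1)

diffs = attribute_differences(immigrant, native)        # e=2, i=1, l=1
mu_equal = foreignness(diffs, (1 / 3, 1 / 3, 1 / 3))    # equally weighted
mu_upper = foreignness((2, 2, 1), (0.5, 0.5, 0.0))      # largest differences
mu_lower = foreignness((0, 0, 0), (1 / 3, 1 / 3, 1 / 3))  # identical attributes
```

Under these assumptions, the largest attainable value is 2, reached when all the weight falls on education and income, which matches the upper bound quoted for $\mu$.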
We model the difference in the attributes $e$, $i$, $l$ between each immigrant community $M_j$ and the native community $N$ as a normal distribution with the mean given by the foreignness and a variable standard deviation. Thus, the assimilation between community $M_j$ and $N$ is denoted as
$$X_j \sim \mathcal{N}(\mu_j, \sigma_j^2) \tag{2}$$
where $\mu_j = \alpha_j e_j + \beta_j i_j + \gamma_j l_j$. Represented thus, we can find the relative entropy (Cover and Thomas 2001), also known as the Kullback-Leibler divergence, between the assimilation distributions of the communities $X_j \to N$. The relative entropy or the KL-divergence between two distributions $P$ and $Q$ with densities $p(x)$ and $q(x)$ is given by
$$D_{KL}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx \tag{3}$$
The relative entropy between the assimilation distributions is a measure of the distance in the assimilation levels between the two communities represented by $M_j$, and represents the impact of the tuples of attributes $(e, i, l)$ and their respective weights $(\alpha, \beta, \gamma)$.
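For two normal distributions, the relative entropy has a well-known closed form, sketched below in Python. The parameter values are hypothetical and are not taken from the paper's figures or tables; the example only illustrates how the divergence shrinks as a common standard deviation grows.

```python
import math


def kl_normal(mu1, sigma1, mu2, sigma2):
    """KL divergence D(N(mu1, sigma1^2) || N(mu2, sigma2^2)) in nats."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2)
            - 0.5)


# Hypothetical foreignness means for two immigrant communities.
mu_1, mu_2 = 1.0, 0.5

# With a common standard deviation, the divergence reduces to
# (mu_1 - mu_2)^2 / (2 sigma^2) and therefore decreases as sigma grows.
kl_by_sigma = [kl_normal(mu_1, s, mu_2, s) for s in (0.5, 1.0, 2.0)]
```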
Finally, we observe the complexity of assimilation $C$ for a community $M$. Complexity has been defined in Lopez-Ruiz et al. (2010) as the product of the entropy and the disequilibrium, given by
$$C = H \cdot D \tag{4}$$
where $H$ is the entropy and $D$ the disequilibrium of the probability distribution $p(x)$ of the random variable $X$. Substituting Eq. (3) in Eq. (4), we obtain the assimilation complexity $C$ (Eq. 5), which is expressed in terms of the Gaussian error function $\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} \, dt$.
Netw Sci (2021) 6:57
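As a rough illustration of how an entropy-times-disequilibrium complexity can be evaluated for a normal density, the Python sketch below integrates numerically over an interval $[a, b]$. It assumes the continuous forms $H = -\int p \log p \, dx$ and $D = \int p^2 \, dx$, which may differ from the exact expressions behind Eq. (5); the interval and the parameter values are hypothetical. The closed form of $D$ on a finite interval shows where the error function enters.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def lmc_complexity_numeric(mu, sigma, a, b, n=20000):
    """Midpoint-rule estimates of H, D and C = H * D on the interval [a, b]."""
    h = (b - a) / n
    entropy = 0.0
    diseq = 0.0
    for k in range(n):
        p = normal_pdf(a + (k + 0.5) * h, mu, sigma)
        if p > 0.0:
            entropy -= p * math.log(p) * h
        diseq += p * p * h
    return entropy, diseq, entropy * diseq


def disequilibrium_closed_form(mu, sigma, a, b):
    """Integral of p(x)^2 over [a, b] for a normal density, written with erf."""
    scale = 1.0 / (4.0 * sigma * math.sqrt(math.pi))
    return scale * (math.erf((b - mu) / sigma) - math.erf((a - mu) / sigma))


def entropy_closed_form(sigma):
    """Differential entropy (nats) of a normal density on the whole real line."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


# Hypothetical foreignness mean and spread; the wide interval approximates
# integration over the whole real line.
mu, sigma = 1.0, 0.8
a, b = mu - 10.0 * sigma, mu + 10.0 * sigma

H_num, D_num, C_num = lmc_complexity_numeric(mu, sigma, a, b)
D_closed = disequilibrium_closed_form(mu, sigma, a, b)
C_closed = entropy_closed_form(sigma) * D_closed
```

Narrowing $[a, b]$ leaves the closed form for $D$ valid, but the entropy closed form used here holds only for the whole real line, so a truncated range would need the numerical estimate for $H$.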
Next, we model the Renyi divergence of the assimilation between the immigrant communities $X_j$ and the native community $Y$. The Renyi divergence, introduced by Renyi (1961), measures the distance between distributions. The Renyi divergence of order $\alpha$ between two distributions $P$ and $Q$ is denoted as
$$D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \int p(x)^{\alpha} \, q(x)^{1 - \alpha} \, dx \tag{6}$$
In the limit $\alpha \to 1$, the Renyi divergence, $D_1(P \,\|\, Q)$, is the same as the relative entropy, or the KL-divergence. For $\alpha = 2$, the Renyi divergence, $D_2(P \,\|\, Q)$, is the logarithm of the expected value of the ratio of the probability distributions. The analog counterpart of the Renyi divergence between two normal distributions $(\mu_i, \sigma_i)$ and $(\mu_j, \sigma_j)$ has been presented in Gil et al. (2013), and takes the following form:
$$D_\alpha(f_i \,\|\, f_j) = \ln \frac{\sigma_j}{\sigma_i} + \frac{1}{2(\alpha - 1)} \ln \frac{\sigma_j^2}{(\sigma^2)^*_\alpha} + \frac{1}{2} \frac{\alpha (\mu_i - \mu_j)^2}{(\sigma^2)^*_\alpha} \tag{7}$$
where $(\sigma^2)^*_\alpha = \alpha \sigma_j^2 + (1 - \alpha) \sigma_i^2$. The next section presents our findings for the relative entropy (KL-divergence), the complexity and the Renyi divergence of assimilation for our model.

Results
The values of the variables denoting the assimilation propensity of the migrant community $M_j$, $j \in \{1, 2\}$, with the native community $N$ are investigated through the following ranges: $e \in \{0, 1, 2\}$, $i \in \{0, 1, 2\}$, $l \in \{0, 1\}$.
Figure 2 shows the relationship between two immigrant communities $M_1$ and $M_2$ and the native community $Y$ in terms of the relative entropy, also known as the Kullback-Leibler divergence (Eq. 3). Each immigrant community possesses attributes of education, income and language that are indicative of its level of assimilation with the native community. Each of these attributes is ascribed a weight, denoted by the tuple $(\alpha, \beta, \gamma)$, for education, income and language respectively in the community. The attribute weights for communities $M_1$ and $M_2$ for Fig. 2 are shown in Table 1. Community $M_2$ has equally weighted attributes for education, income and language proficiency.
From Fig. 2, we see that as the standard deviation $\sigma$ increases, the relative entropy of assimilation of the two communities decreases. This is because, with an increase in the standard deviation, the communities are more spread out and thus the distance between the communities decreases. Further, we see that the relative entropy is greatest with random weights in the tuple $(\alpha, \beta, \gamma)$ for education, income and language respectively. This follows from the fact that randomness increases the entropy, which holds true for relative entropy as well.
Figure 3 shows the complexity of assimilation of an immigrant community with the native community, as denoted by Eq. (4). For this figure, we focus on the assimilation complexity of a single migrant community $M_1$, since the findings can be generalized to other communities. We see that with an increase in the standard deviation, the assimilation complexity increases, since the individual factors impacting the assimilation are more spread out. The assimilation complexity is lowest when the language proficiency impacts the foreignness the most, since the language proficiency can assume only one of two values, high or low. On the other hand, income and education can each assume one of three values, leading to a greater impact on the assimilation complexity. When the tuple of weights $(\alpha, \beta, \gamma)$ is distributed equally, the impact on the assimilation complexity is rendered solely by the randomness of the education, income and language proficiency levels. As with Fig. 2, we see that randomness results in the highest complexity of assimilation. Overall, high education and high income had a higher impact on the complexity of assimilation than the level of language proficiency.
Figure 4 shows the complexity as a function of the foreignness (Eq. 5) of the immigrant community. The level of foreignness is measured as the distance of the immigrant community from the native community along the lines of language proficiency, education and income. The higher the divergence of the immigrant and native communities along these lines, the greater the foreignness. As expected, an increase in the foreignness increases the complexity of assimilation of the immigrant community with the native community.
Figure 5 plots the Renyi divergence (Eq. 7) against the standard deviation, $\sigma_2$, of the immigrant community $X_2$ for varying values of the foreignness $(\mu_1, \mu_2)$ of the communities $X_1$, $X_2$. In Fig. 5a, the means of the foreignness distributions for both communities, $\mu_1$ and $\mu_2$, are equal to 0. We see that, as the value of $\sigma_1$ approaches $\sigma_2$, the Renyi divergence moves closer to zero. The Renyi divergence is equal to zero at the point where $\sigma_1 = \sigma_2 = 1$. The highest magnitude of the divergence is seen when $\sigma_1 = 5$ and $\sigma_2$ is closer to 0. These trends change when the means of the foreignness distributions, $\mu_1$ and $\mu_2$, are not both equal to zero (Fig. 5b-d). We see that for ($\mu_1 = 0$, $\mu_2 = 1$) in Fig. 5b, the Renyi divergence is greatest for $\sigma_1 = 0.5$, $\sigma_2 \to 0$, and it progressively reduces to zero for increasing values of $\sigma_2$. However, the tendency of the $\sigma_1 = 5$ line to stay on the opposite side of the X-axis compared to the $\sigma_1 = 0.5$ line remains the same as in Fig. 5a. Also, for $\sigma_1 = \sigma_2 = 1$, the Renyi divergence is 0. This shows that, irrespective of the values of the mean of foreignness, the Renyi divergence tends to 0 if $\sigma_1 = \sigma_2 = 1$. While Fig. 5b showed the results of the Renyi divergence when $\mu_1 > \mu_2$, Fig. 5c presents the results when $\mu_1 < \mu_2$. The overall trends from Fig. 5a, b continue, with the $\sigma_1 = 0.5$ and $\sigma_1 = 5$ lines remaining on opposite sides of the X-axis. However, for $\sigma_1 = 0.5$, we see that the magnitude of the Renyi divergence increases significantly (almost doubling in the initial spike) when $\sigma_2 \to 0$ compared to Fig. 5a, b. Figure 5d continues the patterns seen in Figs. 5a-c. The Renyi divergence still exhibits its highest value (doubling the spike from Fig. 5c) when $\sigma_1 = 0.5$ and $\sigma_2 = 0.5$.

Fig. 3 Complexity of assimilation
The Renyi divergence results from Fig. 5 show that the Renyi divergence of the distributions describing the communities decreases to zero as the standard deviations $\sigma_i$ and $\sigma_j$ increase. This points to a tendency for the immigrant communities to assimilate increasingly with the native community as the standard deviation increases, eventually turning into a single homogeneous native community. The findings are in line with work in Salvati and Carlucci (2020) that shows rural areas as having low complexity and diversification.
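The closed form of the Renyi divergence between two normal distributions reported by Gil et al. (2013) can be evaluated directly, as in the Python sketch below. The order $\alpha = 0.5$ and all parameter values are hypothetical choices for illustration; the order used for Fig. 5 is not stated, so this sketch does not attempt to reproduce those curves.

```python
import math


def renyi_normal(mu_i, sigma_i, mu_j, sigma_j, alpha):
    """Renyi divergence of order alpha between N(mu_i, sigma_i^2) and
    N(mu_j, sigma_j^2), using the closed form for normal densities."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    if sigma_i <= 0 or sigma_j <= 0:
        raise ValueError("standard deviations must be positive")
    var_star = alpha * sigma_j ** 2 + (1.0 - alpha) * sigma_i ** 2
    if var_star <= 0:
        # The closed form is only defined when this combined variance is positive.
        raise ValueError("divergence undefined for these parameters")
    return (math.log(sigma_j / sigma_i)
            + math.log(sigma_j ** 2 / var_star) / (2.0 * (alpha - 1.0))
            + alpha * (mu_i - mu_j) ** 2 / (2.0 * var_star))


def kl_normal(mu_i, sigma_i, mu_j, sigma_j):
    """KL divergence between the same two normals, for the alpha -> 1 check."""
    return (math.log(sigma_j / sigma_i)
            + (sigma_i ** 2 + (mu_i - mu_j) ** 2) / (2.0 * sigma_j ** 2)
            - 0.5)


# Equal means, sigma_1 fixed at 1, sweeping sigma_2 (hypothetical values).
alpha = 0.5
sweep = [renyi_normal(0.0, 1.0, 0.0, s2, alpha) for s2 in (0.5, 1.0, 2.0)]

# As alpha approaches 1, the Renyi divergence approaches the KL divergence.
near_one = renyi_normal(0.0, 1.0, 1.0, 2.0, 1.000001)
kl_value = kl_normal(0.0, 1.0, 1.0, 2.0)
```

For orders between 0 and 1 the combined variance is always positive; for orders above 1 some parameter combinations are rejected by the guard.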

Discussion and limitations
The study of assimilation between communities is a historical one, and has been fraught with socio-economic, cultural and political connotations. The work in this paper studied the complexity of urban assimilation along the lines of differences in education, income and language. Our findings have implications for understanding assimilation in different kinds of communities:
1. Homogeneous communities: Communities with little or no outsider influence, of the kind represented by geographically or culturally isolated communities, differ significantly from a potential immigrant community with respect to education, income and language. These kinds of communities are exemplified by rural communities, suburban and urban enclaves with a predominant demographic population, and even online communities that are built around thematic notions. Consequently, there is greater resistance to assimilation with such communities, and they therefore exhibit higher complexity of assimilation.
2. Heterogeneous communities: Urban communities are characterized by the "melting pot" feature, which lets communities that differ widely in their education, income and language coexist and assimilate more than in homogeneous communities. The complexity of assimilation in such communities is lower than that in homogeneous communities, since heterogeneous communities exhibit less resistance to assimilation both from and toward individuals from other communities. This explains the existence of cultural mainstays in large cities such as Chinatown, Little Italy, Koreatown and Little India communities. Figure 6 shows various kinds of assimilation scenarios. Figure 6a depicts the immigrant communities as subsets of each other, whereas Fig. 6b depicts complete assimilation of the immigrant communities with the native community. In each case, assimilation creates unique circumstances for developing policies and services that address the needs of the communities.
3. Impact of language, education and income: Of the three features that we studied, language acquisition exhibited the lowest complexity of assimilation. This could be attributed to the binary coding of language acquisition in our study: communities either had language skills or did not. In our overly simplified model, which considers only the three parameters of education, income and language and treats them as mutually independent, language acquisition showed a faster pathway to assimilation than education or income. Consequently, the acquisition of the language skills of the native community could help reduce the foreignness of an immigrant community relative to the native community. The impact of language acquisition is amplified when seeking assimilation with homogeneous communities, and mitigated when seeking assimilation with heterogeneous communities that have diverse language proficiencies.
4. Relationships between immigrant communities: Our study used the KL-divergence to examine how two immigrant communities that differ from each other with respect to education, income and language fare in their efforts to assimilate with the native community. Our findings showed that a higher deviation in each of these communities signaled a lower value of the KL-divergence, indicating that the distance between the communities is lower. This follows from the fact that a higher deviation with respect to education, income and language means that the communities are more spread out. The foreignness of these two immigrant communities relative to each other is then lower, leading to two different communities that find common ground in their assimilation efforts with the native community.
5. Design of smart cities: The findings of urban complexity have implications for designing cities of the future (Ekman 2018). For example, work in Fernández-Güell et al. (2016) describes the use of urban complexity and diversity in envisioning smart cities that understand the urban environment holistically. Additional work on using complexity theory in policy planning initiatives for urban environments is seen in Innes and Booher (2000).

Fig. 6 Assimilation of communities a subsets of the native community, b complete assimilation

Our study addresses urban assimilation complexity in a system of two immigrant communities and a single native community. The limitations of our study include the following:
1. Macro factors: The work in this paper examined the complexity of assimilation in urban communities with respect to education, income and language, all of which have been coded as mutually independent variables that take on discrete values in a limited range. These factors, identified in previous SAT literature, describe prominent factors in the assimilation process. Assimilation, however, is a complex phenomenon that is dependent on broader factors derived from socio-economic and political influences. The impact of these macro factors on the assimilation process requires complex models to understand the evolution of urban environments.
2. Biases: In addition to the above-mentioned macro factors, individual and community-wide biases may influence the assimilation efforts of immigrant and native communities. These biases have been studied in terms of perceptions of immigrants and the policies and frameworks enacted by native communities for adjudication in immigration courts (Marouf 2010), anti-immigration bias (Wagner et al. 2010) and media bias (McKeever et al. 2012).
3. Coding of attributes: The work in this paper codes differences in education, income and language in discrete levels. For example, language differences were coded as a binary variable. In practice, these attribute differences lie on a spectrum. Further, the assumption of the normal distribution for education, income and language might be refined by using different distributions for each of these attributes. An understanding of the impact of these factors on assimilation complexity will benefit from enhanced models that reflect the diversity of communities in urban environments.

Conclusions
We addressed urban complex systems in terms of the widespread phenomenon of assimilation that is prevalent in urban communities. This paper presents a novel theoretical study of the general problem of assimilation between immigrant communities and native communities using information-theoretic complexity measures. The arrival of immigrants and the formation of immigrant communities alter the urban landscape in several domains. The consequences of assimilation impact policy and decision-making for offering services, and also affect the culture and community structure of the urban environment. However, assimilation is not uniform. While certain immigrant communities assimilate faster, others experience and exhibit resistance to assimilation. Using information-theoretic measures of complexity, we showed that assimilation is impacted differently by education, income and language. An information-theoretic view of the complexity of assimilation in urban environments provides multiple avenues for further research, such as the role of macro and micro factors that impact assimilation. We envision several application problems in urban community structures that can benefit from information-theoretic measures, such as the capacity of channels of communication in immigrant and native communities, and the development of mechanisms to address the information asymmetries present in complex urban environments.