
The complex network patterns of human migration at different geographical scales: network science meets regression analysis

Abstract

Migration’s influence in shaping population dynamics in times of impending climate and population crises exposes its crucial role in upholding societal cohesion. As migration impacts virtually all aspects of life, it continues to require attention across scientific disciplines. This study aims to bridge the gap between theoretical understanding and practical application by integrating network analysis and regression methodologies within Migration Studies. We employ network analysis to elucidate migration patterns at various geographical scales: city, country, and global. Additionally, regression analysis is discussed on an exploratory level, where we focus on the underlying factors driving migration and on identifying the key independent variables to enhance predictive accuracy. The study exposes distinct migration network structures and their features, and the consequences these have for conventional regression analysis applications. We conclude on the importance of methodological coherence and disciplinary integration, and highlight avenues for enhancing the predictive power of migration models.

Introduction

It may seem that the topic of human migration came to the forefront of political, media, public, or scientific discourse only recently, with the severe displacement of Syrians in 2015 and the ongoing “migration crisis”, exacerbated by the war in Ukraine and the new refugee flows. Yet migration is an inherent aspect of human existence that always has been, and always will be, of consequence for societal well-being. Whether it is internal migration within a country or international migration between countries, the relocation of people, coupled with natural population changes through births and deaths, shapes our societies at various scales. The changes brought about by migration can have a severe impact on the societal order that should be maintained to ensure a harmonious and stable life for individuals, and on overall societal stability.

There are strong predictions that the “migration crisis” is to be exacerbated by the “climate crisis” that works in parallel, and rapidly intensifies (Clement et al. 2021), so the social cohesion issues might be expected to become even more mainstream. Moreover, the forthcoming “population crisis”, which is a result of two thirds of all people living in countries where the average birth rates are lower than the replacement rate, brightens the spotlight for migration as an important part of a solution (United Nations Population Fund 2023).

Being such a consequential matter, migration has become the topic of inquiry across an increasing number of scientific disciplines, whose output (studies published as books, research papers, policy reports, and increasingly online platforms) is so vast that it has called for the development of a migration research database, the so-called “Migration Research Hub”, available at https://migrationresearch.com/. Established under the auspices of the European Union, the database has the purpose of aggregating, maintaining and sorting the global migration research output in one place, so that policymakers and researchers can keep track of the evidence on migration as a basis for their evidence-informed policies and research. While initially perceived to fall solely under the realm of Social Sciences, “Migration Studies” have evolved into a new interdisciplinary scientific field that encompasses a wide range of disciplines and methodologies (Pisarevskaya 2020).

In this study, we focus on two identified distinct substrates of Migration Studies. The first, the so-called “why” substrate, essentially deals with the identification of factors (often termed as “drivers”) of human migration, taking migration flows or migrant stocks as dependent variables, and quantifying the influence of independent variables on the generation of these flows. The independent variables for the models are pulled out from various baskets: economic (such as income and wages, GDP per capita, (un)employment rates, costs of living or moving), demographic (such as age, gender, population size, density or growth, household or marital status, education levels), geographic (such as physical distance, proximity or contiguity), cultural (such as language proximity, colonial relationships, religion), political (such as political freedoms or migration policies’ restrictiveness), and various other (such as the increasingly important environmental factors of floods and droughts, temperature and precipitation, natural disasters, various sorts of inequalities, levels of poverty, violence, etc.). All in all, a vast number of wider contextual, as well as individual factors, get evaluated for their (strength of) effect in driving or deterring migration across the studied geographies, with the primary evidencing method of these influences being regression modelling, often promoted as being the most reliable due to its reliance on quantitative data and rigorous inferential methodology.

The second, “how” substrate focuses on how migration takes place, that is, on migration flows, migration patterns, or ultimately, networks of human migration flows. This substrate ties to the emerging field of Network Science, which increasingly penetrates into Migration Studies (as it does into virtually all domains). Network science (NS) deploys social network analysis tools (indicators, algorithms, models, and visualizations) to depict human migration flow patterns abstracted as complex networks from, again, quantitative data. Through its analytics and the accompanying visuals, NS tries to deliver a comprehensive view of migration network behaviour. This allows social demographers, economists, and other experts (of the “why” substrate) to relatively easily identify the potential reasons underlying the discovered network patterns (Schon 2021; Pitoski et al. 2021d). Hence, this reduces complexity when it comes to selecting potential variables to deploy in, e.g., regression-based explorations. However, although the number of Network Science studies on human migration appears to be rising quickly in terms of percentage growth, their number (hence, the number of investigated territories) is still very small, and the analyses performed do not contain an epilogue, in the sense of providing more substantial explanations of the factors that underlie the identified network appearances. The network analyses essentially stop at answering the question of what network patterns look like, and the two substrates remain detached from each other.

In the sequel, we present real-world network analyses of human migration, spanning geographical scales from city to country to global level. For the city level, we provide a new empirical network analysis of inter-district migration in Vienna, Austria. At the country level, we provide a brief review of two recent NS-based studies on internal migration: one focusing on Austria (Pitoski et al. 2021b) and the other on Croatia (Pitoski et al. 2021c). For the global level, we provide another new empirical network analysis of the migration exchange of Croatia with the rest of the world. We compare migration network features identified at the three different levels and show how some pertinent network traits affect the deployment of regression models. Apart from offering insight into migration network patterns in as yet uninvestigated territories, we provide a thorough overview of up-to-date migration research output from the two addressed strands of research. We expose the gaps present in each individual strand, as well as the gaps that one reveals about the other. Our ultimate contribution is a set of substantiated propositions on how to deploy both fields simultaneously so as to achieve methodological soundness, “making the most” of both, and aiding future research.

In the next section, we present the related work and the status quo in both research domains, along with their individual gaps. In Section “Network patterns of migration” we examine migration-network patterns in the three aforementioned geographies. We thoroughly describe these networks’ features and compare them with regard to the different geographical spans observed, with emphasis on the geographic validity of findings about their behaviour, or network consistency across these different levels. In Section “Discussion” we discuss the consequences of our findings, outlining the potential problems when the two strands of research, and their methods in particular, do not converge. We close the same section by summarizing the main points and offering our suggestions for improvement.

Related work

Migration factors studies

The factors driving human migration are examined across a large number of studies attached to various migration-theoretical frameworks, such as the push-and-pull theory (Lee 1966), theory of migration systems (Mabogunje 1970), migration or mobility transition frameworks (Zelinsky 1971; Skeldon 1990), neo-classical migration theory (Harris and Todaro 1970), dual labour-market theory (Piore 1979), new economics of labour migration (Stark 1978, 1991), theory of cumulative causation (Massey 1990), or the recent aspirations-capabilities framework (de Haas 2021).

Throughout the studies, inquiring about migration factors generally involves a variety of methodological approaches. Surveys and questionnaires are widely used to collect data on migration aspirations and experiences, and these methods are the most reliable, as the persons interviewed provide their exact motivations for migrating. However, surveys are not convenient to pursue on a large scale, which is crucial for inferring on the phenomenon as a whole and subsequently deploying prediction models. Statistical methods, including regression modelling, structural equation modelling, time-series analysis, etc., applied to whatever real migration data are available, help identify general trends and patterns, and causal relationships. Consistency in results varies, with certain factors such as economic opportunities, social networks, and political stability being widely recognized across different studies. Nevertheless, Migration Studies, although claimed to be “coming of age” (Pisarevskaya 2020), still build on migration theory which is at an “impasse” (de Haas 2021), in that much of migration research relies on simplistic push-pull models or neo-classical income maximization assumptions, which fail to explain real-world migration patterns. The studies predominantly employ regression models to identify the determinants of migration flows, primarily because of their robustness in quantifying the relationship between multiple variables and their ability to control for various confounding factors.

At the “congestion peak” in Migration Studies, where both the developed theories and the range of factors traced as probable determinants of migration have been accumulating to the point where systematization is desperately needed (see the points raised in Section “Introduction”), two valuable systematic reviews emerged. Aslany et al. (2021) summarize the determinants of migration aspirations utilized as the dependent variable in survey-based literature (49 studies using data from 1990 onward), and Pitoski et al. (2021a) summarize the determinants of realized migration as the dependent variable used in (predominantly) regression-model-based literature (163 studies using data from 1990 onward). The types of models used, the operationalization of migration as the dependent variable, and the migration factors used as predictors vary across the latter, but the multitude of definitions and means of measurement for migration and factors have been comprehensively outlined (Pitoski et al. 2021a). The studies were not limited to reviewing factors of international migration with countries as a whole as origins and destinations, but also include internal migration factors and their influence when it comes to migration between human settlements within countries (cities, towns and villages). Comprehensive visualizations that provide insights on the established relevance of particular migration aspirations are available as figures in Aslany et al. (2021), and the interactive Migration Drivers Map on the factors of realized migration working at different geographies, developed as part of Pitoski et al. (2021a), is available at https://tabsoft.co/3VzqXux.

The aggregated findings of these works can be summarized as follows. The main determinants of migration aspirations that work as push factors from the origins [Footnote 1] are found to be: violence and insecurity, previous migratory experience or belongingness to migrant communities (the so-called “migrant networks”, sometimes also “migration networks”, not to be mistaken for the graph-theoretical concept of migration networks discussed and analysed in this article [Footnote 2]), young age, male gender, urban residency, higher educational attainment, and larger household size. The determinants of migration aspirations that have a negative influence, i.e. that keep migrants at origins, are found to be: the sense of well-being, employment, social attachment and participation, positive societal change over time, socio-economic status, being married, or owning a home.

As regards the determinants of actually realized migration, the main push factors from the origins are found to be: educational attainment, unemployment, population size, previous migratory experience, male gender, and young age. The main pull factors toward destinations were found to be: belongingness to migrant communities or having families abroad, income levels and economic development (most often represented by GDP), population size, and social protection expenditures. In addition, the ubiquitously proven deterring factor of the origin–destination migration exchange, is geographical distance, while country/locational contiguity and language proximity are established as the spurring factors for the origin–destination migration exchange. The latter deterring and spurring (i.e., intervening) factors for the origin–destination migration exchange can be regarded as factors working at the link level in network-scientific terms.

Migration factors may be fitted under a few general categories if migrants are questioned at destinations about their reasons for migrating. The basic categories are arguably (i) migration for work, (ii) migration for education, (iii) migration for family reunion, and (iv) migration for seeking shelter from any kind of endangerment at the origin (e.g. asylum seekers and refugees). In addition, the reasons for migration can include highly individualistic ones, such as the individual’s quest for new experience, or the attraction to the amenities of a specific location. Data on individuals’ reasons for actually migrating (the aforementioned basic categories) are gathered by some governmental institutions, and solely for international migration (to the Netherlands, for example; see Central Bureau of Statistics of the Netherlands 2018). Yet such data, and thus a clearer view of the balance between these factors, are not available for more geographies. For the Netherlands, it has been shown that the reasons are predominantly family reunion (around 50% of international migrants), followed by work or education (each category pertaining to about 20% of international migrants), with the rest (10%) pertaining to asylum seekers, for the period of 1999–2010 (Schmeets 2019). Such efforts to collect the individual reasons for migration at destinations would complete the view on the relevance of factors that scientific research has brought forward on migration aspirations and the determinants of realized migration.

Ultimately, what Migration Studies have managed to establish to date is some approximation of the migration factor effects working on two scales, the international (inter-country) migration level and the internal (intra-country) level, basing these approximations predominantly on realized migration, and predominantly on regression modelling. It is important to address here the fact that international migration has received much more attention in migration factors research, although its size in terms of the share of migrants in the global population has been relatively small consistently over decades, at around 3.5% (or 272 million persons, estimated in 2019). At the same time, internal migration is about three times larger (763 million persons, estimated in 2013; McKenzie 2022). Moreover, data on the exact settlement-to-settlement relocations (i.e. specific city/town/village to specific city/town/village) are available only at the internal-migration level, while for international migration scientists are still able to work only with estimated country-level migration stocks (United Nations Population Division 2020), prompting the search for more accurate micro-level data sources for international migration (Tjaden 2021). This lack of data impacts the reliability of regression models when international migration between countries is taken as the dependent variable, with the aforementioned factors deployed as predictors, control or instrumental variables. This is, however, just one of the problems where regression analysis is concerned. The other issues relate to the particular traits of migration networks that we identify through several real-case network analyses in the paper. These subsequently expose the potential methodological flaws in regression applications.
At this point we continue with the review of network science in the migration domain, followed by the analyses, and we return to the methodological issues in greater detail in Section “Discussion”.

Migration networks studies

A systematic literature review of studies that deployed the NS approach on human migration has recently been performed by Pitoski et al. (2021d). The study identifies 22 such works published by the beginning of 2019. The network abstraction utilized across these works is one where nodes are locations (i.e. countries, counties, cities, municipalities), and link weights (or simply link existence) are determined by migration flows between locations or, more commonly, by stocks of people holding the citizenship of a particular country and residing in other countries. The most frequently used indicators and algorithms for the analysis of migration networks are shown to be: (weighted) degree and other variations of node (location) centrality, transitivity including node and network clustering coefficients, network assortativity, network modularity, and community detection algorithms based on modularity optimization. The same review also identifies the network-geographical levels that have been researched, where the predominant geography is the World (global international migration), then European countries (intra-EU international migration), followed by the United States, China, the UK and Mexico (countries’ internal migration). Since the beginning of 2019, at which time the coverage of the aforementioned review ends, a relatively large number of new network analyses has emerged, which demonstrates an increased involvement of Network Science in Migration Studies.

Continuing with the same approach as Pitoski et al. (2021d) in terms of browsing bibliographic databases to include new works in this summary (browsing ending on 30 May 2023), we have identified the following: Bonaccorsi (2019), Aleskerov et al. (2020), Porat and Benguigui (2021), and Akbari (2021) analysed the global international migration network; Windzio et al. (2019) analysed the international migration network of the countries of the European Union; Gürsoy and Badur (2021) analysed the inter-state migration network of the United States; Gürsoy and Badur (2022) analysed the internal migration network of Turkey; de Carvalho and Charles-Edwards (2020) analysed the internal migration network of Brazil; and Chen et al. (2021) analysed the internal migration network of England and Wales. In addition, internal migration networks have been analysed in the two articles which we present in greater detail in Section “Network patterns of migration”: that of Austria (Pitoski et al. 2021b) and that of Croatia (Pitoski et al. 2021c). Our extended review has also identified new network analyses of (international) migration for specific population segments: sex workers (Rocha et al. 2022), refugees (Mourao 2020), and innovators (Wang et al. 2020). A very brief overview of the findings from these works can be summarized as follows: there are consistent small-world features, high network clustering, and robust community structures traced in all of the observed networks, international or internal. In networks of international migration, developed countries act as migration attractors in the hierarchy of attractiveness between countries, with distinct rich-club patterns [Footnote 3].

More consequentially, what can be found by examining all the new works is in line with the conclusions of the former (Pitoski et al. 2021d) when addressing the nascent stage of the field; the same set of indicators and algorithms keeps being utilized in the analyses of migration networks. Still, little effort is invested into determining how to viably abstract these networks from data for a reliable application of these indicators and algorithms. This initial step of network abstraction is crucial for supporting any conclusions from the employed analytics (Pitoski et al. 2021d). While intense weights on self loops have been shown to exist in internal migration networks (Pitoski et al. 2021b, c), this feature, along with another prominent feature, of (weighted) reciprocity on links, tends to be regularly disregarded when abstracting migration networks from available data. It has been demonstrated in these former works that migration on self loops and high reciprocity disturb the application of conventional network algorithms and indicators, and that in order for these indicators to be straightforwardly applied, the network abstractions need to be established in such a way that they lose the least of the information that they originally bear. For example, the removal of looping edges, the aggregation of reciprocated links into undirected weighted links, and similar operations should be avoided, and such procedures scrutinized.

The suggested way forward for Network Science, in Migration Studies as well as in other domains, is to focus on space-time (temporal, or dynamic) network abstractions, to obtain more meaning from the indicators and algorithms applied. Network dynamics have been scrutinized in some novel works, either by incorporating temporal aspects (Grba and Meštrović 2018) or by introducing dynamics through multilayer networks (Petrović et al. 2022). These networks represent extended versions of classical models, allowing for more credible modeling of dynamic phenomena observed in the real world. Still, the indicators and algorithms need to be upgraded to serve such abstractions, in a similar way to what has recently been offered by Pitoski et al. (2023a). Moreover, an even more fine-grained network abstraction is conceivable, in which nodes would be space-time positions of particular individuals (see Pitoski et al. 2023b).

However, considering the type of data currently available for scientists to abstract migration networks from, static weighted directed networks, with nodes as human settlements and links as within-period aggregated migration flows between them, are currently the only kind of abstraction that can feasibly be utilized, due to the availability, but also the accuracy and timeliness, of data being gathered and published by collecting institutions (Pitoski et al. 2021b). This is how we also proceed in the analysis part of this article, commencing in the next section.

Network patterns of migration

Network data, abstractions, and tools for the analyses

Section “Network patterns of migration” as a whole provides the analyses of human migration network patterns at different geographical levels. We start off by analysing migration network at the narrow geographical span: a city, and migration between its districts. We then proceed to analyse migration networks on an intermediate span: migration network within countries, between their cities/municipalities. We end by analysing the network on the widest geographical span: international migration between cities/municipalities of a country and the rest of the World. These analyses are done respectively for Vienna, Austria, and Croatia, based on the data acquired from each country’s national statistical office. References to the exact sources are provided in each respective subsection. In this subsection we provide the generic mathematical formulation for all analysed networks.

The abstraction of a migration network from the available data is a weighted di-graph \(\mathcal {G}=(\mathcal {N},\mathcal {L}, \mathcal {W})\), whose:

  1. nodes \(\mathcal {N}=\left\{ n_1, n_2,\ldots , n_{N}\right\}\) are administrative subdivisional units: city districts, country cities or municipalities, and countries as a whole, depending on the observed geographical scale,

  2. link weights \(\mathcal {W}=\left\{ w_{ij} \right\} _{N \times N}\), \(i,j=1,\ldots ,N\) (where i can be equal to j) are the counts of official changes of address of residence from a city district, municipality or country (node) i to a city district, municipality or country (node) j in any given year, and

  3. links \(\mathcal {L}=\left\{ l_{ij} \right\} _{N \times N}\) is a binary projection of \(\mathcal {W}\), such that \(l_{ij}=1\) if \(w_{ij} > 0\), and \(l_{ij}=0\) if \(w_{ij} = 0\).

From this primary abstraction, in which the self-loops are taken into account (\(w_{ii} \ge 0\)), we may reduce our observations to subgraphs \(\mathcal {G}^{\prime }=(\mathcal {N},\mathcal {L}^{\prime }, \mathcal {W}^{\prime })\), where \(\mathcal {W}^{\prime }=\mathcal {W} {\setminus } \left\{ w_{ii}\right\}\) and \(\mathcal {L}^{\prime }\) is the according binary projection of \(\mathcal {W}^{\prime }\).
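The abstraction above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the 3-node weight matrix is hypothetical, the helper names `binary_projection` and `drop_self_loops` are ours, and representing \(\mathcal {W}^{\prime }\) by setting the diagonal to zero is an assumption about how the removal of self-loops would be encoded.

```python
# Hypothetical yearly counts of changes of address between three nodes.
# W[i][j] is the number of moves from node i to node j; the diagonal
# holds moves within the same node (self-loops).
W = [
    [120, 30, 0],
    [25, 200, 10],
    [0, 15, 90],
]


def binary_projection(weights):
    """Return L with l_ij = 1 if w_ij > 0 and l_ij = 0 otherwise."""
    return [[1 if w > 0 else 0 for w in row] for row in weights]


def drop_self_loops(weights):
    """Return W' where every diagonal weight w_ii is set to zero (assumed encoding)."""
    return [
        [0 if i == j else w for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


L = binary_projection(W)               # links of the full graph G
W_prime = drop_self_loops(W)           # weights of the subgraph G'
L_prime = binary_projection(W_prime)   # links of the subgraph G'
```

Keeping `W` and `W_prime` side by side makes it explicit which indicators are computed with self-loops and which without.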

In each case we observe a static weighted directed network, in which the weights on links are represented by the aggregate yearly migration flows between node dyads, respecting the direction of flows. The network-analytical tools applied to these abstractions will be specified in each of the following case analyses, but, overall, they comprise nodal indicators such as centrality measures (weighted degree, PageRank, HITS centrality, etc.) and network-structural indicators such as reciprocity, modularity, density, etc., including community detection algorithms. For the issues on data for the network abstractions and the choices of metrics, we point again to the review of Pitoski et al. (2021d).
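As an illustration of the simplest nodal indicator named above, the following sketch computes the weighted out-degree and in-degree of each node of a static weighted directed network. It is a sketch under stated assumptions: the weight matrix is hypothetical, self-loops are excluded by our own choice, and the function name `strengths` is introduced only for this example.

```python
# Hypothetical weight matrix: W[i][j] = moves from node i to node j.
W = [
    [120, 30, 0],
    [25, 200, 10],
    [0, 15, 90],
]


def strengths(weights):
    """Return (out_strength, in_strength) per node, ignoring self-loops."""
    n = len(weights)
    out_s = [sum(weights[i][j] for j in range(n) if j != i) for i in range(n)]
    in_s = [sum(weights[j][i] for j in range(n) if j != i) for i in range(n)]
    return out_s, in_s


out_s, in_s = strengths(W)
# Net migration per node: arrivals minus departures.
net = [i - o for i, o in zip(in_s, out_s)]
```

A positive entry in `net` marks a node that gains residents through the exchange with other nodes, a negative entry a node that loses them.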

Intra-city migration network: Vienna

This subsection is dedicated to the network analysis of within-Vienna migration as a city-level case analysis (the narrowest geographical span). The data from which we abstracted the network come from the Austrian Bureau of Statistics (Statistics Austria), and are publicly available at: https://data.statistik.gv.at/web/catalog.jsp, “Population” section. The data used for this specific analysis are those containing migration flows between/within city districts of Vienna and between/within municipalities and cities in Austria (“Wanderungen innerhalb Österreichs”), while we also used the polygon data related to the administrative division into Vienna’s districts and Austrian municipalities (“Gliederung Österreichs in Gemeinden”) for the visualizations (deriving the centroid of each polygon as its geolocation). The specific years of observation are 2002, 2010, 2018 and 2019, and the multi-period observations were made in order to support the historic validity of the findings. The choice of these exact years is based on data availability and on the geographic validity that we extend to as the paper unfolds: Statistics Austria has consistent (published) data on migration starting from the year 2002, while the past analyses we focus on in the following sections (covering Austria and Croatia) comprise year 2018 at minimum. We have included year 2010 as the period that stands between the two, and year 2019 was chosen to see how large the change is between subsequent years.

The main focus of this section is intra-Vienna migration, but we begin by providing some general insights on its size relative to internal migration in Austria as a whole. This may serve as an introduction to the analysis presented in the following subsection, in which we expand the coverage to internal migration between all Austrian cities and municipalities, while the reader can also obtain the general impression on the size of the migration phenomenon in the country as a whole (Fig. 1).

Fig. 1. Vienna-related migration in total Austrian internal migration, 2002–2019

The data behind the figure reveal that the size of “Vienna-related” migration is about 30% of the size of the total internal migration in Austria; of these 30 percentage points, about 22 pertain to exclusively intra-Vienna migrations, while about 8 pertain to the migration exchange of Vienna with the rest of Austria. About 800,000 internal migrations in the country as a whole were recorded in the last period observed (year 2019), while the country is populated by approximately 8 million people; hence, internal migration as a share of the country’s population is about 10% [Footnote 4]. Both the intra-Vienna migration and the Vienna-rest-of-Austria migration grew considerably throughout the observed period: by 6.7% and 3.6% on average, respectively. This goes along with the generally increasing phenomenon of internal migration in the country, but the increase in Vienna-related migration seems to have been more rapid.

We further show, in Fig. 2, how Vienna connects with the rest of the country, addressing migration exchange with Austrian regions in particular.

Fig. 2. Migration exchange of Vienna with Austrian regions, 2002–2019

A comprehensive (i.e., a network) view of the inter-district intra-Vienna migration flow patterns and their evolution over time is provided in Fig. 3. As the graphs of the complete network (\(\mathcal {G}\)) clearly show, there is intense migration on self-loops (intra-district migration), while there is intense reciprocity on district dyads (inter-district migration). Total weights on self-loops comprise about 28% of all intra-Vienna migration in each of the observed periods except year 2002, when this percentage was much lower (about 16%). This growth of “looping” migration appears to follow, with some lag, the growth of population in the city in general; namely, Vienna’s growth in population between consecutive investigated periods (population at years’ end 2002 vs. 2010, 2010 vs. 2018, and 2018 vs. 2019), according to [Magistrat der Stadt Wien - Stadt Wien Wirtschaft, Arbeit und Statistik 2020], was about 11%, 9% and 2.5% respectively. As for the migration flows that are not distributed on self-loops, these are distributed on an established set of district dyads. The specific structure of the intra-Vienna migration network is “fixed”; with time, the graphs do not seem to progress towards completeness, but the weight of migration tends to be distributed, and to grow, on the same set of links.
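The two quantities discussed in this paragraph, the share of total weight carried by self-loops and the degree to which the graph approaches completeness, can be computed as in the sketch below. The matrix is hypothetical and does not reproduce the Vienna figures; the function names, and the choice to measure completeness as the link density of the loop-free graph, are our assumptions.

```python
# Hypothetical weight matrix: W[i][j] = moves from node i to node j.
W = [
    [120, 30, 0],
    [25, 200, 10],
    [0, 15, 90],
]


def self_loop_share(weights):
    """Share of the total migration weight that sits on self-loops."""
    total = sum(sum(row) for row in weights)
    loops = sum(weights[i][i] for i in range(len(weights)))
    return loops / total


def density_without_loops(weights):
    """Fraction of the N*(N-1) possible directed links that carry weight."""
    n = len(weights)
    present = sum(
        1 for i in range(n) for j in range(n) if i != j and weights[i][j] > 0
    )
    return present / (n * (n - 1))


share = self_loop_share(W)
density = density_without_loops(W)
```

A density that stays well below 1 across years while total weight grows would correspond to the "fixed" structure described above: more migration on the same links rather than new links.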

Fig. 3. Evolution of the internal migration network of Vienna, 2002–2019. Direction of migration is represented by clockwise curvature of links. Link thickness is proportional to the size of migration (see labeled edges for a general orientation). Communities detected by the Louvain modularity optimization algorithm (Blondel 2008). District names: 1: Innere Stadt, 2: Leopoldstadt, 3: Landstrasse, 4: Wieden, 5: Margareten, 6: Mariahilf, 7: Neubau, 8: Josefstadt, 9: Alsergrund, 10: Favoriten, 11: Simmering, 12: Meidling, 13: Hietzing, 14: Penzing, 15: Rudolfsheim-Fuenfhaus, 16: Ottakring, 17: Hernals, 18: Waehring, 19: Doebling, 20: Brigittenau, 21: Floridsdorf, 22: Donaustadt, 23: Liesing

Weighted reciprocity, measured as proposed by Squartini (2013) on \(\mathcal {G}^{\prime }\) for the year 2018, is calculated at 0.758, which roughly means that for every 100 people moving from district A to district B of Vienna in 2018, there were, on average, about 76 people moving from district B to district A in that same year. The year 2018 was selected to match the analyses of internal migration in the countries as a whole, which follow in the next subsection. Limiting the measurement to this one year, we believe, does not compromise the historical validity of the findings, especially given that the Pearson correlation coefficient between the weights on the respective inter- and intra-district links in any two years compared (thus, including 2018) never fell below 0.98 in our calculations.
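As an illustration of the measure, the sketch below computes a global weighted reciprocity under the assumption that it is the ratio of reciprocated to total weight on links between distinct nodes, with the reciprocated part of each directed weight taken as the minimum of the two weights on the dyad. This is our reading of the cited measure; the function name and the toy flows are hypothetical.

```python
def weighted_reciprocity(flows):
    """Ratio of reciprocated weight to total weight, self-loops excluded.

    For each ordered pair (i, j) with i != j, the reciprocated part of
    w_ij is min(w_ij, w_ji); a missing counterlink counts as weight 0.
    """
    total = 0.0
    reciprocated = 0.0
    for (i, j), w in flows.items():
        if i == j:
            continue  # looping migration is measured separately
        total += w
        reciprocated += min(w, flows.get((j, i), 0))
    if total == 0:
        raise ValueError("no weight on links between distinct nodes")
    return reciprocated / total


# Hypothetical toy flows between three districts (invented numbers).
toy_flows = {
    ("A", "B"): 100, ("B", "A"): 60,
    ("A", "C"): 40, ("C", "A"): 40,
    ("B", "C"): 10,
}
r = weighted_reciprocity(toy_flows)
print(f"weighted reciprocity: {r:.2f}")
```

Here 200 of the 250 relocations between distinct districts are matched by a counterflow, giving a reciprocity of 0.80.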

As the final part of the intra-Vienna migration network analysis, embedded in Fig. 3, we ran the Louvain community detection algorithm (Blondel 2008) on the undirected abstraction of \(\mathcal {G}^{\prime }\), which exposed three distinct district communities: the districts in the South-West, those in the North-East, and those of the city centre. These communities remain more strongly integrated relative to the city network as a whole, consistently throughout the analysed period. A potential explanation for these specific community formations is that migrants’ preferences for relocation are determined by their familiarity with, or some kind of personal attachment to, a specific region or part of the city.
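The sketch below does not implement the Louvain algorithm itself. It only illustrates the two ingredients mentioned above: building an undirected abstraction from directed flows, and evaluating the weighted modularity that Louvain-type methods seek to maximize. We assume here that the undirected weight of a dyad is the sum of its two directed weights and that self-loops are dropped; both choices, and all names and numbers, are illustrative assumptions.

```python
from collections import defaultdict


def to_undirected(flows):
    """Sum the two directed weights of each dyad; drop self-loops."""
    und = defaultdict(float)
    for (i, j), w in flows.items():
        if i != j:
            und[frozenset((i, j))] += w
    return dict(und)


def modularity(und, community):
    """Weighted modularity Q of a node partition on an undirected graph.

    Q = sum over communities c of  w_in(c) / m - (s(c) / (2 m)) ** 2,
    where m is the total edge weight, w_in(c) the weight inside c and
    s(c) the summed strength of the nodes assigned to c.
    """
    m = sum(und.values())
    inside = defaultdict(float)
    strength = defaultdict(float)
    for pair, w in und.items():
        u, v = tuple(pair)
        strength[community[u]] += w
        strength[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    return sum(inside[c] / m - (strength[c] / (2 * m)) ** 2 for c in strength)


# Hypothetical flows: two tightly exchanging pairs joined by a weak link.
toy_flows = {
    (1, 2): 50, (2, 1): 50,
    (3, 4): 50, (4, 3): 50,
    (2, 3): 5, (3, 2): 5,
}
und = to_undirected(toy_flows)
q_split = modularity(und, {1: "a", 2: "a", 3: "b", 4: "b"})
q_single = modularity(und, {1: "a", 2: "a", 3: "a", 4: "a"})
print(f"two communities: {q_split:.3f}, one community: {q_single:.3f}")
```

The partition into two pairs scores higher than the trivial single community, which is the kind of comparison a modularity optimizer performs when grouping districts.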

Intra-country migration networks: Austria and Croatia

Internal migration in Austria

This subsection, like the one that follows, reviews previous NS studies of migration in a country as a whole. Here we summarize the findings on the internal migration network whose links (weights) are the migration flows between all Austrian municipalities (Pitoski et al. 2021b), as an expanded view from the city to the country scale. The summary highlights only the concepts most related to the previous analysis section, for comparability. The reviewed study thoroughly covered the network’s evolution (each year from 2002 to 2018), explaining the inherent relationship between population and migration and elaborating on the general patterns of migration, all while considering which network-theoretical tools are feasible given the specific network structure. We invite the reader to consult that work for details and discussions of the calculated network metrics, as well as for the comprehensive network visualizations. An interactive visualization of the Austrian internal migration network in 2018 is available at: http://bit.ly/3VnkYcQ.

At the wider geographical scale (internal Austrian migration), extending from the narrower city scale (intra-Vienna migration), the reviewed work shows that some of the most prominent network features hold: the very high share of weights distributed on self-loops, as well as very high weighted reciprocity. Migration on self-loops accounted for about 50–55% of all relocations in any year of observation; most of the migration at the country level is actually relocation within one and the same city or municipality. Weighted reciprocity of inter-settlement relocations was measured consistently at about 0.60 (60%), which roughly means that for every 100 people moving from municipality A to municipality B in Austria, there were, on average, about 60 people moving from municipality B to municipality A in the respective year of observation. Recall that the corresponding values for “looping” and reciprocal migration were 28% and 76%, respectively, for the intra-city (Vienna) network. The tendency of people to relocate within one and the same location thus appears to be lower at the smaller geographical scale, which seems natural, as there may be fewer options for finding a new living space in such close proximity to the existing one. Yet the high value of reciprocity and the localized distribution of communities, as outlined in the previous section, still reflect people’s tendency to relocate in and around “familiar” locations. In a sense, reciprocity too can be considered looping migration within a confined geographical space. These prominent network characteristics are shown for the case of Austria, and they appear to be similar for another country, as the review of the study we provide next shows.

Internal migration in Croatia

A network study comparable to the one reviewed above (Austria), using the same set of network indicators, algorithms and visualizations drawn on the same kind of network abstraction (see Section “Network data, abstractions, and tools for the analyses”), has been performed to analyse internal migration in Croatia (Pitoski et al. 2021c). By reviewing another country case, we wanted to highlight the consistency of the demonstrated internal migration network behaviour. The study on Croatia examines migration flows in the country in a single year (2018), based on the available data obtained from the Croatian Bureau of Statistics (Pitoski et al. 2020). The analysis also concentrates on the feasibility of network-theoretical tools throughout their application, given the specific network structure examined. The reader is invited to consult the aforementioned study on the Croatian inter-settlement network for the calculated network metrics and visualizations. An interactive visualization of the Croatian internal migration network in 2018 is available at: http://bit.ly/3OK1tsg.

What is specific to Croatia is that the total weight on self-loops is much lower than in Austria: about 20% of all migrations in the one year observed. Weighted reciprocity is also somewhat lower, at 49%. In the same manner of explanation as before, we may say that, roughly, for every 100 people moving from municipality A to municipality B in Croatia, there were, on average, about half as many people moving from municipality B to municipality A in the respective year of observation. When the \(\mathcal {G}^{\prime }\)s of both countries for 2018 were compared, a much stronger exchange was found in Croatia between virtually all relatively more populated cities/municipalities and the capital city, Zagreb. Fewer return flows to cities/municipalities were seen in specific regions, most prominently Slavonia in the country’s east, all of whose large cities send many more migrants to Zagreb than they receive back from it. These specifics may be explained by the greater business opportunities in the capital, which is home to one third of the country’s entire population (according to the 2021 census published by the Croatian Bureau of Statistics; Central Bureau of Statistics Croatia 2021). In addition, the same census and the new data that we present in detail in Section “International migration network: Croatia and the World” clearly show that Croatia has extremely high emigration rates (from the country’s east in particular). This perhaps also explains why internal migration in Croatia in general was found to be much lower than in Austria (around 2% for 2018, measured as the sum of total migration weights as a share of total population).

Notwithstanding some of the country specifics, both reviews indicate that the general migration network patterns at the country level are consistent with those at the city level, with migration on self-loops, and especially weighted reciprocity, being their prominent characteristics. We believe it is very likely that one would find similar patterns in an analysis of an intra-Zagreb (inter-district) migration network. However, we were not able to obtain the data required to perform that analysis, as the Croatian Bureau of Statistics, which we approached, does not maintain these data at the district level. Although we are aware that the missing city-level comparison may be perceived as a limitation, we believe this level of coverage should be satisfactory, considering that we trace similar patterns when widening the analysis to even larger spatial levels, as demonstrated in the next section.

International migration network: Croatia and the World

In this final part of our analysis, we broaden our focus to the worldwide scale in an attempt to verify whether the structural features of migration networks hold, or how much they differ, compared with those depicted previously at the smallest and the intermediate scale. We examine a new case network comprising the migration flows between World countries and Croatian settlements (the latter represented by 554 Croatian cities and/or municipalities). The migration data, obtained upon request from the Croatian Bureau of Statistics, comprise the total number of immigrants (irrespective of subcategories) for each year in the period from 2016 to 2021, by country of citizenship of the immigrants, and, vice versa, the number of Croatian citizens who emigrated and registered in settlements in different World countries in the same periods. Thus, the nodes of the network, as defined in Section “Network data, abstractions, and tools for the analyses”, are in this case the Croatian settlements and the settlements of a particular country with which there is migration exchange, where the latter are contracted to single nodes (countries). We include precise settlements on the Croatian side but whole countries on the other side because information is lacking on the specific settlements of immigrants to Croatia prior to their move, and of Croatian citizens who registered in a different country after emigrating from Croatia within the same year. Nevertheless, this country-node contraction does not lead to a loss of generality when it comes to assessing the overall network features, especially those traced in previous sections.

In Fig. 4 we present a screenshot of the analysed migration network which is available as a comprehensive interactive visualization at https://bit.ly/CROMIG (the \(\mathcal {G}^\prime\) for this case analysis). The visualization may be particularly useful for Croatian policymakers, as they can quickly identify the links and nodes where migration exchange, particularly the higher levels of emigration, is more pronounced. We invite the readers, particularly the policymakers, to examine also the interactive visualization provided at https://bit.ly/IMMCROCOUNTIES, where emigration, immigration and total migration are shown at the level of the 21 counties in Croatia. Among other things, the map shows counties (regions) in Croatia that have alarmingly high emigration rates, summarizing the top emigration destinations. Geolocations (latitudes and longitudes of Croatian cities/municipalities and countries’ capitals) are obtained from various free web services, and have further been validated by using Google Maps (https://www.google.com/maps).

Fig. 4

International migration to/from Croatia, total 2016–2021. Countries of immigration/emigration (citizenship countries of migrants) represented by country capitals. Link thickness reflects migration total sum of weights on links in the period. Interactive map available at http://bit.ly/CROMIG

In terms of nodal metrics, considering that we examine a sub-network of the wider migration network (we only observe migration exchange between Croatian settlements and settlements in other countries, excluding exchanges among settlements of other world countries), it is feasible to examine only the weighted in- and out-degree centrality of Croatian settlements for each year, which can be inferred from the interactive visualization (see the separate immigration and emigration tabs, or the total tab). We also cannot quantify migration on self-loops, as we have no data on self-loop weights for settlements in other countries. What we can generally say about looping migration at the international level is, to repeat the estimates referred to in the introduction, that internal migration is generally a three times larger phenomenon than international migration (McKenzie 2022). Essentially, this means that, when global migration is observed, the sum of migration weights on self-loops at the intra-country level is three times higher than the sum of weights on all migration links connecting settlements from different countries. For Croatia in 2018, the migration exchange with the world was 65,544 persons, while internal migration was 71,703 persons; the latter number should, however, be taken with reservations due to the probably understated intra-settlement migration figures (see the discussions in Pitoski et al. (2021c)).

What we can do reliably is assess weighted reciprocity, another measure of particular interest given its significance established in the observations of the city-scale and country-scale networks. In Fig. 5 we provide a glance at the reciprocity on the first 830 of 3582 links in total, on which there was significant international migration exchange (roughly the 20% of links whose weights account for 80% of all migration, in line with the ubiquitous Pareto principle). The flows (in black) are the weights on links going from Croatian settlements to other World countries, and the counterflows (in grey) are the weights on counterlinks returning from the same countries to the same Croatian settlements. For each year, the links are shown in descending order of the total weight realized on both link and counterlink over the whole period.
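The link selection described above can be sketched as follows: rank the dyads by their total weight (flow plus counterflow over the whole period) and keep the heaviest ones until a chosen share of all migration is covered. The function name, the 80% default and the toy totals are illustrative assumptions; the actual selection procedure used for Fig. 5 may differ in detail.

```python
def top_links_covering(dyad_totals, coverage=0.80):
    """Return the heaviest dyads whose cumulative weight reaches `coverage`.

    `dyad_totals` maps a (settlement, country) dyad to the total weight on
    its link plus counterlink over the whole period.
    """
    grand_total = sum(dyad_totals.values())
    ranked = sorted(dyad_totals.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for dyad, weight in ranked:
        if running >= coverage * grand_total:
            break  # target share already covered
        selected.append(dyad)
        running += weight
    return selected, running / grand_total


# Hypothetical dyad totals (invented numbers, not the Croatian data).
toy_totals = {
    ("Zagreb", "Germany"): 500, ("Split", "Germany"): 200,
    ("Osijek", "Austria"): 150, ("Rijeka", "Italy"): 60,
    ("Zadar", "Ireland"): 40, ("Pula", "Slovenia"): 30,
    ("Sisak", "Sweden"): 15, ("Knin", "Norway"): 5,
}
selected, covered = top_links_covering(toy_totals)
print(len(selected), "of", len(toy_totals), "dyads cover", f"{covered:.0%}")
```

In this toy example three of the eight dyads already carry 85% of the total weight, mirroring the concentration of migration on a small set of links.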

Fig. 5

Reciprocity of migration between Croatian settlements and World countries, 2016–2021. In the chart above, each new period is started with the link with highest total weight (flow) and counterweight (counterflow) in the whole period (Zagreb-Germany, Germany-Zagreb), proceeding in descending order in terms of total weight. See text for more explanation

The reciprocity trend is such that from 2016 to 2018 it averaged around 23%, dropped slightly to 20% around 2019 (probably related to the COVID crisis), and then increased rapidly in the last few years, averaging 35%. Notably, reciprocity is even higher for the top 830 links (weights), where it climbed to about 49% towards the end of the same period. Recall that this is the same value as the one calculated on the internal migration network of the country as a whole (see Section “Intra-country migration networks: Austria and Croatia”).

In conclusion, when the subnetwork of migration between human settlements in Croatia and other World countries is observed, and keeping to our manner of representation of reciprocity, we can say that for every 100 people moving from city or municipality A in Croatia to some location in country B in the most recent year, on average we will find at least 35 people moving from the same country B to the same city/municipality A in the same period.

Discussion

To begin our discussion, we summarize the conclusions of our analyses that relate to migration network features in general, with the help of Fig. 6.

Fig. 6

Characteristic migration network patterns by geographical scale

Based on the cases investigated (to the extent that we can claim geographical, historical, and external validity in general), we show that migration networks are characterized by pronounced self-loops and pronounced reciprocity on all geographical scales. These traits vary across geographical scales in the following way: at narrower geographical scales (intra-city migration networks) we may typically find a lower intensity of migration within one and the same location against a higher share of reciprocal migration between any two different locations (locations here being represented by city districts). On intermediate geographical scales (intra-country migration networks) one will find both of these features to be of about the same strength; about half of all migration will be within one and the same location, while the other half will be the reciprocal exchange between the “established” location dyads (locations here being represented by country municipalities or cities). At the global scale, one may expect to see higher within-location migration against between-location migration (locations again being represented by municipalities/cities), although reciprocal flows between established location dyads are still very much pronounced.

Now, what does this mean for the analysis of the migration factors that underlie these network formations? As the review of the factors of realized migration summarized in Section “Related work” shows, the literature typically employs the following generalized regression model, from which the influence of migration factors is inferred:

$$\begin{aligned} M_{i,j} = \beta _0 + \sum _{k=1}^{n} \beta _k F_{i,k} + \sum _{l=1}^{n} \beta _l F_{j,l} + \sum _{m=1}^{n} \beta _m F_{i,j,m} + \varepsilon \end{aligned}$$
(1)

where

  • \(M_{i,j}\) represents the dependent variable denoting migration from origin i to destination j,

  • \(F_{i,k}\) and \(F_{j,l}\) represent origin-specific and destination-specific migration factors, respectively,

  • \(F_{i,j,m}\) represents an origin–destination link (intervening) migration factor, and

  • \(\beta _0\), \(\beta _k\), \(\beta _l\), \(\beta _m\), and \(\varepsilon\) represent, respectively, the intercept, origin-specific factor coefficients, destination-specific factor coefficients, link factor coefficients and the error term.
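To make the structure of Eq. (1) concrete, the following sketch fits it by ordinary least squares on invented, noise-free data with one origin factor, one destination factor and one link factor. Everything here is an assumption for illustration: the factor names (`income`, `distance`), the coefficient values and the use of plain OLS via the normal equations; the studies reviewed use a variety of estimators and variable operationalizations.

```python
from itertools import permutations


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_ols(design, y):
    """Ordinary least squares via the normal equations X'X beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * target for row, target in zip(design, y))
           for a in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical factors: one per location (income) and one per link (distance).
income = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 7.0}
distance = {
    frozenset("AB"): 10.0, frozenset("AC"): 25.0, frozenset("AD"): 40.0,
    frozenset("BC"): 15.0, frozenset("BD"): 30.0, frozenset("CD"): 12.0,
}
true_beta = [5.0, 2.0, 3.0, -0.5]  # intercept, origin, destination, link

design, flows = [], []
for i, j in permutations(income, 2):  # all ordered pairs with i != j
    row = [1.0, income[i], income[j], distance[frozenset((i, j))]]
    design.append(row)
    flows.append(sum(b * x for b, x in zip(true_beta, row)))  # noise-free M_ij

beta_hat = fit_ols(design, flows)
print([round(b, 6) for b in beta_hat])
```

Because the toy flows are generated without noise, the fit recovers the chosen coefficients up to rounding. Note that both (i, j) and (j, i) enter as separate observations, and that flows with i = j are absent altogether.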

Note that this generalized form varies across migration factor studies in numerous ways; for example, the regression-based paper on reasons for internal migration in Austria that we touched upon earlier (Jestl et al. 2022) takes only the emigration rate as the dependent variable. (For the numerous variants of variable operationalization, refer to the work cited in Section “Related work”.) Still, Eq. (1) is a good general representation of the model most commonly used in the literature to determine the effects of migration factors.

What we conclude to be problematic with this kind of modelling, based on the conclusions stemming from the network science approach, is that this typical model representation (and its variations) does not take into account any of the pronounced network characteristics discovered in the former sections: neither high weights on self-loops, nor high reciprocity of origin–destination links. It is arguably the case that the models become unreliable if the values of the dependent variable (migration) and of the independent or other-type variables (migration factors) simultaneously include both those that pertain to origin–destination flows/factors and those that pertain to destination–origin flows/factors. In addition, if we know that looping migration is such a sizable phenomenon, occurring simultaneously with sizable reciprocity, why is it that almost none of the models take this aspect into account?

We can propose several suggestions on how to enhance these general models, as well as network models, and to integrate the two fields in general, ranging from the relatively obvious (and perhaps already deployed in some rare cases) to the novel and relatively demanding. The first and most obvious way to update the models is to take the non-reciprocated weights of migration flow on links as values of the dependent variable, which would deliver the factors that underlie migration between locations. Moreover, the factors that determine migration within locations (looping migration) must be introduced into the same models, with \(M_{i,i}\) and \(F_{i,i}\) added subsequently to the respective sides of the equation, as a form of robustness check. The inclusion of self-loop migration applies to both regression analyses and network analyses, since network scientists, like statisticians, have to date typically abstracted migration networks by reducing them to between-location migration (i.e., ignoring migration on self-loops).
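One possible reading of the first suggestion is sketched below: each directed flow is split into a reciprocated part (the minimum of the two weights on the dyad) and a non-reciprocated remainder, while self-loop weights are kept aside as \(M_{i,i}\). The min-based split and all names and numbers are assumptions made for illustration; other definitions of the non-reciprocated weight are conceivable.

```python
def decompose_flows(flows):
    """Split directed flows into looping, reciprocated and non-reciprocated parts.

    Returns three dicts: looping weights keyed by location (M_ii), and the
    reciprocated and non-reciprocated parts keyed by (origin, destination).
    """
    looping, reciprocated, non_reciprocated = {}, {}, {}
    for (i, j), w in flows.items():
        if i == j:
            looping[i] = w
            continue
        back = flows.get((j, i), 0)  # missing counterlink counts as 0
        reciprocated[(i, j)] = min(w, back)
        non_reciprocated[(i, j)] = w - min(w, back)
    return looping, reciprocated, non_reciprocated


# Hypothetical flows (invented numbers).
toy_flows = {
    ("A", "A"): 50,
    ("A", "B"): 100, ("B", "A"): 60,
    ("B", "C"): 10,
}
looping, recip, non_recip = decompose_flows(toy_flows)
print(non_recip)
```

In the toy example only 40 of the 100 moves from A to B are non-reciprocated, and it is these remainders that would serve as values of the dependent variable, with the looping weights entering the model separately.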

The second way to improve regression models is to incorporate network indicators as predictors or control variables. These can be centrality indicators, or other-than-local indicators such as the clustering or assortativity coefficient. Only a couple of studies have deployed this approach, upgrading from standard regression modelling (see, e.g. Windzio 2018; Windzio et al. 2019), yet these are again limited to between-location migration. The models should, optimally, simultaneously include centralities (e.g. node strength) or other coefficients that include within-location migration. However, as has been clearly shown in Pitoski et al. (2021b, 2021c), most network measures fail when there are high levels of reciprocity (and looping) in the abstracted static (weighted) networks, and the approach in general becomes almost meaningless due to the reduced feasibility of network measure applications. This feasibility is particularly low for eigencentrality metrics and other indirect-connectivity based measures such as PageRank, HITS centrality, transitivity or assortativity coefficients, while community detection is also very likely biased.
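As a simple illustration of a centrality that includes within-location migration, the sketch below computes weighted out- and in-strength per node, with self-loops either counted or ignored. Counting a self-loop in both the out- and the in-strength of its node is an assumption made here; the function name and the toy flows are hypothetical.

```python
from collections import defaultdict


def node_strengths(flows, include_self_loops=True):
    """Weighted out- and in-strength per node.

    A self-loop, when included, adds its weight to both the out-strength
    and the in-strength of its node (a modelling choice assumed here).
    """
    out_s, in_s = defaultdict(float), defaultdict(float)
    for (i, j), w in flows.items():
        if i == j and not include_self_loops:
            continue
        out_s[i] += w
        in_s[j] += w
    return dict(out_s), dict(in_s)


# Hypothetical flows between two locations (invented numbers).
toy_flows = {
    ("A", "A"): 50, ("A", "B"): 30,
    ("B", "A"): 20, ("B", "B"): 10,
}
out_with, in_with = node_strengths(toy_flows)
out_without, in_without = node_strengths(toy_flows, include_self_loops=False)
print(out_with, out_without)
```

The two variants can differ substantially (80 vs. 30 for the out-strength of A here), which shows how much a predictor built on between-location migration alone may leave out.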

All these obstacles in both statistics and network science can be addressed with space-time or dynamic network abstractions before running any models, an issue we already touched upon in Section “Related work”. This means observing nodes as space-time positions (of a person or of a “batch” of people) moving from one location to the same or a different location over time. For the generalized model for calculating centrality in a network abstracted in this way, where the centrality can then be used as a dependent or other kind of variable in a regression model, see Pitoski et al. (2023a). Suitable data for space-time migration network abstractions are needed to make this possible, and more measures need to be upgraded so that they can be applied to such abstractions. The recently launched database for the Netherlands (“the Dutch population Statistics Netherlands Microdata Catalogue”), which also contains space-time positions of people over the years along with other personal attributes (see, for example, Bokányi et al. 2023), enables the creation of such space-time network abstractions before measures are applied. At the same time, however, the development or adjustment of dynamic network measures needs to intensify considerably, as this is a part of network science research that is severely under-developed (Pitoski et al. 2023a).
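A space-time abstraction of the kind described can be sketched from individual residence records. In the sketch below, nodes are (location, year) positions and an edge counts the people observed at one position and then at the next. The record layout and all names are assumptions for illustration, not the structure of the cited microdata or the model in Pitoski et al. (2023a).

```python
from collections import Counter


def space_time_edges(trajectories):
    """Count transitions between consecutive space-time positions.

    `trajectories` maps a person id to a list of (year, location) records.
    Nodes are (location, year) pairs. A person recorded in the same location
    in two consecutive records yields an edge between that location's two
    time slices.
    """
    edges = Counter()
    for records in trajectories.values():
        ordered = sorted(records)  # chronological order
        for (t0, loc0), (t1, loc1) in zip(ordered, ordered[1:]):
            edges[((loc0, t0), (loc1, t1))] += 1
    return edges


# Hypothetical residence records for three people (invented).
toy_trajectories = {
    "p1": [(2018, "A"), (2019, "B"), (2020, "A")],
    "p2": [(2018, "A"), (2019, "B"), (2020, "B")],
    "p3": [(2018, "C"), (2019, "C")],
}
edges = space_time_edges(toy_trajectories)
for (src, dst), count in sorted(edges.items()):
    print(src, "->", dst, count)
```

In a static network, the moves from A to B and from B back to A would form a reciprocated dyad; here they are distinct edges between different time slices, so every edge points forward in time.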

Ultimately, drivers of human migration may be incorporated directly into network science indicators, where network links are weighted by the values of migration factors in multilayered, and preferably dynamic (space-time), networks to which the network measures are subsequently applied. For example, the income differential between any two locations in the migration network can determine the link weight in one layer. Another layer could be the distance in kilometres between locations. Language proximity or contiguity could determine the third or fourth, binary network layers, and so on. Node centralities and other indicators calculated on such networks (for each layer separately as well as cumulatively) could be used as dependent/independent/other-type variables in regression analyses.
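A minimal sketch of such a multilayer abstraction is given below, with three layers (income differential, distance, contiguity) and the out-strength of each node computed per layer and cumulatively. The cumulative score rescales each layer by its largest absolute weight before summing; this aggregation rule, like all names and numbers, is an assumption for illustration only.

```python
from collections import defaultdict


def out_strength(edges):
    """Sum of outgoing link weights per node in one layer."""
    strength = defaultdict(float)
    for (origin, _destination), weight in edges.items():
        strength[origin] += weight
    return dict(strength)


def rescale(edges):
    """Divide weights by the layer's largest absolute weight (an assumption)."""
    top = max(abs(w) for w in edges.values())
    return {e: (w / top if top else 0.0) for e, w in edges.items()}


# Hypothetical three-location example; all values are invented.
income = {"A": 1000.0, "B": 1400.0, "C": 1100.0}
km = {("A", "B"): 50.0, ("A", "C"): 120.0, ("B", "C"): 90.0}
neighbours = {("A", "B"), ("B", "C")}

pairs = [(i, j) for i in income for j in income if i != j]
layers = {
    # destination income minus origin income
    "income_differential": {(i, j): income[j] - income[i] for i, j in pairs},
    # symmetric distance, looked up in either orientation
    "distance_km": {(i, j): km.get((i, j), km.get((j, i))) for i, j in pairs},
    # binary layer: 1.0 if the two locations share a border
    "contiguity": {(i, j): float((i, j) in neighbours or (j, i) in neighbours)
                   for i, j in pairs},
}

per_layer = {name: out_strength(edges) for name, edges in layers.items()}
cumulative = defaultdict(float)
for edges in layers.values():
    for node, s in out_strength(rescale(edges)).items():
        cumulative[node] += s

print(per_layer["income_differential"])
print(dict(cumulative))
```

The per-layer and cumulative scores are the kind of node-level indicators that could then enter a regression as dependent or independent variables.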

Policy implications

All of the above suggestions point to the gap in understanding of migration factors and the application of network science in policymaking. While migration factors play a crucial role in predicting migration patterns at the micro level, the macro-level behavior of migration, analyzed as a network, offers a more comprehensive view and better predictive capabilities. However, policymakers often lack a comprehensive understanding of network dynamics and tend to rely on simplistic factors such as income or GDP as primary migration factors. This reliance on traditional factors may lead to incomplete policies, which fail to address the full complexity of migration dynamics.

It is crucial for policymakers to acknowledge the specific traits inherent in migration networks, particularly the prevalence of looping and reciprocity at all levels. Understanding these dynamics is essential for accurately predicting future migration patterns. Looping migration, where the majority of movements occur within the same location, is a common phenomenon that policymakers often overlook. Similarly, internal migration, which arguably gets less mention in political discourse despite being a massive phenomenon, can be viewed as a form of looping migration from the wider perspective. Internal migration, including that of international entrants, who according to our studies converge to the same subsequent migration behaviour as the domicile population after more intense initial mobility (de Hoon et al. 2021), has significant implications for regional development, labor markets, and social cohesion. Some of the recent findings on social cohesion based on extensive survey data (Schmeets et al. 2021; Schmeets and Exel 2021) show that migrants tend to trust institutions initially, but that this trust declines over time, nearing native levels, while varying greatly among migrant groups. At the same time, indicators such as participation and volunteering rise the longer migrants stay. It is hence important to eventually incorporate network analysis into policymaking systems, preferably with temporal network abstractions (Pitoski et al. 2023a), to capture the evolutionary aspects.

Furthermore, policymakers must recognize the emergence of flow-counterflow balances on the established network links (in the longitudinal but static-network observations), whether these pertain to internal or international migration. In the context of international migration, when a link is established from a location in one country to a location in another country, it evolves in such a way that it is not unidirectional. Instead, there is a predictable share of counterflow, where migrants (not necessarily the same migrants) move back and forth between the linked locations. This dynamic nature of migration flows highlights the importance of considering both directions of movement when formulating migration policies and strategies.

By incorporating network analysis into policymaking, governments can develop more effective strategies for managing migration flows and addressing the complexities of modern migration dynamics. Network analysis offers policymakers a powerful tool for understanding the interconnectedness of migration patterns and predicting future migration, subsequently even identifying the key migration factors. Through the adoption of network science methodologies, policymakers can better anticipate migration trends, identify vulnerable populations, and design targeted interventions to support migrant integration and promote economic, demographic, cultural, and social development in general.

Limitations and future research

Despite its contributions, this paper is not without its limitations. Firstly, the analysis primarily focuses on migration patterns within a specific geographic context, namely Austria and Croatia, where the data on international migration to and from Croatia lack exact locations on the other end (in the sending/receiving countries). This limited scope negatively impacts the generalizability of our findings to other regions or countries with different socio-economic, cultural, and other contexts.

Moreover, while our paper highlights the importance of incorporating network science into migration studies, it does not provide a comprehensive integration of migration factors and network dynamics; migration factors research and the regression analyses that get most attention, have been discussed on an exploratory level. The proposed suggestions for future research require further validation and refinement, which will involve extensive empirical testing and cross-disciplinary collaboration. Additionally, the complexity of migration networks and the dynamic nature of migration flows present significant challenges in developing accurate predictive models.

All the above discussed points require a proof of concept based on a broader analysis, which we are to undertake in our future work. We take this study to be of good value in terms of revealing the gaps within the field of Migration Studies, and between Migration Studies and Network Science (applied on the migration phenomenon). Moreover, it provides a comprehensive overview of patterns of human migration, which is undoubtedly very useful for the policymakers of analysed countries, Austria and Croatia, but also wider. At this point, while migration is perhaps still a manageable phenomenon, the policymakers may incorporate the knowledge on migration patterns to finally start building policies that will handle migration which is destined to intensify, and have an impact on all societal aspects, in the very near future.

Availability of data and materials

All datasets generated and/or analysed in this study are available via the web links and references provided in the manuscript. All URLs and DOIs provided have been (re)accessed on 15th September 2024.

Notes

  1. In this representation, we reduce to the push-and-pull theory (Lee 1966) terminology for the simplicity of language.

  2. In NS research, the term “migration networks” often refers to a different concept than the social connections among migrants discussed here. Instead, it pertains to the mathematical analysis of migration flows and patterns as interconnected nodes and edges in a network graph. While both concepts involve migration, they operate on different levels: one focuses on social interactions and support structures, while the other examines migration as a network phenomenon, analyzing the flow of people between locations. We suggest that the reader consult the works of Gurak and Caces (1992) or McKenzie and Rapoport (2010) for further clarification of the distinction between the concepts.

  3. Elaborating in more detail on the findings of each of the works would unduly increase the length of the paper; we therefore invite the reader to examine each of the specified works for the exact findings.

  4. In this study we do not deal with the important connection of population size and migration per each city/country examination, but point to the thorough demonstrations on these connections provided in Pitoski et al. (2021b).

Acknowledgements

Authors would like to express their special thanks to the fellow members of the Laboratory for Semantic Technologies at the Faculty of Informatics and Digital Technologies at University of Rijeka (FIDIT), the Laboratory for Complex Networks at the Centre for Artificial Intelligence and Cybersecurity, University of Rijeka (AIRI), the Dept. of Living Conditions and Social Cohesion of the Central Bureau of Statistics (CBS), office Heerlen, the Netherlands, and the Dept. of Political Science of Faculty of Arts and Social Sciences at Maastricht University (FASoS), for their comments that helped to improve the manuscript. Finally, we thank the anonymous reviewers for providing suggestions on how to further improve this manuscript.

Funding

This work has been supported by the Young Universities for the Future of Europe Alliance (YUFE, https://yufe.eu/), as part of the Postdoctoral Programme on the “Citizens’ Wellbeing” (call year 2021), and the University of Rijeka projects uniri-mladi-drustv-23–22-3082 and uniri-iskusni-drustv-23–95.

Author information

Contributions

Conceptualization: D.P.; methodology: D.P.; software: D.P.; validation: D.P., H.S. and A.M.; formal analysis: D.P.; investigation: D.P.; resources: D.P., H.S.; data curation: D.P.; writing-original draft preparation: D.P.; writing-review and editing: D.P., H.S. and A.M.; visualization: D.P.; supervision: H.S. and A.M.; project administration: D.P., H.S., and A.M.; funding acquisition: D.P., A.M. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Dino Pitoski.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Pitoski, D., Meštrović, A. & Schmeets, H. The complex network patterns of human migration at different geographical scales: network science meets regression analysis. Appl Netw Sci 9, 35 (2024). https://doi.org/10.1007/s41109-024-00635-1

  • DOI: https://doi.org/10.1007/s41109-024-00635-1

Keywords