The change in research focus, from international to regional collaboration, observed in the previous section provokes a more general investigation of how knowledge flows (as proxied by academic collaborations) may have changed over time. In particular, we ask whether these trends have translated into an overall consolidation of regional ties, creating isolated clusters or pools of knowledge production.
To uncover the complex structure of these flows, we construct a network in which the nodes are countries and the weight of the edge between countries i and j at time t is the number of collaborations between them, \(n_{ij}^{(t)}\). The network at time t therefore has adjacency matrix \(A^{(t)}\) with entries \(A_{ij}^{(t)} = n_{ij}^{(t)}\).
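As a minimal sketch of this construction, suppose each publication is recorded as a time period together with the countries of its authors. The record format, the function name `build_adjacency`, and the rule that each country pair counts once per publication are all assumptions made for illustration; the counting rule used for the actual data may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(papers):
    """Return {period: {(i, j): n_ij}} with symmetric collaboration counts.

    `papers` is an iterable of (period, countries) pairs, where `countries`
    lists the countries appearing on one publication.
    Assumption: each unordered country pair on a publication counts once.
    """
    A = defaultdict(lambda: defaultdict(int))
    for period, countries in papers:
        for i, j in combinations(sorted(set(countries)), 2):
            A[period][(i, j)] += 1
            A[period][(j, i)] += 1  # undirected network: keep A symmetric
    return A


# Hypothetical toy records, not taken from the data set used in the paper.
papers = [
    (1970, ["GBR", "USA"]),
    (1970, ["GBR", "USA", "DEU"]),
    (1985, ["CHN", "USA"]),
]
A = build_adjacency(papers)
```

Keeping one sparse matrix per period mirrors the notation \(A^{(t)}\): every later step (normalisation, community detection) is applied to each period separately.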
Prior to further analysis, to immediately visualise significant partnerships, we follow a similar procedure to that proposed by Neffke and Henning (2013) for estimating skill-overlap between industry pairs based on inter-industry job transitions. The logic behind doing so is similar to that for revealed comparative advantage (RCA, see e.g. Balassa (1965)), in that measures calculated on the network formed by raw counts are typically dominated by those locations with the highest overall production (i.e. USA, China and similar). Instead, we normalise the observed counts by the capacity of each country, measured by total collaborations, using a configuration model-like approach, apply a transformation to help account for the spread of subsequent results, and finally apply a thresholding step. The details may be found in “Appendix 3”.
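The three steps just described can be sketched as follows. This is an illustration only: the expected count \(k_{i}k_{j}/2m\), the transformation \((R-1)/(R+1)\) in the style of Neffke and Henning (2013), and the function name `normalised_weights` are assumptions, and the exact definitions in “Appendix 3” may differ.

```python
def normalised_weights(A, threshold=0.5):
    """Normalise raw counts against a configuration-model-like expectation.

    A: symmetric dict of dicts with A[i][j] = n_ij > 0 (no self-loops).
    Returns {(i, j): score} for pairs whose score exceeds `threshold`.
    Assumed form: R = n_ij / (k_i * k_j / 2m), score = (R - 1) / (R + 1).
    """
    k = {i: sum(row.values()) for i, row in A.items()}  # capacity of each country
    two_m = sum(k.values())
    W = {}
    for i, row in A.items():
        for j, n_ij in row.items():
            expected = k[i] * k[j] / two_m      # count expected from capacity alone
            ratio = n_ij / expected             # > 1 means over-represented
            score = (ratio - 1) / (ratio + 1)   # maps (0, inf) onto (-1, 1)
            if score > threshold:
                W[(i, j)] = score
    return W


# Hypothetical toy network: two strongly tied pairs with weak cross links.
A = {
    "a": {"b": 9, "c": 1},
    "b": {"a": 9, "d": 1},
    "c": {"d": 9, "a": 1},
    "d": {"c": 9, "b": 1},
}
W = normalised_weights(A)
```

In this toy case every country has the same capacity, so the expected count is 2.5 for every pair; only the two pairs with nine collaborations exceed the threshold.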
In Fig. 4, we display this transformed network, with edge strengths given by Eq. (15), for the 5-year periods commencing in 1970, 1985, 2000, and for 2015–2018. Countries belonging to the same continent share a colour, and the size of each country is proportional to its total number of publications within that time period. The spring algorithm ForceAtlas in Gephi is used to lay out each network, and edges above a 0.5 threshold are shown.
We observe that countries tend to cluster together geographically in later time periods. This can be seen with respect to the United Kingdom and Germany: in earlier time periods they occupy fairly central ‘positions’, but in later time periods they locate more closely to other European countries. On the other hand, while the rise in publication volume in China and India is visible, particularly over the past two decades, the positions of these countries, along with Japan, remain relatively close to their regional groups. In Fig. 4e, we display the mean edge weights between the five continents in the form of aggregated adjacency matrices. We observe the emergence of a defined diagonal from the year 2000, while the off-diagonals grow paler. This indicates that intra-regional collaborations have strengthened, while inter-continental collaborations appear to decline. Once again, we observe that in the most recent time period this trend may be beginning to change, with more inter-continental partnerships emerging.
In order to explore the increasing ‘regionalisation’ of research collaboration, we wish to extract information from the networks about groups of countries engaged in intense research collaboration across time. Exploring such groupings is a key focus of network science, known as community detection. Loosely speaking, this corresponds to a partition of nodes into communities for which within-community links are significantly stronger than between-community links. Such communities are often found to arise naturally in the real world, e.g. in social, neurological, or indeed academic networks as under consideration here (Newman and Girvan 2004). Here, such communities reveal groupings of countries which engage in significant research collaboration, and analysis of their evolution over time enables us to extract a quantitative description of the changing global research landscape.
While a variety of methods exist (see e.g. Javed et al. (2018) for an overview), the approach we take is that of optimisation of linearised stability (Delvenne et al. 2010; Lambiotte et al. 2008, 2011). Given a partition X, this method involves computing a sum of the deviations of the network edges within each community from a weighted configuration null model (where edges are shuffled randomly but node strengths are preserved). Mathematically,
$$\begin{aligned} Q_{{\text {conf}}}(X) = \frac{1}{2m}\sum _{i, j} \left\{ A_{ij} - \gamma \frac{k_{i}k_{j}}{2m} \right\} \delta (x_{i}, x_{j}), \end{aligned}$$
(8)
where
$$\begin{aligned} k_{i} = \sum _{j} A_{ij}, \quad 2m = \sum _{i} k_{i}, \end{aligned}$$
(9)
are respectively the strength of node i and twice the total edge weight of the network (each undirected edge is counted from both of its endpoints), \(x_{i}\) is the community of node i (thus \(\delta (x_{i}, x_{j}) = 1\) if i and j are in the same community and zero otherwise), and \(\gamma\) is a so-called ‘resolution’ parameter. This final parameter controls the contribution of the null model to the sum, and so affects which partition will be optimal: larger values favour recovering smaller communities, and vice versa. Under the configuration null model, the expected strength of the link between i and j is \(k_{i}k_{j}/2m\), i.e. the total strength of node j times the probability of connecting to node i. In particular, using this null model, if \(\gamma =1\) linearised stability is identical to the conventional Newman-Girvan modularity (Newman and Girvan 2004). This linearised form is also effectively identical to another method previously introduced in Reichardt and Bornholdt (2006) for modularity at different network scales. This tuning parameter is highly useful, as it allows us to avoid to some extent the resolution limit that conventional modularity has been shown to face (Fortunato and Barthelemy 2007), whereby it can fail to detect non-trivial small communities.
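To make Eqs. (8) and (9) concrete, the following sketch evaluates \(Q_{{\text {conf}}}\) directly from its definition for a small symmetric matrix. The function name `linearised_stability` and the toy network (two triangles joined by a single edge) are illustrative assumptions; the double loop follows the formula literally rather than aiming for efficiency.

```python
def linearised_stability(A, community, gamma=1.0):
    """Evaluate Q_conf(X) of Eq. (8) for a symmetric weighted matrix.

    A: square list of lists; community[i] is the community x_i of node i.
    """
    n = len(A)
    k = [sum(row) for row in A]   # node strengths k_i, Eq. (9)
    two_m = sum(k)                # 2m, Eq. (9)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:   # delta(x_i, x_j) = 1
                q += A[i][j] - gamma * k[i] * k[j] / two_m
    return q / two_m


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q_split = linearised_stability(A, [0, 0, 0, 1, 1, 1])             # 5/14
q_single = linearised_stability(A, [0, 0, 0, 0, 0, 0])            # 0
q_fine = linearised_stability(A, [0, 0, 0, 1, 1, 1], gamma=2.0)   # -1/7
```

With \(\gamma =1\) the two-triangle partition scores 5/14, while placing all nodes in one community scores exactly zero, since the observed and expected weights then both sum to 2m. Doubling \(\gamma\) makes the same partition score negative, illustrating how larger values penalise larger communities.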
The principal idea behind stability is that if we follow walkers around the network, which jump between nodes with a probability proportional to the edge weight, then sets of nodes in which walkers spend a prolonged period suggest denser connections within the set than to the outside, i.e. they form a community. The period of time for which we track such walkers naturally leads to the resolution parameter \(\gamma\). More details on this are provided in “Appendix 4”.
In order to find a node partition X which maximises this function, a typical approach is to use the greedy algorithm of Blondel et al. (2008). This works by initially placing each node in its own community, then iteratively moving nodes into the community of an adjacent node whenever doing so increases linearised stability. This process is stochastic in the sense that it may produce a slightly different optimum partition depending on the order in which nodes are ‘visited’. It is efficient, as only local information (the nearest neighbours of the node) is necessary at each step. Recently there has been a further improvement with a similar logic, known as the Leiden algorithm (Traag et al. 2019): this appears to result in higher linearised stability with lower computational cost, and so will be used here. Through studying the variation of information (a metric for comparing partitions) as described in “Appendix 4”, we find that two resolution times \(\tau =1/\gamma\) of interest are \(\tau =1.0\) (i.e. conventional modularity) and \(\tau =0.76\), which provides a finer-grained view of the network.
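The greedy logic described above can be illustrated with a single local-moving phase. The sketch below is deliberately simplified: it omits the aggregation step of Blondel et al. (2008) and the refinement step of the Leiden algorithm, so it is not the procedure used to produce the results reported here. The function name `local_moving`, the `order` argument, and the toy network are assumptions for illustration, and the matrix is assumed symmetric with no self-loops.

```python
def local_moving(A, gamma=1.0, order=None):
    """Greedy local-moving phase on a symmetric weighted matrix A.

    Starts from singleton communities and repeatedly moves each node to the
    linked community giving the largest increase in Q_conf, until no single
    move improves it. Returns a list of community labels.
    """
    n = len(A)
    k = [sum(row) for row in A]      # node strengths
    two_m = sum(k)                   # 2m
    comm = list(range(n))            # each node starts in its own community
    tot = k[:]                       # total strength of each community
    order = list(range(n)) if order is None else list(order)

    moved = True
    while moved:
        moved = False
        for i in order:
            old = comm[i]
            tot[old] -= k[i]         # temporarily take i out of its community
            # Total weight from i to each community it is linked to.
            w = {}
            for j in range(n):
                if j != i and A[i][j] > 0:
                    w[comm[j]] = w.get(comm[j], 0.0) + A[i][j]
            # Gain in Q_conf (up to a factor 1/m) of placing i in community c.
            best = old
            best_gain = w.get(old, 0.0) - gamma * k[i] * tot[old] / two_m
            for c, w_c in w.items():
                gain = w_c - gamma * k[i] * tot[c] / two_m
                if gain > best_gain + 1e-12:   # require a strict improvement
                    best, best_gain = c, gain
            comm[i] = best
            tot[best] += k[i]
            if best != old:
                moved = True
    return comm


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = local_moving(A)
```

Only the neighbours of a node are inspected when it is visited, which is the locality that makes the approach efficient. Passing a different `order` changes the sequence of visits, which is the source of the stochasticity mentioned above.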
We display the best partitions \(X^{(t)}\) found from applying this optimisation process, with \(\tau =1.0\), to the network constructed for each time period in Fig. 5e. Following a similar approach to that of Pietilänen and Diot (2012) and Fagan et al. (2018), ‘flows’ between two communities A and B are scaled according to the Jaccard index \(J(A,B)=|A\cap B|/|A\cup B|\). We first assign each community a colour arbitrarily, then compare adjacent time periods and retain the previous colour if \(J(A,B)>0.6\). The white community corresponds to countries outside of the time period under consideration. This figure contains a wealth of information, for instance evidencing that collaboration patterns changed more frequently in earlier, more turbulent decades, before beginning to settle from 1995 onwards. It may be seen, for example, that Europe consolidates as a block at this scale from 1995 onwards, shortly after the formation of the European Economic Area (EEA). We observe that in the final time period there are four communities, which roughly correspond to Europe and Latin America; North America with China; Australia and nearby countries; and the rest of the world. The community of North America et al. may be an artefact of the USA and China being the two major global producers, and suggests that an alternative null model could be more suitable depending on the goal of analysis. We explore the deviation from the null model further in “Appendix 5”, but leave the development of such an alternative to future work.
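The colour-retention rule can be sketched as follows. The function names, the colour labels, and the toy communities are hypothetical. Because the communities of a partition are disjoint, a threshold above 0.5 guarantees that at most one earlier community can pass on its colour to a given later community.

```python
def jaccard(a, b):
    """Jaccard index J(A, B) = |A ∩ B| / |A ∪ B| of two sets of countries."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def propagate_colours(prev, curr, prev_colours, fresh_colours, threshold=0.6):
    """Assign colours to the communities of the current period.

    prev, curr: {community id: set of countries} for adjacent periods.
    prev_colours: {previous community id: colour}.
    fresh_colours: iterator yielding colours not yet in use.
    """
    colours = {}
    for cid, members in curr.items():
        # Find the best-matching community of the previous period.
        best_id = max(prev, key=lambda pid: jaccard(prev[pid], members))
        if jaccard(prev[best_id], members) > threshold:
            colours[cid] = prev_colours[best_id]   # retain the previous colour
        else:
            colours[cid] = next(fresh_colours)     # treat as a new community
    return colours


# Hypothetical toy partitions for two adjacent periods.
prev = {0: {"FRA", "DEU", "ITA"}, 1: {"USA", "CAN"}}
curr = {0: {"FRA", "DEU", "ITA", "ESP"}, 1: {"USA", "CHN"}}
colours = propagate_colours(prev, curr, {0: "blue", 1: "red"}, iter(["green"]))
```

Here the first community keeps its colour (J = 3/4), while the second is recoloured because only one of its three distinct members persists (J = 1/3).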
In order to further investigate the rate of change of the modular structure over time, and the observed ‘regionalisation’ of research collaboration ties, we wish to quantify the similarity between each partition and its preceding partition, and between each partition and the ‘continental partition’ (where countries are assigned to a community based on continent). While the Jaccard index is a good measure for comparing pairs of communities, to compare partitions we instead calculate the normalised mutual information (also known as the symmetric uncertainty (Witten and Frank 2002)). This is defined by
$$\begin{aligned} NMI(X,Y)=-2\frac{\sum _{i,j} r_{ij}\log \left( r_{ij}/p_i q_j\right) }{\sum _k p_k\log p_k + \sum _\ell q_\ell \log q_\ell }, \end{aligned}$$
(10)
for two partitions X and Y, where n is the number of nodes, \(p_{i}=|X_{i}|/n\) is the share of nodes in community i of X, \(q_{j}=|Y_{j}|/n\) is the share of nodes in community j of Y, and \(r_{ij}=|X_{i}\cap Y_{j}|/n\) is the share of nodes in both community i of X and community j of Y.
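As an illustrative sketch, the normalised mutual information can be computed from two label lists as twice the mutual information divided by the sum of the two partition entropies, with the entropies taken as positive quantities so that identical partitions score 1. The function name `nmi` and the convention of returning 1 when both partitions are trivial are assumptions.

```python
from collections import Counter
from math import log


def nmi(x, y):
    """Normalised mutual information (symmetric uncertainty) of two partitions.

    x[v] and y[v] are the community labels of node v in partitions X and Y.
    """
    n = len(x)
    p = Counter(x)            # community sizes in X, so p_i = p[i] / n
    q = Counter(y)            # community sizes in Y, so q_j = q[j] / n
    r = Counter(zip(x, y))    # overlap sizes, so r_ij = r[(i, j)] / n
    mutual = sum(
        (c / n) * log((c / n) / ((p[i] / n) * (q[j] / n)))
        for (i, j), c in r.items()
    )
    h_x = -sum((c / n) * log(c / n) for c in p.values())   # entropy of X
    h_y = -sum((c / n) * log(c / n) for c in q.values())   # entropy of Y
    if h_x + h_y == 0:
        return 1.0   # assumed convention: both partitions are a single community
    return 2 * mutual / (h_x + h_y)


same = nmi([0, 0, 1, 1], [1, 1, 0, 0])      # same grouping, labels swapped
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # groupings carry no shared information
```

The first comparison gives 1, since the measure depends only on the grouping and not on the labels; the second gives 0.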
We compare the partitions obtained in adjacent time-steps by calculating the normalised mutual information, i.e. \(NMI(X^{(t)},X^{(t+1)})\), where \(X^{(t)}\) is the partition obtained for time period t. In Fig. 5f, we display the values of this function over time at two different scales, with \(\tau =1.0\) shown solid, and the finer scale \(\tau =0.76\) shown dashed. We observe that after an initial period of change, recent years have seen relatively stable global research communities form at the finer scale, while at the more aggregate scale there is still some change (primarily due to splits in the large ‘rest of the world’ community shown in purple). Next, we construct a new partition, C, which divides the world into five continents (communities): each country is assigned to its continent, i.e. Africa, America, Asia, Europe, or Oceania. In order to see how similar each partition is to this continental partition, we calculate \(NMI(X^{(t)},C)\) for all t. Figure 5g confirms what we had suspected from previous figures, in that there has been a clear trend towards regionalisation of research ties at both scales, particularly between 1990 and 2010.
As a final check, we compare the stability of each detected partition to the stability of the continental partition. Since stability is a measure of partition quality, we would expect the stability of the continental partition to approach that of the detected partition in later time periods. It is important to understand the difference in quality between these partitions, particularly as there is inherent randomness in the optimisation algorithm used, and it only guarantees convergence to a local optimum. In other words, the ‘optimal’ partition we find could in fact be only marginally better than the continental partition in early decades, even if the partitions themselves were very different as measured by NMI. We cannot compare raw values of stability across time, as it varies with network size, density, etc.; as such, we compute the ratio
$$\begin{aligned} Q_{rat}(X^{(t)},C)=\frac{Q_{conf}(X^{(t)})}{Q_{conf}(C)}. \end{aligned}$$
(11)
The ratio of the stability scores tells us how well the geographic (or continental) partition ‘performs’ as a set of communities compared to those detected by our community detection algorithm. Figure 5h confirms that, as expected, this ratio declines over time. More specifically, we observe that the continental partition was of significantly lower quality in earlier time periods, particularly for the scale with \(\tau =1.0\), suggesting this was not a good ‘description’ of the network structure at that time. In later periods, the ratio approaches 1 (shown dashed red) at both scales, suggesting that the continental partition is increasingly a good fit for the network structure.