Author multidisciplinarity and disciplinary roles in field of study networks

Cunningham, Eoghan; Smyth, Barry; Greene, Derek

doi:10.1007/s41109-022-00517-4

Research
Open access
Published: 18 November 2022

Eoghan Cunningham^1,2,
Barry Smyth^1,2 &
Derek Greene^1,2

Applied Network Science volume 7, Article number: 78 (2022)

Abstract

When studying large research corpora, “distant reading” methods are vital to understand the topics and trends in the corresponding research space. In particular, given the recognised benefits of multidisciplinary research, it may be important to map schools or communities of diverse research topics, and to understand the multidisciplinary role that topics play within and between these communities. This work proposes Field of Study (FoS) networks as a novel network representation for use in scientometric analysis. We describe the formation of FoS networks, which relate research topics according to the authors who publish in them, from corpora of articles in which fields of study can be identified. FoS networks are particularly useful for the distant reading of large datasets of research papers when analysed through the lens of exploring multidisciplinary science. In an evolving scientific landscape, modular communities in FoS networks offer an alternative categorisation strategy for research topics and sub-disciplines, when compared to traditional prescribed discipline classification schemes. Furthermore, structural role analysis of FoS networks can highlight important characteristics of topics in such communities. To support this, we present two case studies which explore multidisciplinary research in corpora of varying size and scope; namely, 6,323 articles relating to network science research and 4,184,011 articles relating to research on the COVID-19 pandemic.

Introduction

With the increased recognition of the benefits of multidisciplinary and interdisciplinary collaboration (Larivière et al. 2015; Okamura 2019), a trend has recently been established towards greater levels of interdisciplinary research (Leahey 2016). A common approach for understanding these research processes is through the lens of network analysis. For instance, given a corpus of research papers and their associated metadata, we can construct a variety of network representations to reveal different aspects of the underlying data, such as co-authorship networks (Feng and Kirkley 2020; Glänzel and Schubert 2004), citation networks (Karunan et al. 2017), and co-citation networks (Gmür 2003). These different representations can help us to identify collaboration patterns between individual researchers at a micro level. In other cases we might be more interested in examining collaboration patterns between researchers coming from different disciplines at the macro level. For example, we might wish to study how these patterns evolve over time in response to a changing research funding landscape or impactful exogenous events, such as the COVID-19 pandemic.

In this work, we propose a practical “distant reading” approach to help reveal collaboration patterns in large scientific corpora in order to understand better the nature and implications of these patterns. Distant reading has been used in other contexts, such as digital humanities, as a means of exploring large volumes of data from a macro level view, in order to identify specific areas of interest for closer examination (Moretti 2013). As the core contribution of this work, we present a novel graph representation, referred to as the Field of Study (FoS) network, which facilitates the investigation of multidisciplinary and interdisciplinary research in corpora of scientific research articles at the macro level. A key aspect of field of study networks is the use of author-topic relations. Specifically, a FoS network is populated by fields of study (or research topics), which are related to one another according to the authors who publish in them. In section "Methods" we describe how these networks can be constructed from the topics/fields of study that have been assigned to research papers. Later in section "Case studies" we describe two case studies, which analyse the FoS networks arising from datasets of differing scope and size. The first case study in section "Multidisciplinary research in network science" relates to multidisciplinary research in the area of applied network science, while the second study in section "Author multidisciplinarity in COVID-19 research" pertains to the changing nature of author multidisciplinarity in response to the COVID-19 pandemic. These case studies demonstrate that FoS networks can provide a useful tool for the distant reading of large collections of research articles. In particular, we show how simple characteristics computed on a FoS network can highlight important topics in the research corpus.
Further, we use community detection methods to identify specific multidisciplinary schools within a larger body of research, and we conduct a ‘role analysis’ of the topics within these communities to understand the role that they play in multidisciplinary collaborations. Crucially, we demonstrate our methods using datasets of varying size and scope, and, finally, we discuss some techniques that may be employed to drill down on interesting interactions in the graph for further “close reading”.

Related work

While a range of different definitions exist for multidisciplinary research, it is most commonly characterised as work which draws on expertise, data or methodology from two or more distinct disciplines. Most formal definitions distinguish interdisciplinary research as an extension of multidisciplinary research, which involves the integration of methods from the contributing disciplines (Choi and Pak 2006). There are numerous analyses which have explored multi- or interdisciplinary research, and investigated the relationship between different scientific disciplines. Many of these studies proposed metrics to quantify research interdisciplinarity, either at the author or at the paper level (Rafols and Meyer 2010; Porter et al. 2007), often in order to investigate a correlation between interdisciplinarity and research impact (Larivière et al. 2015; Okamura 2019), productivity or visibility (Leahey et al. 2017). Typically, works which integrate methods and ideas from a diverse set of disciplines are found to have greater research impact and visibility compared to those that do not (Okamura 2019; Leahey et al. 2017). Notably, there are several examples of works which have investigated cross-disciplinary collaboration, often drawing on representations and methods from network science (Feng and Kirkley 2020; Karunan et al. 2017; Wu et al. 2019; Raimbault 2019; Lafia et al. 2021).

Most frequently, co-authorship networks have been used as a means of representing the collaborations between different researchers, both in small-scale studies and when analysing large-scale bibliographic collections (Arnaboldi et al. 2016). In this type of network, researchers are represented by nodes and collaborations (i.e., articles jointly authored by a pair of researchers) are encoded by the undirected edges between them. Thus, research teams are identified as fully-connected components of the graph. In cases where research backgrounds can be identified among the authors in the network, this can be used to measure the extent to which authors engage in multidisciplinary collaborations. The analysis of co-authorship networks has often revealed a strong disciplinary homophily between researchers, despite the fact that those with diverse neighbourhoods in these networks tend to have a higher level of research impact (Feng and Kirkley 2020).

Another common representation used to investigate interdisciplinary research is the citation network, which is typically constructed at the article or journal level (Newman 2018). Analyses of citation networks can highlight influential or “disruptive” articles in interdisciplinary research (Wu et al. 2019), as well as “boundary” papers which span multiple disciplines (Karunan et al. 2017). Indeed, community-finding approaches have been employed to automatically group articles in citation networks into their respective fields of study (Raimbault 2019), so that interdisciplinary interactions can then be explored at the macro level.

An alternative strategy for analysing research collections is to apply text mining to article abstracts or full-texts in order to group together articles which relate to similar research themes, using techniques such as document clustering or topic modelling (Raimbault 2019; Lafia et al. 2021; Yau et al. 2014). This grouping is typically based on word co-occurrence patterns, rather than on article citation patterns. Of course, the patterns which emerge from textual analysis can be quite different from those generated using network-based approaches, as fields of study which are distant in their authorship or citation representations may still be closely linked semantically, and vice versa.

Recent works have implemented heterogeneous network structures to represent authors and papers in the same network (Zhao et al. 2019; Zhou et al. 2007), using both citation and authorship links between entities. These complex networks offer an information-rich representation of a research corpus and can be viewed through multiple lenses (i.e. projections) to explore citation, co-citation and co-authorship relations.

Here we propose an alternative network representation, which relates fields of study according to the authors who typically publish in those fields. This network construction is intended for distant-reading large bodies of research to identify macro-level relations and collaborations between research topics. As such, a succinct representation is required, since visualising networks containing hundreds of thousands of authors or articles is not feasible. In section "Methods" we describe the formation of an FoS network as a projection of a heterogeneous graph containing authors, articles and topics (or fields of study). In section "Author multidisciplinarity in COVID-19 research" we show that, on their own, FoS networks can provide an effective means of exploring large scientific collections, particularly in revealing aspects around author multidisciplinarity. In section "Conclusions", we discuss how the FoS projection may be implemented within a larger multi-layer network framework, as one lens through which collaboration can be viewed (alongside established methods like co-authorship and citation analysis).

Methods

In this section we formalise the definition of a Field of Study (FoS) network and explain how such a network can be generated from existing research resources. In sections "Static FoS networks" and "Temporal FoS networks" we describe two FoS variations: the static FoS network and the temporal FoS network respectively.

Field of study networks

Formally, a Field of Study network is defined as a general graph representation of a collection of research articles (R), written by a set of authors (A), and denoted $F = (N, E)$. The nodes (N) represent identifiable research topics (i.e. the fields of study) and the edges (E) represent authorship relations between pairs of topics. These relations are aggregated across multiple associated research papers. Below we describe how a FoS network can be constructed from a more conventional authorship graph and we argue that FoS networks are particularly well-suited to analysing the nature of collaboration within the scientific literature, especially as they relate scientific fields of study according to the researchers/authors who publish in them.

The formation of a FoS network depends on the availability of appropriate fields of study labels for a given set of research papers. These could be derived via manual annotations by domain experts, the application of automated text mining methods, or some combination of the two. For instance, topic modelling techniques have been shown to be successful in extracting research topics from corpora of research articles and assigning papers to those fields (Lafia et al. 2021; Paul and Girju 2009).

In fact, many research databases and search engines employ these techniques (or manual classification) to assign articles or academic journals to fields of study. For example, the Microsoft Academic Graph (MAG)^{Footnote 1} maintains a deep hierarchy of Fields of Study which it assigns to papers; Web of Science (WOS)^{Footnote 2} groups journals into 258 Subject Categories; Scopus^{Footnote 3} employs experts to assign All Science Journal Classification (ASJC) codes to all journals covered by its index. For the purpose of the case studies described later in section "Case studies", we use MAG fields of study to categorise research papers and construct FoS networks. The deep MAG field of study hierarchy is desirable as it supports the construction of FoS networks at varying levels of detail, from the broadest research disciplines (level 0) to the specific topics and sub-topics that exist within a particular discipline (levels 4 and 5).

It is important to note that the Microsoft Academic Graph may not always be an appropriate source for field of study data. For instance, the corpus does not provide full coverage of all research disciplines and the corresponding hierarchy of fields may contain some spurious connections due to its size and the semi-automated nature of its construction. However, the methods that we propose are not specific to the MAG hierarchy. Rather, they are agnostic in the sense that they are designed to generalise to any case where fields of study can be identified at an appropriate level of detail.

Static FoS networks

The formation of a static FoS network from a collection of research articles is best described as the two-step process illustrated in Fig. 1. In the first step, an unweighted heterogeneous graph is generated from research articles, identifiable fields of study and their contributing authors; see Fig. 1a. Each article has authorship relations, linking the article to its authors, and topic relations, linking the article to its identifiable Fields of Study. In the second step, this graph is used to generate a projection (the FoS Network) in which fields are connected according to field-article-author-article-field meta-paths in the heterogeneous network. In other words, a weighted undirected edge exists between two fields if and only if at least one author has published research in both fields; see Equation 1, where $a$ ranges over the authors in $A$ and N is the set of fields identifiable in R. The resulting edge weights correspond to the number of such authors who publish in both fields (Equation 2).

$$E = \big\{(n_i, n_j)~:~published(a, n_i)~\wedge~published(a, n_j)\big\} \tag{1}$$

$$w\big(n_i, n_j\big) = \big|\big\{a~:~published(a, n_i)~\wedge~published(a, n_j)\big\}\big| \tag{2}$$
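
As a minimal sketch of Equations 1 and 2, the following Python snippet builds the weighted edge set of a static FoS network from author-field pairs. The function name `build_static_fos`, the input format and the toy records are illustrative assumptions, not the authors' implementation. Self-loops are omitted by assumption, since Equation 1 does not state whether $n_i = n_j$ is permitted.

```python
from collections import defaultdict
from itertools import combinations


def build_static_fos(records):
    """Build weighted FoS edges from (author, field) pairs.

    `records` is a hypothetical input format: one pair for each
    field in which an author has published at least one article.
    Returns {(n_i, n_j): weight}, with each key sorted alphabetically.
    """
    fields_by_author = defaultdict(set)
    for author, field in records:
        fields_by_author[author].add(field)

    weights = defaultdict(int)
    for fields in fields_by_author.values():
        # Each author contributes 1 to every pair of fields they publish in
        # (Equation 2); a pair appears at all once one author links it
        # (Equation 1).
        for n_i, n_j in combinations(sorted(fields), 2):
            weights[(n_i, n_j)] += 1
    return dict(weights)


# Toy data, invented purely for illustration.
toy_records = [
    ("alice", "Virology"), ("alice", "Immunology"),
    ("bob", "Virology"), ("bob", "Immunology"), ("bob", "Statistics"),
    ("carol", "Statistics"),
]
fos_weights = build_static_fos(toy_records)
```

Here the edge between 'Immunology' and 'Virology' receives weight 2 because two authors publish in both fields, while an author with a single field contributes no edges.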

Temporal FoS networks

It is further possible to encode temporal information in a FoS Network as directed edges, which allows us to study changes in multidisciplinary research patterns over time. Temporal FoS networks can be visualised in a time-unfolded representation, where the data is divided into a sequence of two or more discrete time steps, as frequently employed in dynamic network analysis tasks. Field nodes are duplicated for each time step, so that each paper is connected to its fields within the time step in which it was published.

As an example, Figs. 2a, b illustrate the two stages in the formation of a temporal FoS network, showing an instance of a temporal FoS network with respect to two time-points ($t_n$ and $t_{n+1}$) on either side of some event (e), so that $t_n<t_e<t_{n+1}$. The temporal FoS network in Fig. 2b contains a directed edge between two fields $(n_i,n_j)$ if an author published a paper in field $n_i$ at time $t_n$ (before event e) and another in field $n_j$ at time $t_{n+1}$ (after event e), as given by

$$E' = \big\{(n_i, n_j)~:~published(a, n_i, t_n)~\wedge~published(a, n_j, t_{n+1})\big\} \tag{3}$$
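
Equation 3 can be sketched in the same style. The snippet below is an illustration under stated assumptions: the function name `build_temporal_fos`, the (author, field, year) input format and the single cut-off `event_time` are hypothetical, and weighting each directed edge by the number of linking authors is an assumption made by analogy with Equation 2, since Equation 3 defines only the edge set.

```python
from collections import defaultdict


def build_temporal_fos(records, event_time):
    """Build directed FoS edges across an event.

    `records` holds hypothetical (author, field, year) triples.
    An edge (n_i, n_j) means some author published in n_i before
    `event_time` and in n_j at or after it.
    """
    before = defaultdict(set)
    after = defaultdict(set)
    for author, field, year in records:
        target = before if year < event_time else after
        target[author].add(field)

    weights = defaultdict(int)
    # Only authors active in both time steps can create directed edges.
    for author in before.keys() & after.keys():
        for n_i in before[author]:
            for n_j in after[author]:
                weights[(n_i, n_j)] += 1
    return dict(weights)


# Toy data, invented purely for illustration.
toy_temporal_records = [
    ("alice", "Economics", 2018), ("alice", "Virology", 2020),
    ("bob", "Statistics", 2017), ("bob", "Virology", 2020),
    ("bob", "Statistics", 2020),
    ("carol", "Economics", 2019),
]
temporal_edges = build_temporal_fos(toy_temporal_records, event_time=2020)
```

Because field nodes are duplicated for each time step, an edge such as ('Statistics', 'Statistics') is not a self-loop in the time-unfolded view: it links the earlier copy of the field to the later one.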

In the next section we present two illustrative examples which demonstrate the utility of static and temporal FoS representations, as described above.

Case studies

In our first case study, presented in section "Multidisciplinary research in network science", we consider the use of static FoS networks to explore aspects of multidisciplinary research in the area of network science. The second case study, described in section "Author multidisciplinarity in COVID-19 research", considers the use of both static and temporal FoS networks in the context of a large-scale dataset of research publications relating to the COVID-19 pandemic.

Multidisciplinary research in network science

Network construction Firstly, we focus on research published in the journal Applied Network Science (ANS)^{Footnote 4}, to use as a smaller case study with which we can highlight our methods. We choose ANS as it is a journal with multidisciplinary implications, and we consider the year 2019 as the period with the best coverage in our data source. Figure 3 presents two resulting static FoS networks, which we create to explore author multidisciplinarity in our data. These networks are produced using Microsoft Academic Graph metadata for 6,323 research articles. This set of articles represents 131 papers published in the journal Applied Network Science, supplemented by any additional research published by the same authors in the three years prior (2016–2018 inclusive). We use MAG fields of study metadata to categorise these research papers. The MAG uses hierarchical topic modelling to identify and assign research topics to individual papers, each of which represents a specific field of study (Shen et al. 2018). To produce a more useful categorisation of articles, we consider only those topics at the first two levels of the MAG hierarchy:

1. The 19 field labels at level 0, which we refer to as ‘disciplines’.
2. The 292 field labels at level 1, which we refer to as ‘sub-disciplines’.

Thus, each article is associated with at least one discipline (e.g. ‘Medicine’, ‘Physics’, ‘Engineering’) and at least one sub-discipline (e.g. ‘Virology’, ‘Particle Physics’, ‘Electronic Engineering’). Note some MAG sub-disciplines belong to more than one discipline. For example, ‘Biochemistry’ is a child of both ‘Chemistry’ and ‘Biology’.

To centre the FoS networks in Applied Network Science research, we include only those edges that originate from ANS papers. To perform role discovery using the struc2vec algorithm (Ribeiro et al. 2017), we require as input a representation with unweighted edges. For this purpose, we apply weight thresholding to represent the FoS network as an unweighted graph. All subsequent analysis is completed on the unweighted graph produced with threshold 5, which corresponds to the mean edge weight in the original weighted network. In order to provide the clearest visualisations, we further prune the networks with threshold 10 before plotting. Figure 3a illustrates the resulting FoS network when network science articles are categorised at the discipline level. Each node (or discipline) in this FoS network can then be decomposed into its sub-disciplines, as shown in Fig. 3b.
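
The weight thresholding step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `threshold_edges` and the toy weights are hypothetical, and keeping edges whose weight is greater than or equal to the threshold is an assumption for this case study.

```python
from statistics import mean


def threshold_edges(weights, threshold=None):
    """Convert a weighted edge dict into an unweighted edge set.

    If no threshold is given, the mean edge weight is used.
    Edges with weight >= threshold are kept (inclusive by assumption).
    """
    if threshold is None:
        threshold = mean(weights.values())
    return {edge for edge, weight in weights.items() if weight >= threshold}


# Toy weights, invented purely for illustration (mean weight is 5).
toy_weights = {("A", "B"): 9, ("A", "C"): 1, ("B", "C"): 5}
analysis_edges = threshold_edges(toy_weights)             # mean threshold
plot_edges = threshold_edges(toy_weights, threshold=9)    # stricter pruning
```

Using a stricter second threshold only for plotting mirrors the two-stage pruning described above, where analysis and visualisation use different cut-offs.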

Network characterisation From Fig. 3, we can begin to understand the multidisciplinarity of authors publishing in Applied Network Science, as the nodes represent a diverse set of sub-disciplines, coloured according to their parent-disciplines. Highly central in Fig. 3b are the fields which represent the technical and methodological foundations of network science research. Sub-disciplines of Mathematics and Computer Science, such as ‘Theoretical Computer Science’ and ‘Topology’, have high degree centrality (ranked 1st and 4th respectively), because they are identified across the majority of network science research papers. Modern network science methods, such as ‘Artificial Intelligence’, ‘Machine Learning’ and applications, such as ‘Information Retrieval’, have similarly high degree centrality (ranked 2nd, 3rd, and 6th respectively). Some fields beyond the disciplines of Computer Science and Mathematics, such as ‘Applied Psychology’, ‘Econometrics’, and ‘Neuroscience’ have high betweenness centrality in the FoS Network (ranked 3rd, 5th and 8th, respectively). This is likely because they represent interdisciplinary applications of network science published by authors who have backgrounds in other, more distant topics. For example, in the bottom of Fig. 3b we can see a group of medical fields which are linked to topics in Mathematics and Computer Science through ‘Applied Psychology’ and ‘Social Psychology’.
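
To make the two centrality measures concrete, the following sketch computes degree centrality and unnormalised betweenness centrality on an unweighted, undirected edge set using only the Python standard library. Betweenness is computed with Brandes' algorithm. The helper names and the toy edges are illustrative assumptions; the toy graph is not taken from the ANS data.

```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency map from an edge set."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalised betweenness via Brandes' algorithm (unweighted)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                    # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair is counted from both endpoints, so halve.
    return {v: b / 2 for v, b in bc.items()}


# Toy graph: a tightly linked core plus a chain leading to a distant field.
toy_edges = {
    ("Surgery", "Applied Psychology"),
    ("Applied Psychology", "Topology"),
    ("Topology", "Theoretical Computer Science"),
    ("Topology", "Machine Learning"),
    ("Theoretical Computer Science", "Machine Learning"),
}
toy_adj = adjacency(toy_edges)
toy_degree = degree_centrality(toy_adj)
toy_betweenness = betweenness_centrality(toy_adj)
```

In this toy graph 'Applied Psychology' has only two neighbours but lies on every shortest path to 'Surgery', which is the kind of bridging position that high betweenness is meant to capture.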

Community detection The MAG FoS hierarchy offers one possible definition of science’s traditional disciplinary taxonomy, grouping fields (or sub-disciplines) into broader schools of research. We can explore an alternative categorisation of the topics in the ANS graph by employing community detection methods. Figure 4 shows the network from Fig. 3b, but with the nodes colour-coded to show cluster memberships identified using the Louvain method (Blondel et al. 2008) (with resolution parameter value 1.0). This technique identified 7 clusters which maximise modularity in the graph, and group topics according to authorship relations. Table 2 provides descriptive statistics for the communities. Such communities represent multidisciplinary clusters of fields across which authors (in particular, those authors who contributed to ANS in 2019) are likely to publish. Louvain found clusters containing as few as 2, and as many as 26 topics. Broadly, the clusters can be categorised as: (i) central applied network science topics and applications, (ii) networks in machine learning and neuroscience, (iii) psychology, biology and medicine, (iv) mathematics, statistics and natural language processing, (v) product development and process management, (vi) physics and economics, and (vii) transport networks and microeconomics.
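
The Louvain method searches for a partition with high modularity. The sketch below does not implement the Louvain search itself; it only evaluates the modularity of a given partition of an unweighted, undirected graph, which is the quantity being maximised. The function name `modularity`, the `resolution` argument and the toy graph are illustrative assumptions; with `resolution=1.0` the expression reduces to the standard modularity score.

```python
from collections import defaultdict


def modularity(edges, community_of, resolution=1.0):
    """Modularity of a partition of an unweighted, undirected graph.

    Q = sum over communities c of
        L_c / m - resolution * (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single edge (c, d).
toy_graph = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = dict.fromkeys("abcdef", 0)
q_split = modularity(toy_graph, split)
q_single = modularity(toy_graph, single)
```

Separating the two triangles scores higher than placing every node in one community, which is why a modularity-maximising method would prefer the split.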

Role analysis In addition to categorising ANS-related topics according to (i) a traditional disciplinary hierarchy (Fig. 3b), and (ii) author-related communities (Fig. 4), it is also possible to group fields according to the “role” they play within the Field of Study network. Using the popular struc2vec algorithm (Ribeiro et al. 2017), we learn dense vector representations for the fields in the FoS network, which preserve structural equivalence between nodes. That is, nodes having similar structural features in the graph will have similar representations (commonly known as their role embedding) (Rossi et al. 2020). We then cluster the embedding space to identify a discrete set of disciplinary roles. Figure 5 illustrates the role assignments learned in the ANS graph according to a k-means clustering ($k=9$) of struc2vec role embeddings, where $k=9$ represents the elbow of the curve when silhouette scores are plotted for clusterings of increasing values of k. Table 1 shows the mean network centrality scores computed for the different clusters such that we can explain the roles that they represent. Fields in cluster #1 exhibit “hub-like” behaviour, as they score highly for all centrality measures. For each of the largest Louvain communities (i.e. excluding communities (v) and (vii)), the most central node was assigned to role #1. We will refer to these as the “core” nodes since they represent the fields most commonly identified in ANS research and are the most central in the FoS graph. Clusters #6, #7, #8 and #9 all represent peripheral/leaf nodes with degree 1 and very low centrality scores. None of the topics in the peripheral clusters can be identified in ANS published research. Instead, these topics appear in the 2016–2018 portion of the data and we refer to them as “distant background” topics.

Clusters #5, #4 and #3 are made up of increasingly prevalent background topics. Similar to the distant background roles, a majority of the topics in these clusters never appear in ANS research published in 2019. However, with greater degree than the more peripheral nodes, topics in clusters #5, #4 and #3 appear more frequently in author backgrounds. In the particular case of cluster #3, we identify a set of “ANS-adjacent” disciplines, i.e. the fields in which ANS authors publish most readily. Finally, cluster #2 includes non-core topics that have high degree and betweenness centrality. The set of 9 fields in this cluster are separate from the dense communities at the core of the graph. Instead, topics like ‘Applied Psychology’, ‘Computational Biology’ and ‘Regional Science’ link distant background subjects to the rest of the network. With all fields in cluster #2 represented in ANS research published in 2019, we anticipate that the research assigned to these topics offers multidisciplinary applications of network science research, published by authors with diverse research backgrounds. The roles identified in clusters #1, #2 and #6 are apparent in clusterings with $5 \le k \le 10$ (i.e., identical clusters are found for those parameter values).
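
The choice of k in the role analysis relies on silhouette scores. As a hedged illustration of that score only, the sketch below computes the mean silhouette of a labelled set of points in pure Python. It does not reproduce struc2vec or k-means: the two-dimensional points stand in for role embeddings, and the function name and both candidate labelings are hypothetical.

```python
from collections import defaultdict
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points (requires at least two clusters).

    For each point, a is the mean distance to the other members of its
    own cluster and b is the smallest mean distance to any other cluster;
    the point's silhouette is (b - a) / max(a, b).
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for lab, other in clusters.items()
            if lab != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy "embeddings": two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])
```

In practice one would compute this score for the k-means clustering at each candidate k and inspect the resulting curve; the labeling that respects the two natural groups scores close to 1, while the mismatched labeling scores below 0.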

Table 1 Roles identified in Applied Network Science Research and their mean network attributes, including centrality scores, cluster size (count), and the proportion of topics in the cluster identified in ANS research (ANS prop)

Table 2 Louvain communities in ANS-related research and their size, network density, most central topics (according to degree centrality) and the most frequent MAG disciplines that are identified in them ($\ge 20\%$ of topics)

Author multidisciplinarity in COVID-19 research

Field of Study networks generated on yearly data snapshots have been implemented to quantify author multidisciplinarity, according to the extent to which authors publish across different disciplines (Cunningham et al. 2021, 2022). These studies show a stable trend with author multidisciplinarity increasing year-on-year, with a much larger than expected increase for COVID-19-related research. In particular, these analyses grouped research topics (sub-disciplines) according to the MAG disciplinary hierarchy. In the following case study, we explore richer groupings of COVID-19 related research topics in an FoS network, to identify modular communities of sub-disciplines, and to explore their disciplinary roles.

Network construction Using a large dataset of COVID-19 related research, the COVID-19 Open Research Dataset (CORD-19)^{Footnote 5}, we identify all authors who published COVID-19 related research in 2020, and collect MAG metadata for their COVID-19-related articles, along with any available articles that they published between 2016 and 2019, inclusive. The result is a corpus of 4,184,011 articles, of which 166,356 relate to COVID-19. We then construct a FoS network using MAG sub-disciplines identified in the papers. Similar to the ANS example in section "Multidisciplinary research in network science", we consider the graph induced by only those edges which originate in COVID-19 research. That is, we do not consider authorship relations between the topics in the pre-COVID-19 portion of the data (2016–2019). Again, we apply thresholding to produce an unweighted graph where edges with weight greater than or equal to the mean edge weight (50) are preserved.

Community detection When applied to the COVID-19-related FoS network, the Louvain (Blondel et al. 2008) method (with resolution 1.0) identifies 7 communities, leaving 42 nodes unassigned to any community. Summary statistics for these communities are provided in Table 4. Community (i) groups the core topics in Medicine. It is a dense community with many authors publishing across almost all pairs of topics. ‘Surgery’, ‘Pathology’ and ‘Radiology’ are the most central fields. Community (ii) is more multidisciplinary than community (i). In addition to many medical fields (‘Intensive Care Medicine’, ‘Emergency Medicine’, etc.), it contains a number of sub-disciplines in Engineering (e.g. ‘Engineering Management’ and ‘Electrical Engineering’). As such, the authors who link topics in this community may represent those who tackled the medical emergency posed by the pandemic and, in particular, the challenges associated with the massive strain on intensive care units and relevant equipment like ventilators. Community (iii) clearly demarcates those topics relevant to the study of the socioeconomic implications of the pandemic. In addition to topics in Economics, this community links many sub-disciplines of Business and Sociology (e.g. ‘Financial Systems’ and ‘Demography’).

Topics in Biology, Chemistry, Physics and Engineering are linked in community (iv). As the largest and least dense of the communities, (iv) represents the many STEM research areas that are relevant to the study of epidemiology. ‘Virology’, ‘Immunology’, ‘Computational Biology’ and ‘Pharmacology’ are among the most central sub-disciplines in community (iv). Community (v) contains topics relevant to Machine Learning and Mathematics and is likely formed as a result of the sizeable effort to apply machine learning and data science methods to detecting and tracking the spread of COVID-19 (Nguyen et al. 2021). Finally, communities (vi) and (vii) represent the smallest and most dense communities in the FoS network.

The topics in community (vi) relate to studies of the environmental impact of the COVID-19 pandemic and associated lockdowns, while nodes in community (vii) are related to ‘Astrophysics’. Further inspection of the sub-disciplines in community (vii) (‘Astrophysics’, ‘Astronomy’, ‘Classical Mechanics’, and ‘Computer Engineering’) highlights a portion of the CORD-19 dataset that is unrelated to the COVID-19 pandemic. We believe these papers were included in the collection in error. The modular FoS communities represent groups of topics which are strongly related according to the authors who publish in them. As such, these communities highlight the different schools/disciplines which emerged in COVID-19 research, together with the different research backgrounds and expertise with which authors contributed to them. Crucially, these disciplines offer an alternative classification of sub-disciplines to the more traditional MAG scheme, highlighting instead a more nuanced, multidisciplinary set of topics, specific to the pandemic.

Role analysis We also conduct a role analysis of the topics in the COVID-19-related FoS network, using the methods described in the ANS case study above. As before, we identify a discrete set of roles via k-means clustering of struc2vec role embeddings. We consider an optimal clustering to be the elbow of the silhouette score curve when plotted for increasing values of k. Consistent with the greater scope of the COVID-19-related dataset (when compared with that of the ANS dataset), we identify a larger set of clusters in the COVID-19-related FoS network ($k = 21$). Statistics for these clusters are provided in Table 3. Although the clusters appear more numerous and complex than in the ANS case study, a number of distinct roles are evident. We now discuss the predominant roles in turn.

The disciplinary hubs in the graph are captured in role #1. ‘Internal Medicine’, ‘Environmental Health’, ‘Virology’, and ‘Artificial Intelligence’ are clustered in role #1 as the core nodes in the network, with each topic among the most central nodes in the Louvain communities (ii), (iii), (iv) and (iv) respectively. Topics in role #4 have high betweenness scores, despite being outside the most central core of the graph (according to eigenvector centrality). Similar to role #2 in the ANS case study, these topics likely play a bridging role, linking otherwise disconnected topics to the rest of the graph. Role #4 contains topics such as ‘Economic Growth’, ‘Algorithm’, ‘Social Psychology’ and ‘Risk Analysis’, which all fall outside the scope of virology or epidemiology. These topics occur in COVID-19-related research that is published by authors with research backgrounds that are more peripheral in the graph. We hypothesise that topics attributed to this bridging role occur in multidisciplinary applications of one or more fields to an external problem. A similar bridging role may be described by role #9, which has high betweenness centrality (ranked 4th), but relatively low eigenvector centrality (ranked 9th). With lower eigenvector centrality, it is unlikely that nodes in role #9 are adjacent to highly central topics in the graph and, as such, they likely represent more peripheral “bridging” disciplines. Topics in role #9 are ‘Composite Materials’, ‘Computer Network’, ‘Atmospheric Sciences’ and ‘Climatology’. The largest cluster in the graph is role #15. With relatively low degree (mean = 3.2, median = 3) and greater eigenvector centrality than nodes with similarly low degree, it is likely that this cluster represents background topics which are adjacent to two or more of the more central COVID-19-related topics. Although the topics in this role are quite diverse, the cluster contains many sub-disciplines of Engineering, Chemistry, and Physics.

Through the role analysis outlined above, it is possible to further categorise the topics in the FoS graph according to the role they play within and between the disciplines described by the Louvain clusters. In particular, we identify those topics that (i) are at the core of the discipline(s) (i.e., hubs), (ii) represent multidisciplinary applications (i.e., bridges), and (iii) are relevant in the research backgrounds of contributing authors (i.e., leaf or peripheral nodes).
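The choice of the number of roles in the analysis above relies on a silhouette score curve over candidate values of k. As a minimal sketch of how such a curve could be computed, the code below evaluates the mean silhouette coefficient for a set of candidate clusterings. It assumes Euclidean distance between embedding vectors and that cluster labels for each k have already been produced by some clustering step; the function names and the toy points are hypothetical, and locating the elbow of the resulting curve is left to inspection.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


def silhouette_curve(points, labelings):
    """Map each candidate k to the silhouette score of its labelling."""
    return {k: silhouette_score(points, labels) for k, labels in labelings.items()}


# Hypothetical 2-D "embeddings" forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
candidate_labelings = {
    2: [0, 0, 1, 1],  # matches the two groups
    3: [0, 0, 1, 2],  # splits one group into singletons
}
curve = silhouette_curve(points, candidate_labelings)
```

Here the two-cluster labelling scores higher than the three-cluster one, as expected for data with two clear groups. In practice the labellings would come from running k-means on the role embeddings for each candidate k.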

Close reading

For very large datasets, such as the COVID-19-related research explored in this case study, we rely on computational methods such as community detection and role analysis to understand the relationship between fields of study. Such methods describe the network structure and the multidisciplinary role of the associated research topics as we have shown. Additionally, these methods can highlight cases of multidisciplinary research which can be explored in greater detail. For example, Fig. 6 presents the FoS subgraph containing the topics in Louvain community (iii). We highlight these topics as they represent one of the larger, more multidisciplinary communities that were identified in the COVID-19-related research dataset. This community groups many topics from the disciplines Medicine, Economics, Psychology, Sociology and Political Science. The authors who link these topics likely represent those who contributed research relating to the socioeconomic impact of the pandemic. Highly central in the subgraph are sub-disciplines of Medicine such as ‘Family Medicine’ and ‘Gerontology’ (the study of the social, psychological and biological aspects of ageing), in addition to non-Medical topics like ‘Economic Growth’ and ‘Demography’ (the statistical study of populations).

Many topics from Psychology and Economics are present in the more peripheral nodes in the graph, as are sub-disciplines of Mathematics and Computer Science (e.g. ‘Internet Privacy’ and ‘Statistics’) and even topics from Political Science (e.g. ‘Public Relations’ and ‘Public Administration’). The FoS subgraph helps to illustrate the highly diverse school of research that developed around the study of the socioeconomic impacts of the pandemic. We can further investigate the multidisciplinary nature of the research in this subset by using Temporal FoS networks to compare the pre-COVID (2016–2019) and COVID (COVID-19 related research in 2020) time periods. For example, we might ask the question – ‘What were the research backgrounds/expertise of the authors who published COVID-19-related research in the field of Economics?’.

Figure 7 presents COVID-19 related research in the field of Economics, with pre-COVID nodes on the left (representing the authors’ research backgrounds) and COVID nodes on the right (representing the FoS characterisation of the COVID related research). To highlight the strongest trends that exist, the FoS network shows only the top-30 edges by weight prior to thresholding. The multidisciplinary nature of this research subset is apparent in the diverse set of topics illustrated on the left hand side of the plot. In accordance with the broad spectrum of factors (social, political and economic) which influenced economic growth during the pandemic, we identify many authors who have published previously in sociology, psychology and political science in the graph. Additionally, those topics which may have useful, transferable skills such as ‘Statistics’ and ‘Data Science’ are also found to contribute.
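Restricting a plot to the heaviest edges, as done for Figure 7, amounts to a simple top-n selection. The following is a minimal sketch that assumes the temporal FoS network is available as a mapping from (pre-COVID topic, COVID topic) pairs to edge weights; the function name `top_edges` and the toy weights are hypothetical and are not taken from the dataset.

```python
from heapq import nlargest


def top_edges(edge_weights, n=30):
    """Return the n heaviest edges as (source, target, weight), heaviest first."""
    heaviest = nlargest(n, edge_weights.items(), key=lambda item: item[1])
    return [(src, dst, weight) for (src, dst), weight in heaviest]


# Hypothetical weights for edges pointing into a single COVID-period topic.
toy_weights = {
    ("Social Psychology", "Economic Growth"): 5,
    ("Statistics", "Economic Growth"): 2,
    ("Political Science", "Economic Growth"): 9,
}
strongest = top_edges(toy_weights, n=2)
```

Using `heapq.nlargest` avoids sorting the full edge list, which matters little for a toy example but can help when the network contains many edges and only a small number are displayed.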

To conduct further close reading, we can filter the list of articles by considering only those papers that contribute a particular edge to the FoS network. For example, we can search for COVID-related papers in the field-paper-author-paper-field meta-path between ‘Social Psychology’ and ‘Economic Growth’ in the time-unfolded heterogeneous graph. These will correspond to COVID-related articles containing the topic ‘Economic Growth’, in which the authors have previously published research in the field of social psychology. To better understand the papers in this subset, we can explore the lower-level MAG topics that are most commonly identified amongst them, or the keywords which occur most frequently in their titles and abstracts.
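The edge-based filtering described above can be sketched as a two-step lookup: first collect the authors who published in the background field during the pre-COVID period, then keep the COVID-related papers in the target field that share at least one of those authors. The record layout (`Paper`), the function name `papers_on_edge`, and the toy records are assumptions made for illustration; only the two time periods (2016–2019 and 2020) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str
    year: int
    authors: frozenset
    fields: frozenset
    covid_related: bool = False


def papers_on_edge(papers, background_field, covid_field,
                   pre_years=range(2016, 2020), covid_year=2020):
    """COVID-related papers in covid_field with an author who previously
    published in background_field (field-paper-author-paper-field path)."""
    background_authors = set()
    for paper in papers:
        if paper.year in pre_years and background_field in paper.fields:
            background_authors |= paper.authors
    return [
        paper for paper in papers
        if paper.covid_related
        and paper.year == covid_year
        and covid_field in paper.fields
        and paper.authors & background_authors
    ]


# Hypothetical records, for illustration only.
toy_papers = [
    Paper("p1", 2018, frozenset({"alice"}), frozenset({"Social Psychology"})),
    Paper("p2", 2017, frozenset({"bob"}), frozenset({"Virology"})),
    Paper("p3", 2020, frozenset({"alice", "carol"}),
          frozenset({"Economic Growth"}), covid_related=True),
    Paper("p4", 2020, frozenset({"bob"}),
          frozenset({"Economic Growth"}), covid_related=True),
]
matches = papers_on_edge(toy_papers, "Social Psychology", "Economic Growth")
matched_ids = [paper.paper_id for paper in matches]
```

In this toy example only `p3` is returned, since it is the only COVID-related ‘Economic Growth’ paper with an author who published in ‘Social Psychology’ before 2020. The resulting subset could then be summarised by its most frequent lower-level topics or keywords.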

Table 3 Roles identified in COVID-19 related research and their mean network attributes, including centrality scores and cluster size (count)

Table 4 Louvain communities in COVID-19-related research and their size, network density, most central topics (according to degree centrality) and the most frequent MAG disciplines that are identified in them ($\ge$10% of topics)

Conclusions

In this work we have demonstrated that our proposed Field of Study (FoS) networks provide a useful means of exploring author multidisciplinarity in a body of research. The two case studies, provided in sections "Multidisciplinary research in network science" and "Author multidisciplinarity in COVID-19 research", have shown the utility of FoS networks for this purpose in mid- ($\approx$6000) and large-sized ($\approx 5,000,000$) research corpora. Modular communities in FoS networks offer an alternative categorisation strategy for research topics and sub-disciplines, when compared to traditional prescribed discipline classification schemes. Such communities represent the broader, multidisciplinary trends in a body of research, together with the different backgrounds and expertise with which authors contribute to them. Furthermore, role analysis, using methods such as struc2vec role embeddings, can be employed to parse the respective roles of topics within and between these communities. In particular, we have highlighted core and background roles, which serve to distinguish the central topics in a field from the background expertise of the authors. In addition, less central topics with high betweenness centrality may highlight multidisciplinary applications in the body of research. In the case of very large corpora, visualising FoS networks can be challenging. As such, in section "Author multidisciplinarity in COVID-19 research" we have outlined methods for drilling down to conduct closer reading of research corpora, in greater detail, using dynamic FoS networks.

There are a number of avenues for potential further research and application in this area. In particular, the heterogeneous graph from which the FoS network is projected could be further extended to include citation links between the papers. As such, the FoS representation could offer one of a number of different projections of the graph which could be studied in tandem, as the benefits of co-authorship and citation analysis are well established (Arnaboldi et al. 2016; Newman 2004, 2018). In general, we propose FoS analysis as a supplementary method for scientometric studies, where it is desirable to explore trends in multidisciplinary interactions at a macro-level.

Availability of data and materials

An archive of the relevant metadata for both case studies will be made available online before any final publication.

References

Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech: Theory Exp 2008(10):10008
Cunningham E, Smyth B, Greene D (2021) Collaboration in the time of covid: a scientometric analysis of multidisciplinary sars-cov-2 research. Humanities Soc Sci Commun 8(1):1–8
Cunningham E, Smyth B, Greene D (2022) Navigating multidisciplinary research using field of study networks. In: Benito RM, Cherifi C, Cherifi H, Moro E, Rocha LM, Sales-Pardo M (eds) Complex networks and their applications X. Springer, Cham, pp 104–115
Porter A, Cohen A, David Roessner J, Perreault M (2007) Measuring researcher interdisciplinarity. Scientometrics 72(1):117–147
Rossi RA, Jin D, Kim S, Ahmed NK, Koutra D, Lee JB (2020) On proximity and structural role-based embeddings in networks: misconceptions, techniques, and applications. ACM Trans Knowl Discovery Data 14(5):1–37
Wu L, Wang D, Evans JA (2019) Large teams develop and small teams disrupt science and technology. Nature 566(7744):378–382
Zhao F, Zhang Y, Lu J, Shai O (2019) Measuring academic influence using heterogeneous author-citation networks. Scientometrics 118(3):1119–1140
Gmür M (2003) Co-citation analysis and the search for invisible colleges: a methodological evaluation. Scientometrics 57(1):27–57
Moretti F (2013) Distant reading. Verso Books, London
Leahey E (2016) From sole investigator to team scientist: trends in the practice and study of research collaboration. Ann Rev Sociol 42(1):81–100
Okamura K (2019) Interdisciplinarity revisited: evidence for research impact and dynamism. Palgrave Commun 5(1):141
Raimbault J (2019) Exploration of an interdisciplinary scientific landscape. Scientometrics 119(2):617–641
Rafols I, Meyer M (2010) Diversity and network coherence as indicators of interdisciplinarity: case studies in bionanoscience. Scientometrics 82(2):263–287
Choi B C, Pak A W (2006) Multidisciplinarity, interdisciplinarity and transdisciplinarity in health research, services, education and policy: 1. Definitions, objectives, and evidence of effectiveness. Clin Invest Med 29(6):351–364
Yau C-K, Porter A, Newman N, Suominen A (2014) Clustering scientific documents with topic modeling. Scientometrics 100(3):767–786
Larivière V, Haustein S, Börner K (2015) Long-distance interdisciplinarity leads to higher scientific impact. PLoS ONE 10(3):0122565
Karunan K, Lathabai HH, Prabhakaran T (2017) Discovering interdisciplinary interactions between two research fields using citation networks. Scientometrics 113(1):335–367
Leahey E, Beckman CM, Stanko TL (2017) Prominent but less productive: the impact of interdisciplinarity on scientists’ research. Adm Sci Q 62(1):105–139
Lafia S, Kuhn W, Caylor K, Hemphill L (2021) Mapping research topics at multiple levels of detail. Patterns 2(3):100210
Arnaboldi V, Dunbar RIM, Passarella A, Conti M (2016) Analysis of co-authorship ego networks. In: International conference and school on network science. Springer, pp 82–96
Feng S, Kirkley A (2020) Mixing patterns in interdisciplinary collaboration networks: assessing interdisciplinarity through multiple lenses. arXiv preprint 2002.00531
Glänzel W, Schubert A (2004) Analysing scientific networks through co-authorship. Handbook of quantitative science and technology research, pp 257–276
Newman ME (2004) Coauthorship networks and patterns of scientific collaboration. In: Proceedings of the national academy of sciences 101(suppl_1), pp 5200–5205
Newman M (2018) Networks. Oxford University Press, Oxford
Nguyen TT, Nguyen QVH, Nguyen DT, Hsu EB, Yang S, Eklund P (2021) Artificial intelligence in the battle against coronavirus (COVID-19): a survey and future research directions. arXiv preprint 2008.07343
Paul M, Girju R (2009) Topic modeling of research fields: an interdisciplinary perspective. In: Proceedings of the international conference RANLP-2009, pp 337–342
Ribeiro LF, Saverese PH, Figueiredo DR (2017) struc2vec: Learning node representations from structural identity. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 385–394
Shen Z, Ma H, Wang K (2018) A web-scale system for scientific knowledge exploration. arXiv preprint 1805.12216
Zhou D, Orshanskiy SA, Zha H, Giles CL (2007) Co-ranking authors and documents in a heterogeneous network. In: Seventh IEEE international conference on data mining (ICDM 2007), pp 739–744. IEEE

Acknowledgements

Not applicable.

Funding

This research was supported by Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289_P2.

Author information

Authors and Affiliations

School of Computer Science, University College Dublin, Dublin, Ireland
Eoghan Cunningham, Barry Smyth & Derek Greene
Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
Eoghan Cunningham, Barry Smyth & Derek Greene

Authors

Eoghan Cunningham
Barry Smyth
Derek Greene

Contributions

EC proposed the methods, implemented the code, and performed the experiments. EC, DG and BS wrote, read, and approved the manuscript.

Corresponding author

Correspondence to Eoghan Cunningham.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Cunningham, E., Smyth, B. & Greene, D. Author multidisciplinarity and disciplinary roles in field of study networks. Appl Netw Sci 7, 78 (2022). https://doi.org/10.1007/s41109-022-00517-4

Received: 04 March 2022
Accepted: 06 November 2022
Published: 18 November 2022
DOI: https://doi.org/10.1007/s41109-022-00517-4
