Orientations and matrix function-based centralities in multiplex network analysis of urban public transport
Applied Network Science volume 6, Article number: 90 (2021)
Abstract
We study urban public transport systems by means of multiplex networks in which stops are represented as nodes and each line is represented by a layer. We determine and visualize public transport network orientations and compare them with street network orientations of the 36 largest German as well as 18 selected major European cities. We find that German urban public transport networks are mainly oriented in a direction close to the cardinal east-west axis, which usually coincides with one of two orthogonal preferential directions of the corresponding street network. While this behavior is present in only a subset of the considered European cities, it remains true that only one of the considered public transport networks has a distinct north-south-like preferential orientation. Furthermore, we study the applicability of the class of matrix function-based centrality measures, which has recently been generalized from single-layer networks to layer-coupled multiplex networks, to our more general urban multiplex framework. Numerical experiments based on highly efficient and scalable methods from numerical linear algebra show promising results, which are in line with previous studies. The centrality measures allow detailed insights into geometrical properties of urban systems such as the spatial distribution of major transport axes, which cannot be inferred from orientation plots. We comment on advantages over existing methodology, elaborate on the comparison of different measures and weight models, and present detailed hyperparameter studies. All results are illustrated by demonstrative graphical representations.
Introduction
By the year 2050, two-thirds of the growing world population are expected to live in urban areas (Bolay 2020). The sustainable planning and development of cities will greatly impact global economic, environmental, and social challenges, which demand interdisciplinary approaches to urban science (Acuto et al. 2018).
In recent decades, the interdisciplinary field of complex network science has provided models and methods, which today impact everyday life (Milgram 1967; Watts and Strogatz 1998; Barabási and Albert 1999; Brin and Page 1998; Page et al. 1999). Network modeling approaches for urban systems can be traced back almost 300 years (Euler 1741) and the abstraction of urban areas into geometrical models continues to form the basis of modern urban science (Porta et al. 2006a, b; Batty 2008; Barthélemy and Flammini 2008; Barthélemy 2011; Courtat et al. 2011; Chan et al. 2011; Barthélemy et al. 2013; Barthélemy 2016; Sharifi 2019). The combination of complex network models, urban science applications, and mathematical methodology from the field of numerical linear algebra is at the heart of this paper.
The main contribution of this paper is the development of a general methodology for two aspects of the spatial analysis of urban public transport networks. Exemplary results are given for major German and European cities. Figure 1 gives a schematic overview of the developed methods, which we describe further in the following paragraphs. This paper is accompanied by publicly available python implementations of all developed methods.^{Footnote 1}
The first aspect of geometrical city modeling studied in this paper is represented by the determination of orientations of complex urban networks, which reflect different urban organization patterns ranging from top-down planning to self-organization dynamics (Batty 2008; Barthélemy and Flammini 2008; Barthélemy et al. 2013). Earlier works focused on the study of orientations of street networks (Courtat et al. 2011; Chan et al. 2011; Gudmundsson and Mohajeri 2013; Mohajeri et al. 2013; Boeing 2019). Our contribution to this aspect of geometrical urban analysis is to provide methodology and example results of orientations of public transport systems. We compare these to street network orientations obtained with existing methodology (Boeing 2017, 2019) and find interesting relations between the orientations of the two modes of transportation for the largest German and several major European cities.
The second aspect studied in this paper is the identification and ranking of the most central nodes of urban public transport networks. Complex network scientists have intensively studied a variety of different centrality measures in recent decades. Among the most prominent examples are degree, betweenness (Freeman 1977), closeness (Freeman 1978), and versions of eigenvector centrality (Bonacich 1987; Brin and Page 1998; Kleinberg 1999; Page et al. 1999). Urban science has been one of many disciplines to pose the natural question of identifying and ranking the most important entities of complex networks (Crucitti et al. 2006a, b; Porta et al. 2006a; Scheurer and Porta 2006; To 2015; Nourian et al. 2016; Agryzkov et al. 2019; Hellervik et al. 2019; Hong et al. 2019; Curado et al. 2021).
Our approach to the computation of centralities of urban public transport networks relies on two main ingredients: matrix function-based centrality measures and multiplex networks. The class of matrix function-based centrality measures (Katz 1953; Estrada and Rodriguez-Velazquez 2005; Estrada and Higham 2010; Estrada 2012; Benzi and Klymko 2013; Benzi and Boito 2020) has attracted a lot of attention in recent years but has not yet been used in the urban science literature. These measures have the interesting property of interpolating between the concepts of local degree and global eigenvector centrality (Benzi and Klymko 2015). Furthermore, these measures require fewer assumptions on the graph structure than classical eigenvector centrality, which makes them applicable to a wider range of problems.
For the network modeling we rely on a special class of multilayer networks. Research on multilayer networks has grown rapidly since 2014, when two survey papers unified the terminology and gleaned the main aspects of a field that had previously been studied across various disciplines (Kivelä et al. 2014; Boccaletti et al. 2014). Multilayer networks provide the means for building increasingly realistic models of complex systems as they allow entities to interact in different ways and on various levels. Urban science is one of the disciplines that has witnessed the rise of multilayer network modeling approaches in recent years (Strano et al. 2015; Barthélemy 2016; Alessandretti et al. 2016; Aleta et al. 2017; Zheng et al. 2018; Curado et al. 2021).
Some recent works generalize centrality measures that are well-studied on single-layer graphs to different multilayer architectures (De Domenico et al. 2015; Taylor et al. 2017, 2019, 2021; Wang et al. 2017; Tudisco et al. 2018; Wu et al. 2019; Bergermann and Stoll 2021). Most importantly for this paper, the class of matrix function-based centrality measures has very recently been generalized to the case of layer-coupled multiplex networks. We adapt that methodology and study its applicability to the more general multiplex models of urban public transport networks. Highly efficient and scalable methods from numerical linear algebra enable the rapid and stable approximation of the centralities for small to large-scale networks. We present various numerical results for multiplex networks with the number of physical nodes in the thousands and the number of layers in the hundreds and extensively study the influence of the different hyperparameters involved in the model.
Data
The focus of this paper lies on the analysis of urban public transport networks. For their realistic modeling we rely on the General Transit Feed Specification (GTFS),^{Footnote 2} which defines a standardized format for public transport timetables used around the globe. For comparison and visualization purposes we also consider street networks. Here, we rely on the Python package OSMnx^{Footnote 3} (Boeing 2017) built on top of the open source platform OpenStreetMap.^{Footnote 4} OSMnx can, amongst other things, be used to generate graph structures representing street networks according to user-specified queries and provides various routines for their analysis. Note that while OpenStreetMap also contains a public transport tagging feature, its timetable information cannot be expected to be as accurate as high-quality GTFS data, cf. https://wiki.openstreetmap.org/wiki/Public_transport.
We consider the 36 largest German cities by population as well as 18 selected major European cities for which we could obtain complete GTFS data. The three German cities Bochum, Gelsenkirchen, and Magdeburg had to be excluded from our studies as no complete GTFS data was available.
For Germany, several high quality GTFS feeds are publicly available^{Footnote 5} on a daily basis and this paper builds on the local public transport data set^{Footnote 6},^{Footnote 7} for April 22nd, 2021 (feed version “light20210422”). This data set covers over 20,000 public transport lines serving over 450,000 stops across Germany. However, only required and conditionally required files (no optional ones) are provided,^{Footnote 8} e.g., no explicit information about route frequencies or transfer options between stops is provided.
We also obtained several individual GTFS data sets for 18 selected major European cities.^{Footnote 9} However, as data integrity and availability vary among the different feeds, no common feed date could be guaranteed for all data sets. Instead, we downloaded the latest complete data set available as of May 21st, 2021 in each case.
For all considered cities, we use polygons representing the administrative boundaries of the cities proper (excluding suburban areas) obtained via appropriate OSMnx queries to filter for all stops within the city limits. In GTFS terminology, all “routes” with at least one associated “trip” connecting at least two of the filtered “stops” are considered a valid line in the public transport network. In addition to the stops’ spatial coordinates, we process the fields “arrival time” and “departure time” of each “trip” from the file “stop_times” to determine travel times between connected pairs of stops.
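The travel time extraction described above can be illustrated with a short Python sketch. It is not taken from the accompanying implementation: the function names `gtfs_time_to_seconds` and `travel_times` are hypothetical, and we assume that the travel time between two consecutive stops of a trip is the arrival time at the later stop minus the departure time at the earlier stop. The field names follow the GTFS file “stop_times”, in which times are given as HH:MM:SS and hours may exceed 23 for trips running past midnight.

```python
import csv
import io
from collections import defaultdict


def gtfs_time_to_seconds(value):
    """Convert a GTFS 'HH:MM:SS' string to seconds; hours may exceed 23."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    return 3600 * hours + 60 * minutes + seconds


def travel_times(stop_times_csv):
    """Map (from_stop, to_stop) to a list of travel times in seconds.

    Assumption: travel time = arrival at the next stop minus departure
    at the current stop, for consecutive stops of the same trip.
    """
    trips = defaultdict(list)
    for row in csv.DictReader(io.StringIO(stop_times_csv)):
        trips[row["trip_id"]].append(row)

    result = defaultdict(list)
    for rows in trips.values():
        # Order the stops of each trip by their position in the trip.
        rows.sort(key=lambda r: int(r["stop_sequence"]))
        for current, following in zip(rows, rows[1:]):
            delta = (gtfs_time_to_seconds(following["arrival_time"])
                     - gtfs_time_to_seconds(current["departure_time"]))
            result[(current["stop_id"], following["stop_id"])].append(delta)
    return dict(result)


# Made-up sample data for a single trip crossing midnight.
SAMPLE = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,23:58:00,23:58:30,A,1
t1,24:01:30,24:02:00,B,2
t1,24:05:00,24:05:00,C,3
"""

times = travel_times(SAMPLE)
print(times)
```

The number of entries collected per pair of stops corresponds to the number of connections between them in the sample, so the same pass over the data also yields line frequencies once trips are grouped by route.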
The polygons used to filter public transport stops from the GTFS data sets are also deployed in all OSMnx routines used in the context of this paper. For our analysis, we require filtering for streets within each city’s limits for two purposes: firstly, to produce street network orientation plots and, secondly, to create and plot street networks for visualization purposes later in this paper.
The following two sections provide details on how the data is transformed into different graph structures. While a sequence of single-layered graphs is sufficient for the determination of public transport network orientations, we introduce a more sophisticated modeling approach based on multiplex networks for the determination of the most central stops and lines of the networks.
Public transport network orientations
Urban areas in large parts of the world have undergone rapid expansion in past decades and are expected to grow further (Bolay 2020). Global economic, environmental, and social challenges will be crucially impacted by future urban planning (Acuto et al. 2018). One of many important aspects of the growth of urban systems is their geometrical expansion (Porta et al. 2006a, b; Batty 2008; Barthélemy and Flammini 2008; Barthélemy 2011; Courtat et al. 2011; Chan et al. 2011; Barthélemy et al. 2013; Barthélemy 2016; Sharifi 2019). Previous works have identified very different mechanisms for the spatial evolution of cities ranging from top-down planning to self-organization dynamics (Batty 2008; Barthélemy and Flammini 2008; Barthélemy et al. 2013). A recent work beautifully illustrates these different mechanisms by means of orientation plots of urban street networks on a global scale (Boeing 2019).
In this section, we consider orientations of both street and public transport networks of the 36 largest German cities by population as well as the 18 selected major European cities specified in the previous section. We first describe the developed methodology and then illustrate and discuss the obtained results.
Methodology
For the determination of street network orientations we use readily implemented routines^{Footnote 10} from the Python package OSMnx (Boeing 2017) in combination with appropriate search queries, which follow the methodology described in Boeing (2019). This methodology relies on the creation of a primal (Porta et al. 2006a) undirected single-layer graph in which intersections are represented by nodes and streets by straight edges between them (ignoring curvature). The compass bearings of both directions (always including the reciprocal of any street bearing) of each street segment are recorded and added in an unweighted manner, cf. Mohajeri et al. (2013) for a discussion on weighting techniques. Afterwards, they are divided into 36 equal-sized bins with the cardinal directions located in the middle of the associated bins and plotted as a rose diagram.
Our first contribution to the geometrical analysis of urban systems is the investigation of orientations in public transport networks. To this end, we use GTFS data describing public transport timetables as specified in the "Data" section and we develop a methodology for the computation of public transport network orientations similar to that of street network orientations.
Our methodology also relies on the creation of single-layered graphs and the computation of bearings between pairs of nodes. In our case, nodes represent stops within the city limits and edges represent connections of public transport lines between stops, which we also assume to be straight and which are also not weighted (by, e.g., segment lengths or travel times). Deviating from the street network case, we consider these edges to be directed. The edges’ directions are determined by the sequence of stops of, in GTFS terminology, each “trip” of each valid “route” of the urban public transport network. We realize this by successively processing all “trips” belonging to all valid “routes” as specified in the "Data" section. This procedure has the effect of assigning an implicit weight to all lines proportional to their operation frequency. An alternative approach would be to only process unique trips of all lines, which would, in the simplest case, lead to two copies of the same sequence of stops in reverse order. Interestingly, we empirically found only small deviations in the results of the two approaches, which we will comment on in the "Results and discussion" section. In both cases, most of the created directed graphs are chain graphs connecting only a small subset of the stops within the city limits. We then store the directed bearings (not including reciprocals) between all pairs of connected nodes of all graphs, divide them into 36 equal-sized bins, and visualize the results in a rose diagram as in the case of street network orientations. Note that the computations of orientations of different public transport lines are entirely independent, which offers great potential for parallelization.
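The binning of directed bearings can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the accompanying implementation: the names `compass_bearing` and `orientation_histogram` are hypothetical, the bearing is computed as the initial great-circle bearing between two stops, and the bins are shifted by half a bin width so that the cardinal directions lie in the middle of their bins. Each trip is passed as its sequence of stops, so that processing all trips weights a line implicitly by its frequency.

```python
import math


def compass_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2 in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0


def orientation_histogram(trips, coords, num_bins=36):
    """Count directed bearings (no reciprocals) in equal-sized bins.

    trips:  list of stop sequences, one entry per processed trip
    coords: dict mapping a stop to its (latitude, longitude)
    """
    width = 360.0 / num_bins
    counts = [0] * num_bins
    for stops in trips:
        for a, b in zip(stops, stops[1:]):
            bearing = compass_bearing(*coords[a], *coords[b])
            # Shift by half a bin so that north (0 degrees) is centered in bin 0.
            index = int(((bearing + width / 2) % 360.0) // width) % num_bins
            counts[index] += 1
    return counts


# Made-up coordinates: B lies east of A, C lies north of B.
coords = {"A": (48.0, 7.8), "B": (48.0, 7.9), "C": (48.1, 7.9)}
trips = [["A", "B", "C"], ["C", "B", "A"], ["A", "B", "C"]]
hist = orientation_histogram(trips, coords)
print(hist)
```

In this toy example the outbound trip is processed twice and the return trip once, so the eastward and northward bins receive twice the count of the westward and southward bins, which is the kind of non-symmetry a rose diagram of directed bearings can reveal.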
Results and discussion
The methodology for the computation of street network orientations described in the beginning of the "Methodology" section leads to the results displayed in Fig. 2 for the 36 German cities under investigation. The results show that most large German cities’ street networks have two orthogonal preferential directions, which is consistent with previous findings (Chan et al. 2011). How strongly pronounced these directions are, however, differs significantly: while, e.g., Halle (Saale) or Krefeld exhibit quite distinct cross-shaped patterns throughout the city area, other cities like, e.g., Bielefeld or Mönchengladbach show almost equally distributed orientations. These differences likely reflect diverse historic city developments and urban planning approaches. Furthermore, the preferential directions seldom coincide with the cardinal directions, but can often be linked to geographical constraints such as rivers or mountains. Some cities like, e.g., Lübeck or Rostock appear to comprise two sets of orthogonal preferential directions prevailing in different regions of the city area.
At first glance, one might expect public transport network orientations to strongly correlate with street network orientations as buses and usually trams, which dominate the public transport system in most German cities, cf. the layer column in Table 1, operate on or alongside streets. However, as public transport stops are usually not located at street intersections but on street segments one could interpret public transport networks as dual graphs (Porta et al. 2006b) of street networks in which nodes represent streets and edges represent intersections and in which only a subset of street segments is equipped with public transport stops. In the case of a line following the same street between two stops the orientations of both networks will coincide, but as soon as a public transport line takes a turn at a street intersection, the public transport bearing will point into a direction in between the two (often orthogonal) street orientations.
Figure 3 shows the public transport network orientations for the 36 considered German cities. In contrast to the street network orientations of the same cities in Fig. 2, the public transport network orientations lack clear orthogonal preferential directions for most cities. Instead, one often observes only one blurred preferential direction consisting of several bins of the rose diagram with no or only a weakly pronounced second orthogonal direction. Interestingly, this one blurred preferential direction often approximately coincides with the preferential direction of the corresponding street network that is closer to the east-west axis. None of the considered German cities has its most pronounced public transport direction closer than 45 degrees to the north-south axis.
Moreover, the blurred tilted east-west-like preferential directions of the public transport networks are often accompanied by a visually rather discontinuous distribution of bearings in the remaining directions. As there is a tendency for more continuous bearing distributions for cities with many public transport lines, we conjecture that part of the recorded orientations can be considered random, e.g., due to the effect induced by taking turns described above, which decouples public transport orientations from street network orientations. Cities with more lines and consequently a larger sample size of different bearings seem to get closer to the actual, relatively smooth bearing distribution. For instance, the bearing distribution of Mainz with only 39 lines appears much more discontinuous than that of Hamburg with 253 lines.
Because we consider directed edges in public transport networks, the corresponding orientation diagrams are non-symmetric. However, the degree of non-symmetry in Fig. 3 is rather small, indicating that most lines serve the same sequence of stops roughly equally in both travel directions.
Comparing the two approaches of considering all “trips” per “route” and considering only unique “trips”, we find that the orientation plots using only unique trips are visually almost identical to those displayed in Fig. 3 for almost all 36 cities. Very few cities like Freiburg and Krefeld, which both comprise few lines, contain somewhat less pronounced tilted east-west axes, implying that the respective bearings belong to highly frequented routes.
For comparison, we also apply the methods described in the "Methodology" section to 18 selected major European cities. Figure 4 shows their street orientation diagrams while Fig. 5 illustrates the corresponding public transport orientations. The key observations made for the German cities hold true for roughly half of the considered European cities: street network orientations contain two orthogonal preferential orientations, which can sometimes be linked to geographical constraints, and main public transport network orientations tend to run along the east-west axis. Deviating from the latter observation, only Nice clearly possesses a more pronounced north-south orientation. Conversely, street orientations of, e.g., Athens, Rome, and Stockholm are quite equally distributed, which could suggest that self-organizing dynamics dominated the spatial expansion of these cities (Batty 2008; Barthélemy and Flammini 2008; Barthélemy et al. 2013). The fact that Athens and Rome both have a history of being important European cultural centers for millennia might support that hypothesis. Interestingly, some cities like Luxembourg City, Oslo, and Stockholm, whose street network orientations are quite equally distributed, still show a clearly pronounced (tilted) east-west axis in the public transport network orientation plots.
Finally, we comment on some European cities with unique properties in at least one of the two orientation plots. The street network orientations of Brussels differ from those of the other considered cities in that the preferential directions are not orthogonal, but rather form an angle of 45 degrees. The public transport network orientations of Belgrade are remarkably non-symmetric, suggesting that a transport concept different from lines going back and forth along the same route with the same frequencies is in place. Lastly, Barcelona is the only city considered in this paper with two equally pronounced orthogonal preferential directions in both orientation plots. The latter observation is the result of a remarkable city planning effort by Cerdà in the year 1860 to connect the medieval core of Barcelona with surrounding villages by an extensive road network in the form of an orthogonal grid (Pallares-Barbera et al. 2011), which is certainly only one of many stories behind the graphics in Figs. 2, 3, 4 and 5.
All results presented in this section are reproducible with publicly available python codes.^{Footnote 11} The code is designed according to the General Transit Feed Specification (GTFS) and should work on any valid GTFS data set making it easily utilizable for data of all regions around the world.
Multiplex network model
We now move to the second aspect of geometrical city modeling studied in this paper: the application of matrix function-based centrality measures to a multiplex network representation of the urban public transport systems. To this end, we introduce the multiplex framework employed to formalize these systems. We discuss the relation between network centralities and orientations in the "Discussion" section.
Network models have a long history in urban science: going back almost 300 years to Euler’s Königsberg bridge problem (Euler 1741), it could be argued that graph theory was motivated by an urban science problem. Especially the abstraction of street networks to (almost) planar single-layer graphs has been the basis for many mathematical models of cities ever since (Barthélemy 2011, 2016). More recently, following the advent of multilayer graphs in various scientific disciplines (Kivelä et al. 2014; Boccaletti et al. 2014), urban science profited from the flexibility provided by multilayered network structures to construct more realistic models of complex interactions (Aleta et al. 2017; Curado et al. 2021; Alessandretti et al. 2016; Zheng et al. 2018; Strano et al. 2015). Some of these works consider multilayered public transport networks in which layers correspond to either lines or modes of transportation.
In this section, we introduce a primal (Porta et al. 2006a) multiplex representation of public transport networks in which layers represent lines. We assume the layers to be node-aligned so that each layer contains an instance of each physical node representing a stop within the city limits. We denote the number of physical nodes by n and the number of layers by L. Each copy of a physical node in any of the layers is called a node-layer pair.
We distinguish between two types of edges that can connect pairs of node-layer pairs: intra- and interlayer edges. Intralayer edges connect node-layer pairs belonging to the same layer whenever the corresponding line directly connects the two stops, i.e., if a line approaches stops A-B-C, then A and B as well as B and C are connected by an edge, but A and C are not. Interlayer edges, on the other hand, connect instances of the same physical node belonging to different layers whenever both lines serve the corresponding stop, i.e., if the stop can be used to change between the lines. We discuss modeling approaches to assign weights to both intra- and interlayer connections later in this section.
In our multiplex networks we restrict ourselves to undirected edges although directed models taking travel directions into account would appear suitable for the application, cf. our approach to network orientations in the "Public transport network orientations" section. Our choice accounts for the structure of the GTFS data, which contains multiple different stop IDs, i.e., physical nodes, per stop. This structure entails that, e.g., a bus stop with one stop on each side of a street but the same name is not represented by one but by two physical nodes. Employing a strategy to aggregate stops with the same name is error-prone as it provokes unexpected behavior when different data providers use inconsistent stop naming logics, e.g., in- and excluding track numbers in stop names. To prevent the in- and outbound traffic of a stop from being unequally distributed across two different physical nodes, we use undirected edges, which equally represent in- and outbound traffic at the involved node-layer pair regardless of the current travel direction. In the example scenario described above this leads to both stop IDs carrying the same in- and outbound information. This way of “counting each connection twice” appears preferable to each node-layer pair of the network carrying only half of the information available for the respective stop.
More formally, the multiplex networks described so far can be defined as the multilayer graph \({\mathcal {G}}=({\tilde{{\mathcal {V}}}}, {\mathcal {E}}^{(1)}, \dots , {\mathcal {E}}^{(L)}, {\tilde{{\mathcal {E}}}})\) consisting of a common vertex set \({\tilde{{\mathcal {V}}}}\) for all layers, intralayer edge sets \({\mathcal {E}}^{(l)}\) for all layers \(l=1, 2, \dots , L\), and an interlayer edge set \({\tilde{{\mathcal {E}}}}\). Note that similar networks have been considered before, e.g., in Taylor et al. (2017, 2019, 2021), Bergermann and Stoll (2021) but in this paper we employ a different notion of interlayer edges, which is determined by the data.
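We choose a supra-adjacency matrix representation as the linear algebraic formulation of these multiplex networks (Kivelä et al. 2014; Taylor et al. 2017, 2019, 2021; Bergermann and Stoll 2021). The two different types of edges are represented by two separate matrices. The multilayer intralayer adjacency matrix \(\varvec{A}_{\mathrm {intra}}\in {\mathbb {R}}^{nL \times nL}\) contains the edges representing connections by public transport lines and is defined as the block-diagonal matrix
\[\varvec{A}_{\mathrm {intra}} = \begin{bmatrix} \varvec{A}^{(1)} & \varvec{0} & \cdots & \varvec{0}\\ \varvec{0} & \varvec{A}^{(2)} & \ddots & \vdots \\ \vdots & \ddots & \ddots & \varvec{0}\\ \varvec{0} & \cdots & \varvec{0} & \varvec{A}^{(L)} \end{bmatrix} \tag{1}\]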
with \(\varvec{0} \in {\mathbb {R}}^{n \times n}\) the matrix of all zeros. Each diagonal block \(\varvec{A}^{(l)}\in {\mathbb {R}}^{n \times n}\) corresponds to the adjacency matrix of layer l. It contains the weight of the edge between the physical nodes i and j in layer l in the entry \([\varvec{A}^{(l)}]_{ij}\) and zero if no edge is present for \(i,j\in \{1, 2, \dots , n\}\) and \(l\in \{1, 2, \dots , L\}\).
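The interlayer adjacency matrix \(\varvec{A}_{\mathrm {inter}}\in {\mathbb {R}}^{nL \times nL}\) contains the edges representing possible changes between public transport lines and is defined as
\[\varvec{A}_{\mathrm {inter}} = \begin{bmatrix} \varvec{0} & \varvec{O}^{(12)} & \cdots & \varvec{O}^{(1L)}\\ \varvec{O}^{(21)} & \varvec{0} & \ddots & \vdots \\ \vdots & \ddots & \ddots & \varvec{O}^{(L-1,L)}\\ \varvec{O}^{(L1)} & \cdots & \varvec{O}^{(L,L-1)} & \varvec{0} \end{bmatrix} \tag{2}\]
where
\[[\varvec{O}^{(lk)}]_{ij} = \begin{cases} 1, & \text{if } i=j \text{ and lines } l \text{ and } k \text{ both serve stop } i,\\ 0, & \text{otherwise,} \end{cases}\]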
with \(l, k \in \{1, 2, \dots , L\}\) such that \(\varvec{O}^{(lk)}\in {\mathbb {R}}^{n \times n}\) contains ones only on a subset of its diagonal entries. Thus, the presence of interlayer edges depends on whether the lines represented by layers l and k both stop at a given physical node. We set the diagonal blocks of \(\varvec{A}_{\mathrm {inter}}\) to zero matrices \(\varvec{0}\in {\mathbb {R}}^{n \times n}\) as these entries would not represent a change of lines.
On the one hand, the above definition of \(\varvec{A}_{\mathrm {inter}}\) is more general than those in Taylor et al. (2017, 2019, 2021), Bergermann and Stoll (2021), where each \(\varvec{O}^{(lk)}\) is a full identity matrix, as it includes all combinations of up to n zero diagonal entries. On the other hand, the construction in Taylor et al. (2017, 2019, 2021), Bergermann and Stoll (2021) allows each layer-layer pair to be coupled with a different weight, while \(\varvec{A}_{\mathrm {inter}}\) defined in Eq. (2) is unweighted. We will introduce the parameter \(\omega\) in Eq. (3) to scale the interlayer adjacency matrix, which can be interpreted as modeling a constant transfer time between all lines at all stops of the network. Note that all numerical methods would be equally applicable if each interlayer edge were weighted individually.
The definitions in Eqs. (1) and (2) now allow us to define the supra-adjacency matrix \(\varvec{A}\in {\mathbb {R}}^{nL \times nL}\) of the multiplex network as
\[\varvec{A} = \varvec{A}_{\mathrm {intra}} + \omega \varvec{A}_{\mathrm {inter}}, \tag{3}\]
where \(\omega \in {\mathbb {R}}_{\ge 0}\) is a scalar coupling parameter that trades off the relative importance of intra- and interlayer weights. Note that by our choice of undirected edges discussed earlier in this section we have \(\varvec{A}=\varvec{A}^T\) throughout this paper.
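The assembly of the supra-adjacency matrix from per-layer edge lists can be sketched as follows. The sketch is an illustration under stated assumptions, not the accompanying implementation: the function name `supra_adjacency` is hypothetical, the matrix is stored as a plain dictionary of nonzero entries instead of a sparse matrix type, indices are zero-based with node-layer pair (i, l) mapped to row l·n + i, and a line is taken to serve a stop if it has at least one incident intralayer edge there.

```python
def supra_adjacency(n, layer_edges, omega):
    """Assemble the nonzero entries of the supra-adjacency matrix.

    n:           number of physical nodes
    layer_edges: one dict {(i, j): weight} of undirected edges per layer
    omega:       coupling parameter scaling all interlayer edges
    Returns a dict {(row, col): value} with row, col in range(n * L).
    """
    num_layers = len(layer_edges)
    entries = {}
    served = []  # stops served by each layer (assumption: has an incident edge)

    # Block-diagonal part: intralayer edges, stored symmetrically.
    for l, edges in enumerate(layer_edges):
        stops = set()
        for (i, j), weight in edges.items():
            entries[(l * n + i, l * n + j)] = weight
            entries[(l * n + j, l * n + i)] = weight
            stops.update((i, j))
        served.append(stops)

    # Off-diagonal blocks: couple copies of a stop served by both layers.
    for l in range(num_layers):
        for k in range(num_layers):
            if l == k:
                continue  # diagonal blocks carry no interlayer edges
            for i in served[l] & served[k]:
                entries[(l * n + i, k * n + i)] = omega
    return entries


# Toy network: 4 stops, line 0 runs 0-1-2, line 1 runs 1-3; stop 1 is shared.
layers = [{(0, 1): 1.0, (1, 2): 1.0}, {(1, 3): 1.0}]
A = supra_adjacency(4, layers, omega=0.5)
print(sorted(A.items()))
```

Only 8 of the 64 possible entries are stored in this toy example, which mirrors the high degree of sparsity that chain-like layers and few shared stops produce in real networks.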
We visualize the above definitions by a small example multiplex network in Fig. 6. The left plot shows a multiplex network that consists of three layers (two tram lines and one bus line) taken from the full multiplex network of the German city Freiburg. Only the subset of physical nodes served by these lines is included in the figure. The right plot shows the sparsity structure of the corresponding supra-adjacency matrix. Both illustrations distinguish the two types of edges by different colors.
We remark that the supra-adjacency matrix \(\varvec{A}\) of our public transport networks is characterized by a high degree of sparsity. This is true by the definitions in Eqs. (1) and (2), but our application is characterized by additional sparsity due to the facts that each line typically only serves a small subset of stops in the city and that most layers are approximately represented by chain graphs in which the number of edges is of the order of the number of non-isolated node-layer pairs. Table 1 illustrates the most important multiplex network properties of the 36 German cities considered in the "Public transport network orientations" section. It confirms the statements above: around 95–98\(\%\) of the node-layer pairs in the networks are isolated, i.e., possess no incident edge, and the number of intralayer edges is close to the number of non-isolated node-layer pairs for all networks.
Finally, we comment on our approach to assigning weights to the edges of the multiplex networks. Many weighted urban network models rely on choosing either the geographical distance (Crucitti et al. 2006a, b; Porta et al. 2006a) or travel times (Aleta et al. 2017; Bast et al. 2016) as weights, which is a sensible and straightforward choice for many problems like, e.g., routing algorithms. We, however, aim at applying matrix function-based centrality measures to the multiplex networks in which a node-layer pair is considered central if it is connected to many other node-layer pairs by short walks along edges with large weights. To decide what it means to be “close” to many node-layer pairs in a public transport network, we rely on the combination of two concepts: the frequency of the public transport line (Aleta et al. 2017) and a Gaussian kernel applied to travel times. Gaussian kernels have become a well-established choice of similarity measures in many data-driven applications (Schölkopf and Smola 2002; Von Luxburg 2007; Stoll 2020; Bergermann et al. 2021).
We define the frequency \(\phi ^{(l)}_{ij}\in {\mathbb {N}}\) as the number of connections that line l offers between stops i and j per day. Furthermore, we denote the travel time between node-layer pairs (i, l) and (j, l) with \(i,j\in \{1, 2, \dots , n\}\) and \(l\in \{1, 2, \dots , L\}\) by \((\Delta t)_{ij}^{(l)}\in {\mathbb {R}}_{\ge 0}\) and use a Gaussian kernel to define their similarity as \(\exp \left( -\frac{\left( (\Delta t)_{ij}^{(l)} \right) ^2}{\sigma ^2}\right) \in (0,1]\). The scalar parameter \(\sigma \in {\mathbb {R}}_{>0}\) determines the distribution of these similarities, e.g., \(\sigma \ll (\Delta t)_{ij}^{(l)}\) leads to most weights being close to zero and \(\sigma \gg (\Delta t)_{ij}^{(l)}\) leads to most weights being close to one. As this parameter scales the travel times of public transport lines in the urban network, it can be interpreted as a “normalizing travel time”. Numerical experiments on the role of the parameter \(\sigma\) are presented in the "Influence of the normalizing travel time" section. As described at the beginning of this section, edge weights are set to zero if the corresponding layer offers no direct connection. Note that information about both line frequencies and travel times can be extracted from GTFS data.
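We allow different combinations of both weighting concepts by defining the intra-layer weights as
$$[\varvec{A}^{(l)}]_{ij} = \begin{cases} w^{(l)}_{ij} & \text{if line } l \text{ directly connects stops } i \text{ and } j, \\ 0 & \text{otherwise,} \end{cases}$$
with \(w^{(l)}_{ij}\) equal to one of the following expressions:
$$w^{(l)}_{ij} \in \left\{ 1, \quad \phi ^{(l)}_{ij}, \quad \exp \left( -\frac{\left( (\Delta t)_{ij}^{(l)} \right) ^2}{\sigma ^2}\right) , \quad \phi ^{(l)}_{ij} \exp \left( -\frac{\left( (\Delta t)_{ij}^{(l)} \right) ^2}{\sigma ^2}\right) \right\} . \tag{4}$$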
With the fourth choice of weights we combine travel times and frequencies by multiplication of the two individual weights. The weight then depends linearly on the frequency and decreases nonlinearly with the travel time. With this weighting approach, stops connected to many other stops via highly frequented routes with short travel times will be recognized as most central. We compare the different weighting approaches in the "Comparison of different weight models" section.
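The four weight models described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `intra_layer_weight`, the model labels, and the numerical values are assumptions made for this example.

```python
import numpy as np

def intra_layer_weight(freq, travel_time, sigma, model):
    """Weight of intra-layer edges for one of four weight models.

    The model labels are hypothetical names chosen for this sketch.
    """
    freq = np.asarray(freq, dtype=float)
    travel_time = np.asarray(travel_time, dtype=float)
    # Gaussian kernel applied to travel times, values in (0, 1].
    kernel = np.exp(-(travel_time ** 2) / sigma ** 2)
    if model == "unweighted":
        return np.ones_like(kernel)
    if model == "frequency":
        return freq
    if model == "time":
        return kernel
    if model == "frequency_time":
        return freq * kernel
    raise ValueError(f"unknown weight model: {model}")

# Made-up data: three edges of one line, travel times in minutes.
freq = np.array([120, 120, 40])
travel_time = np.array([1.0, 2.0, 5.0])
for model in ("unweighted", "frequency", "time", "frequency_time"):
    print(model, intra_layer_weight(freq, travel_time, sigma=2.0, model=model))
```

In the combined model, an edge with high frequency and short travel time receives the largest weight, while a long travel time relative to \(\sigma\) pushes the weight towards zero regardless of frequency.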
As described in the "Data" section, files containing information on transfer options and transfer times between different public transport lines are not required but only optional in GTFS data. Hence, we do not possess knowledge of realistic transfer times, which are represented by inter-layer edge weights in our multiplex network formulation. We thus propose to utilize the unweighted inter-layer adjacency matrix defined in Eq. (2) and to include the cost of changing lines into the coupling parameter \(\omega\) from Eq. (3) by defining a transfer time \(\Delta t_{\mathrm {transfer}}\in {\mathbb {R}}_{\ge 0}\), which we assume to be constant across all pairs of lines and all stops. For consistency, we also apply the same Gaussian kernel to \(\Delta t_{\mathrm {transfer}}\) whenever we include travel times in the weights of intra-layer edges. Thus, the inter-layer weights are given by
$$\omega = \exp \left( -\frac{\left( \Delta t_{\mathrm {transfer}} \right) ^2}{\sigma ^2}\right) . \tag{5}$$
Future public transport models should include realistic individual transfer times for all stops and all combinations of lines serving these stops. From an implementation point of view, these transfer times would be easy to include in our multiplex model without causing any issues for our algorithms. We envision that practitioners' more detailed knowledge of transfer times (including distances and walking times between different lines at the same stop) can lead to more advanced city-specific multiplex public transport models in the future.
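To make the role of the coupling parameter concrete, the following sketch assembles a toy supra-adjacency matrix. It rests on several assumptions that are not taken from the text: the supra-adjacency matrix is formed as intra-layer blocks plus \(\omega\) times the inter-layer adjacency matrix, node-layer pairs are indexed layer by layer, and inter-layer edges connect only copies of a stop in lines that actually serve it. The helper name `coupling_parameter` and the toy network are hypothetical.

```python
import numpy as np
import scipy.sparse as sp

def coupling_parameter(transfer_time, sigma):
    """Gaussian kernel applied to a constant transfer time."""
    return float(np.exp(-(transfer_time ** 2) / sigma ** 2))

n, L = 4, 2  # toy network: 4 stops, 2 lines

# Line 0 serves stops 0-1-2, line 1 serves stops 1-3 (unweighted here).
A0 = sp.lil_matrix((n, n))
A0[0, 1] = A0[1, 0] = 1.0
A0[1, 2] = A0[2, 1] = 1.0
A1 = sp.lil_matrix((n, n))
A1[1, 3] = A1[3, 1] = 1.0
A_intra = sp.block_diag([A0, A1], format="csr")

# Node-layer pair (i, l) gets index l * n + i (assumed ordering). Stop 1 is
# the only stop served by both lines, so it carries the only inter-layer edge.
A_inter = sp.lil_matrix((n * L, n * L))
A_inter[0 * n + 1, 1 * n + 1] = A_inter[1 * n + 1, 0 * n + 1] = 1.0

omega = coupling_parameter(transfer_time=5.0, sigma=5.0)
A_supra = (A_intra + omega * A_inter.tocsr()).tocsr()

# Node-layer pairs without any incident edge.
isolated = np.flatnonzero(np.asarray(A_supra.sum(axis=1)).ravel() == 0)
print(omega, isolated)
```

Even in this tiny example, three of the eight node-layer pairs are isolated because each line serves only some of the stops.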
Matrix function-based centralities
Methods to identify and rank the most important nodes of complex networks have a long history in complex network science (Katz 1953; Freeman 1977, 1978; Bonacich 1987; Brin and Page 1998; Kleinberg 1999; Page et al. 1999). Among many others, urban networks have been a key application of a variety of centrality measures including degree, closeness, betweenness, and different variants of eigenvector centralities (Crucitti et al. 2006a, b; Porta et al. 2006a; Scheurer and Porta 2006; To 2015; Nourian et al. 2016; Wang and Fu 2017; Agryzkov et al. 2019; Hellervik et al. 2019; Curado et al. 2021).
In this section, we consider matrix function-based centrality measures (Benzi and Boito 2020; Estrada and Higham 2010; Estrada and Rodriguez-Velazquez 2005; Benzi and Klymko 2013; Katz 1953; Bergermann and Stoll 2021), which interpolate between local degree centrality and global eigenvector centrality (Benzi and Klymko 2015). An advantage of this class of centrality measures over eigenvector centrality is its more general applicability.^{Footnote 12} Recently, the class of matrix function-based centrality measures has been generalized to layer-coupled multiplex networks (Bergermann and Stoll 2021). We illustrate that the same methods are also applicable to the more general multiplex network framework introduced in the "Multiplex network model" section. In the remainder of this section, we briefly motivate and introduce matrix function-based centrality measures and present efficient numerical methods for their computation as well as numerical results including a discussion of the impact of certain hyperparameters.
Definition
Matrix function-based centrality measures are based on the application of the matrix exponential or the matrix resolvent function to the adjacency matrix of a network. We briefly motivate the definitions by considering walks on single-layer networks and refer the reader to Bergermann and Stoll (2021, Sec. 3) for more details. At the end of this subsection, we comment on the generalization of the definitions to the case of the multiplex networks introduced in the "Multiplex network model" section.
In the case of an undirected and unweighted single-layer graph with n nodes, the entry \([\varvec{A}]_{ij}\) of the adjacency matrix \(\varvec{A}\in {\mathbb {R}}^{n \times n}\) is 1 if an edge is present between nodes i and j and 0 otherwise. Note that in the formalism introduced in the "Multiplex network model" section this corresponds to \(L=1\) and \(\varvec{A}=\varvec{A}_{\mathrm {intra}}=\varvec{A}^{(1)}\). It is well-known from graph theory that an entry \([\varvec{A}^p]_{ij}\) of the pth power of such an adjacency matrix contains the number of walks of length p that exist between nodes i and j (Estrada 2012). While (local) degree centrality can be formulated in terms of the (first power of the) adjacency matrix and (global) eigenvector centrality in terms of the limit of adjacency matrix powers, matrix function-based centralities consider walks of all lengths. This approach can be represented by considering the adjacency matrix power series \(\sum _{p=0}^{\infty } \varvec{A}^p\). Furthermore, one typically introduces a damping factor in this power series to control the magnitude of the entries of the matrix powers, which typically grows rapidly with p. The following two choices of damping factors lead to the power series of the matrix exponential (left) and the matrix resolvent function (right)
$$\exp (\beta \varvec{A}) = \sum _{p=0}^{\infty } \frac{\beta ^p}{p!} \varvec{A}^p, \qquad (\varvec{I} - \alpha \varvec{A})^{-1} = \sum _{p=0}^{\infty } \alpha ^p \varvec{A}^p, \tag{6}$$
where \(\varvec{I}\in {\mathbb {R}}^{n \times n}\) denotes the identity matrix and \(\beta \in {\mathbb {R}}_{>0}\) and \(0<\alpha <1/\lambda _{\mathrm {max}}\)^{Footnote 13} denote scalar hyperparameters. These parameters control the trade-off between locality and globality in the considered walks on the network by assigning more or less weight to longer walks.
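The walk-counting interpretation and the two damped power series can be checked numerically on a tiny graph. The sketch below uses a three-node path graph and illustrative parameter values (both are assumptions for this example) and compares truncated series with the matrix exponential and the resolvent computed directly.

```python
import numpy as np
from scipy.linalg import expm

# Path graph 0 - 1 - 2 (undirected, unweighted).
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

# [A^p]_ij counts the walks of length p between nodes i and j.
A2 = np.linalg.matrix_power(A, 2)

beta = 0.5
lam_max = np.linalg.eigvalsh(A)[-1]
alpha = 0.5 / lam_max  # inside the admissible interval (0, 1/lam_max)

# Truncated power series with damping factors beta^p / p! and alpha^p.
exp_series = np.zeros_like(A)
res_series = np.zeros_like(A)
term_exp = np.eye(3)
term_res = np.eye(3)
for p in range(60):
    exp_series += term_exp
    res_series += term_res
    term_exp = term_exp @ (beta * A) / (p + 1)
    term_res = term_res @ (alpha * A)

exp_exact = expm(beta * A)
res_exact = np.linalg.inv(np.eye(3) - alpha * A)
print(np.abs(exp_series - exp_exact).max(), np.abs(res_series - res_exact).max())
```

The factorial damping makes the exponential series converge for any \(\beta\), whereas the geometric damping of the resolvent series converges only for \(\alpha\) below \(1/\lambda _{\mathrm {max}}\), which is why the admissible range of \(\alpha\) is bounded.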
The centrality of any given node i in the network is then revealed by the following two expressions of these matrix functions: \(\varvec{e}_i^T f(\varvec{A}) \varvec{e}_i\) denotes the ith diagonal element of \(f(\varvec{A})\), which contains the weighted sum of walks of all lengths that start and end at node i; \(\varvec{e}_i^T f(\varvec{A})\varvec{1}\) denotes the sum of the ith row of \(f(\varvec{A})\), which contains the weighted sum of all walks starting at node i (regardless of where they end). Here, \(\varvec{e}_i=[0, \dots , 0, 1, 0, \dots , 0]^T\in {\mathbb {R}}^n\) denotes the ith unit vector and \(\varvec{1}=[1, 1, \dots , 1]^T\in {\mathbb {R}}^n\) the vector of all ones. This leads to the definitions of subgraph centrality (Estrada and Rodriguez-Velazquez 2005) (left) and resolvent-based subgraph centrality (Benzi and Boito 2020) (right)
$$SC(i) = \varvec{e}_i^T \exp (\beta \varvec{A}) \varvec{e}_i, \qquad SC_{\mathrm {res}}(i) = \varvec{e}_i^T (\varvec{I} - \alpha \varvec{A})^{-1} \varvec{e}_i, \tag{7}$$
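as well as total communicability (Benzi and Klymko 2013) (left) and Katz centrality (Katz 1953) (right)
$$TC(i) = \varvec{e}_i^T \exp (\beta \varvec{A}) \varvec{1}, \qquad KC(i) = \varvec{e}_i^T (\varvec{I} - \alpha \varvec{A})^{-1} \varvec{1}. \tag{8}$$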
The same definitions hold true for weighted graphs and although extensions to directed networks exist (Bergermann and Stoll 2021), we restrict ourselves to undirected networks in this paper, which leads to symmetric adjacency matrices \(\varvec{A}^T=\varvec{A}\).
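For small networks, the four measures can be evaluated directly from their definitions with dense linear algebra. The following sketch does so for a star graph; the function name `matrix_function_centralities` and the parameter values are assumptions for illustration, and this explicit evaluation does not scale to the network sizes considered in this paper.

```python
import numpy as np
from scipy.linalg import expm

def matrix_function_centralities(A, alpha, beta):
    """Explicitly evaluate SC, SC_res, TC, and KC for a small symmetric A."""
    n = A.shape[0]
    ones = np.ones(n)
    E = expm(beta * A)                        # matrix exponential
    R = np.linalg.inv(np.eye(n) - alpha * A)  # matrix resolvent
    return {
        "SC": np.diag(E).copy(),      # weighted closed walks at each node
        "SC_res": np.diag(R).copy(),
        "TC": E @ ones,               # weighted walks starting at each node
        "KC": R @ ones,
    }

# Star graph: node 0 is connected to nodes 1, 2, and 3.
A = np.zeros((4, 4))
A[0, 1:] = A[1:, 0] = 1.0
lam_max = np.linalg.eigvalsh(A)[-1]
cent = matrix_function_centralities(A, alpha=0.5 / lam_max, beta=0.5)
for name, values in cent.items():
    print(name, values)
```

All four measures rank the hub first and assign identical values to the three leaves, as the symmetry of the graph requires.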
It has recently been proposed to generalize the above definitions to the case of layer-coupled multiplex networks by replacing the graph adjacency matrix by the supra-adjacency matrix of the corresponding multiplex networks (Bergermann and Stoll 2021, Sec. 4). We will demonstrate that all methods from Bergermann and Stoll (2021) for the symmetric case \(\varvec{A}=\varvec{A}^T\) still apply for the more general multiplex networks introduced in the "Multiplex network model" section.
Note that by the construction of the supra-adjacency matrix defined in Eq. (3) the above quantities yield centrality values for all node-layer pairs, which allows us to identify and rank the most central node-layer pairs of the network. Following Taylor et al. (2017), we call these values joint centralities and denote the joint centrality of physical node i in layer l by JC(i, l). However, we remark that the numerical methods introduced in the following subsection are unable to compute subgraph and resolvent-based subgraph centralities of isolated node-layer pairs, i.e., node-layer pairs without any incident edge. We set the centrality values of these node-layer pairs to 1, which is consistent with Eq. (6), in which the respective rows and columns of the supra-adjacency matrix power series are given by unit vectors.
Finally, we adapt the concept of marginal node (MNC) and marginal layer centralities (MLC) (Taylor et al. 2017). These quantities are defined as
$$MNC(i) = \sum _{l=1}^{L} JC(i,l), \qquad MLC(l) = \sum _{i=1}^{n} JC(i,l), \tag{9}$$
and can be used to assess the importance of all physical nodes and layers of the multiplex network, which will play a key role in our interpretation of the results.
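Computing marginal centralities from a vector of joint centralities amounts to reshaping and summing. The sketch below assumes that the joint centralities are stored layer by layer, i.e., node-layer pair (i, l) sits at position l * n + i; this ordering, the function name, and the numbers are assumptions for illustration.

```python
import numpy as np

def marginal_centralities(joint, n, L):
    """Sum joint centralities over layers (MNC) and over nodes (MLC)."""
    JC = np.asarray(joint, dtype=float).reshape(L, n)  # row l, column i
    mnc = JC.sum(axis=0)  # one value per physical node (stop)
    mlc = JC.sum(axis=1)  # one value per layer (line)
    return mnc, mlc

# Made-up joint centralities for n = 3 stops and L = 2 lines.
joint = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mnc, mlc = marginal_centralities(joint, n=3, L=2)
print(mnc, mlc)
```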
Numerical methods
We now discuss strategies for the numerical evaluation of the matrix function-based centrality measures defined in Eqs. (7) and (8). For small network sizes, many software packages provide accurate algorithms for the explicit evaluation of the matrix exponential and the solution of a regular linear system (an efficient numerical implementation of, e.g., \((\varvec{I} - \alpha \varvec{A})^{-1} \varvec{1}\) would solve the linear system \((\varvec{I} - \alpha \varvec{A}) \varvec{x} = \varvec{1}\) for the vector \(\varvec{x}\)). These methods, however, quickly become infeasible for medium to large network sizes, and we briefly present efficient and scalable approximations based on Krylov subspace methods. More details can be found in Benzi and Boito (2020) for single-layer networks and in (Bergermann and Stoll 2021, Sec. 5) for multiplex networks.
As we only consider undirected graphs, i.e., symmetric supra-adjacency matrices, in this paper, we rely on the symmetric methods introduced in (Bergermann and Stoll 2021, Sec. 5), which are based on the symmetric Lanczos method (Lanczos 1950; Golub and Van Loan 2013). In its kth iteration, this method constructs the kth column of an orthogonal basis \(\varvec{Q}_k\in {\mathbb {R}}^{nL \times k}\) of the Krylov subspace
$$\mathcal {K}_k(\varvec{A}, \varvec{v}) = \mathrm {span}\{\varvec{v}, \varvec{A}\varvec{v}, \varvec{A}^2\varvec{v}, \dots , \varvec{A}^{k-1}\varvec{v}\} \tag{10}$$
corresponding to a matrix \(\varvec{A}=\varvec{A}^T\in {\mathbb {R}}^{nL \times nL}\) and a vector \(\varvec{v}\in {\mathbb {R}}^{nL}\). This allows the decomposition of \(\varvec{A}\) into the form \(\varvec{A} \approx \varvec{Q}_k \varvec{T}_k \varvec{Q}_k^T\), where \(\varvec{T}_k=\varvec{T}_k^T\in {\mathbb {R}}^{k \times k}\) has tridiagonal form. This approximation typically achieves a high accuracy for \(k \ll nL\), which makes the eigendecomposition \(\varvec{T}_k = \varvec{S}_k \varvec{\Theta }_k \varvec{S}_k^T\) easy to compute with standard methods. These two matrix factorizations can then be combined to compute total communicability and Katz centrality by evaluating, for the starting vector \(\varvec{v}=\varvec{1}\) and with \(\varvec{e}_1\in {\mathbb {R}}^k\) denoting the first unit vector, the quantity
$$f(\varvec{A}) \varvec{1} \approx \Vert \varvec{1} \Vert _2 \, \varvec{Q}_k \varvec{S}_k f(\varvec{\Theta }_k) \varvec{S}_k^T \varvec{e}_1, \tag{11}$$
where f is applied elementwise to the eigenvalues of \(\varvec{T}_k\) (Higham 2008). The ith entry of the resulting vector then corresponds to the centrality value of the ith node-layer pair.
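The Lanczos-based approximation of \(f(\varvec{A})\varvec{1}\) described above can be sketched as follows. This is a textbook variant with full reorthogonalization, written for illustration and possibly differing in details from the authors' implementation; the function names `lanczos` and `funm_times_vector`, the test graph, and the parameter values are assumptions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import expm

def lanczos(A, v, k):
    """At most k steps of symmetric Lanczos; returns basis Q and tridiagonal T."""
    N = v.shape[0]
    Q = np.zeros((N, k))
    alphas, betas = [], []
    q, q_prev, b = v / np.linalg.norm(v), np.zeros(N), 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - b * q_prev
        a = q @ w
        w = w - a * q
        # Full reorthogonalization (cheap for small k, improves stability).
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        alphas.append(a)
        b = np.linalg.norm(w)
        if j == k - 1 or b < 1e-12:  # done, or invariant subspace found
            break
        betas.append(b)
        q_prev, q = q, w / b
    m = len(alphas)
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return Q[:, :m], T

def funm_times_vector(A, b, f, k=10):
    """Approximate f(A) b by ||b|| Q S f(Theta) S^T e_1."""
    Q, T = lanczos(A, b, k)
    theta, S = np.linalg.eigh(T)
    return np.linalg.norm(b) * (Q @ (S @ (f(theta) * S[0, :])))

# Sparse test graph: a path on 30 nodes plus three chords.
N = 30
rows = list(range(N - 1)) + [0, 5, 3]
cols = list(range(1, N)) + [10, 20, 25]
upper = sp.coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(N, N))
A = (upper + upper.T).tocsr()

ones = np.ones(N)
beta = 0.5
lam_max = np.linalg.eigvalsh(A.toarray())[-1]
alpha = 0.5 / lam_max

tc_approx = funm_times_vector(A, ones, lambda t: np.exp(beta * t))
kc_approx = funm_times_vector(A, ones, lambda t: 1.0 / (1.0 - alpha * t))

# Dense reference values, feasible only for small networks.
tc_exact = expm(beta * A.toarray()) @ ones
kc_exact = np.linalg.solve(np.eye(N) - alpha * A.toarray(), ones)
print(np.abs(tc_approx - tc_exact).max(), np.abs(kc_approx - kc_exact).max())
```

Only matrix-vector products with the sparse matrix are required, so the cost per iteration is proportional to the number of nonzero entries.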
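For subgraph and resolvent-based subgraph centrality we rely on an elegant relation between the symmetric Lanczos method, orthogonal polynomials, and Gauss quadrature discussed by Golub and Meurant (1994, 1997, 2009), Golub and Welsch (1969), which can be used to compute lower and upper bounds on the sought quantities. The final result of this approach yields a lower Gauss quadrature bound, which, with \(\varvec{e}_1\in {\mathbb {R}}^k\) denoting the first unit vector, reads
$$\varvec{e}_i^T f(\varvec{A}) \varvec{e}_i \approx \varvec{e}_1^T \varvec{S}_k f(\varvec{\Theta }_k) \varvec{S}_k^T \varvec{e}_1, \tag{12}$$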
where the computation of \(\varvec{T}_k=\varvec{S}_k \varvec{\Theta }_k \varvec{S}_k^T\) relies on the basis of the Krylov subspace from Eq. (10) with \(\varvec{v}=\varvec{e}_i\). We refer to Golub and Meurant (2009); Golub and Welsch (1969); Golub and Meurant (1994, 1997) and Bergermann and Stoll (2021, Sec. 5.2.1) for theoretical background on this approximation as well as details on the construction of Gauss–Radau and Gauss–Lobatto rules, which yield an additional lower as well as two upper bounds on \(\varvec{e}_i^T f(\varvec{A}) \varvec{e}_i\). Note that due to the high degree of sparsity in the supra-adjacency matrices of our multiplex public transport networks, Gauss quadrature rules can only be applied to the small subset of non-isolated node-layer pairs. As described in the previous subsection, the centrality value of the remaining node-layer pairs is set to 1.
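The Gauss quadrature estimate of a single diagonal entry can be sketched in the same spirit. The code below runs Lanczos with starting vector \(\varvec{e}_i\) and evaluates \(\varvec{e}_1^T f(\varvec{T}_k) \varvec{e}_1\); it is a simplified illustration without reorthogonalization and without the Gauss–Radau and Gauss–Lobatto rules, and the function names, test graph, and parameters are assumptions.

```python
import numpy as np
from scipy.linalg import expm

def lanczos_tridiag(A, v, k):
    """Tridiagonal Lanczos matrix after at most k steps (no reorthogonalization)."""
    q = v / np.linalg.norm(v)
    q_prev = np.zeros_like(q)
    alphas, betas, b = [], [], 0.0
    for j in range(k):
        w = A @ q - b * q_prev
        a = q @ w
        w = w - a * q
        alphas.append(a)
        b = np.linalg.norm(w)
        if j == k - 1 or b < 1e-12:
            break
        betas.append(b)
        q_prev, q = q, w / b
    return np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)

def gauss_estimate(A, i, f, k=10):
    """Gauss quadrature estimate e_1^T f(T_k) e_1 of e_i^T f(A) e_i."""
    e_i = np.zeros(A.shape[0])
    e_i[i] = 1.0
    T = lanczos_tridiag(A, e_i, k)
    theta, S = np.linalg.eigh(T)
    return float(np.sum(S[0, :] ** 2 * f(theta)))

# Test graph: a path on 30 nodes plus three chords.
N = 30
A = np.zeros((N, N))
for i, j in [(i, i + 1) for i in range(N - 1)] + [(0, 10), (5, 20), (3, 25)]:
    A[i, j] = A[j, i] = 1.0

beta = 0.5
f = lambda t: np.exp(beta * t)
sc_exact = expm(beta * A)[5, 5]
sc_k3 = gauss_estimate(A, 5, f, k=3)
sc_k10 = gauss_estimate(A, 5, f, k=10)
print(sc_k3, sc_k10, sc_exact)
```

For the exponential, whose even derivatives are positive, the Gauss rule underestimates the exact value, so a small number of iterations already yields a usable lower bound that tightens as k grows.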
All numerical methods introduced in this subsection scale linearly in the number of node-layer pairs under the assumption of sparsity in the supra-adjacency matrix, cf. (Bergermann and Stoll 2021, Sec. 5), and usually obtain highly accurate approximations in only around 10 Lanczos iterations. However, the quantities \(\varvec{e}_i^T f(\varvec{A}) \varvec{e}_i\) require the approximation of a separate quantity for each non-isolated node-layer pair, which makes them computationally more demanding than the quantities \(f(\varvec{A})\varvec{1}\), which only need to be computed once. As discussed in Bergermann and Stoll (2021, Sec. 6.1), the convergence of Eq. (12) is usually faster than that of Eq. (11). A Python implementation of the above methods is available at https://github.com/KBergermann/Urbanmultiplexnetworks.
Results
In this subsection, we present numerical results of the different matrix function-based centrality measures applied to the multiplex network representation of several German urban public transport networks. For the interpretation of the resulting rankings of node-layer pairs (representing stop-line pairs), we mainly rely on marginal node centralities defined in Eq. (9). This approach corresponds to the identification and ranking of the most central locations in urban networks, which has been the subject of many earlier studies with different centrality measures (Crucitti et al. 2006a, b; Porta et al. 2006a; Scheurer and Porta 2006; To 2015; Nourian et al. 2016; Agryzkov et al. 2019; Curado et al. 2021). We compare different weight models and matrix function-based centrality measures, which were introduced in the preceding sections. Furthermore, we study the impact of all involved hyperparameters and close with a discussion of the obtained results including their relation to public transport orientations.
Previous results from the literature on the application of centrality measures to urban transport networks indicate qualitative differences in the distribution of the most central nodes between the cases of public transport (Scheurer and Porta 2006; To 2015; Curado et al. 2021) and street networks (Crucitti et al. 2006a, b; Porta et al. 2006a; Nourian et al. 2016; Agryzkov et al. 2019). As street networks are usually modeled as (almost) planar graphs in which the “closeness” of nodes is determined by their geographical distance, the distribution of central nodes is usually characterized by a relatively smooth transition from more to less central nodes in terms of this geographical distance. For instance, one often observes approximately circular shapes of descending centrality around the most central node of such a network (Crucitti et al. 2006a, b; Porta et al. 2006a; Nourian et al. 2016; Agryzkov et al. 2019). Conversely, in public transport networks the “closeness” of nodes is typically tied to different indicators such as travel times or line frequencies. This results in a less smooth distribution of node centralities with respect to the geographical position of the nodes. For instance, stops that are geographically very close to the most central stop of a public transport network can be classified as non-central if the two stops are not directly connected by any line (Scheurer and Porta 2006; To 2015; Curado et al. 2021). This general qualitative behavior is confirmed by the multiplex matrix function-based centrality measures and can be observed throughout the subsequent results. Note that a combination of the different modeling approaches by, e.g., adding car or walking layers to public transport networks could mitigate this effect and constitutes an interesting future research direction.
To illustrate this general behavior for our modeling approach we start by considering marginal layer and marginal node subgraph centralities of the German city Halle (Saale) as a first example. Figure 7a illustrates the ranking of lines while Fig. 7b illustrates the ranking of stops in the corresponding multiplex public transport network. All marginal centrality plots in this paper employ a heat map color scheme consisting of six categories representing centrality value subintervals of equal length. While there is a tendency of geographically central stops to be classified as central public transport stops there are various exceptions in the form of dark blue (least central) stops in or around the city’s geographical center, cf. Fig. 7b. Instead, the comparison with Fig. 7a shows that there is a tendency of central stops to line up along important transport axes of the urban public transport system. It is also interesting to note that the orientations of the most central lines in Fig. 7a coincide with the preferential directions of the public transport network identified in Fig. 3.
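The six-category color scheme can be reproduced by binning centrality values into subintervals of equal length. The sketch below assumes that the subintervals partition the range between the smallest and the largest observed value; this choice, the function name, and the example values are assumptions rather than details taken from the plots.

```python
import numpy as np

def centrality_categories(values, n_categories=6):
    """Assign each value to one of n equal-length subintervals of [min, max].

    Category 0 is the least central, category n_categories - 1 the most central.
    """
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_categories + 1)
    # Only interior edges are needed; the maximum falls into the top category.
    return np.digitize(values, edges[1:-1])

# Made-up marginal node centralities.
categories = centrality_categories([0.0, 1.0, 5.0, 6.0, 12.0])
print(categories)
```

With equal-length subintervals, a few very large centrality values push most stops into the lowest categories, which is the visual effect of a steep centrality distribution.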
Comparison of different measures
We continue our discussion with the comparison of the four matrix function-based centrality measures for the multiplex public transport networks defined in Eqs. (7) and (8), using the example of marginal node centralities of Cologne.
Figure 8 reflects the typical behavior of matrix function-based centrality measures that rankings produced by the four measures are similar but not equal (Benzi and Boito 2020; Bergermann and Stoll 2021). For the multiplex public transport networks considered in this paper one typically observes very similar results for SC and \(SC_{\mathrm {res}}\) as well as for TC and KC. However, SC and \(SC_{\mathrm {res}}\) typically show a centrality value distribution that is more uniformly distributed across the six centrality value categories, which results in more stops being classified into the top three categories (marked in red and orange) than is the case for TC and KC. The difference between the two groups of measures is that SC and \(SC_{\mathrm {res}}\) are defined by the matrix functions’ diagonal entries, while off-diagonal entries of the matrix functions are additionally included in TC and KC. This corresponds to SC and \(SC_{\mathrm {res}}\) only considering closed walks on the networks whereas TC and KC consider all walks. It thus follows that the inclusion of the off-diagonals amplifies the quantitative ranking of the stops due to a steeper distribution of the off-diagonals in comparison to the diagonal entries of the matrix functions.
Comparison of different weight models
We also compare the different weight models discussed in the "Multiplex network model" section. The modeling approaches include all combinations of including and excluding travel times and line frequencies in the intra-layer weights \(w^{(l)}_{ij}\) as specified in Eq. (4). Whenever intra-layer travel times are included in \(w^{(l)}_{ij}\) we also include the transfer time between lines in the coupling parameter \(\omega\) as defined in Eq. (5).
Figure 9 presents marginal node resolvent-based subgraph centralities of Stuttgart in the different scenarios. It shows that the inclusion of line frequencies has a larger impact on the centralities than travel times do. Interestingly, the exclusion of line frequencies has the effect of favoring stops with large inter-layer degrees, i.e., stops with many transfer options. In the case of Fig. 9a, c this leads to “Stuttgart Universität” (Stuttgart university) being classified as the most central stop of the city despite being geographically located near the city limit. Later in the "Impact of transfer times" section we discuss an interesting property of weight models without frequencies regarding the influence of transfer times on marginal node centralities.
Figure 9b, d, which include line frequencies, show a stop ranking for Stuttgart that is likely closer to expectations. Here, the increased frequencies of lines serving the geographical center of the city shift the highest ranked stops to this central region.
Overall, Fig. 9 suggests that travel times do not have a large impact on the obtained stop rankings. However, their inclusion adds an important ingredient for a realistic modeling of urban public transport networks as it allows the inclusion of transfer times between lines.
In the following three subsections, we discuss the influence of all involved hyperparameters on the obtained stop rankings. These parameters include the normalizing travel time \(\sigma\), the matrix function parameters \(\alpha\) and \(\beta\), and the transfer time \(\Delta t_{\mathrm {transfer}}\).
Influence of the normalizing travel time
Numerical experiments suggest that the normalizing travel time \(\sigma\) affects the obtained rankings only slightly and only within a small parameter range: while increasing \(\sigma\) from 0.2 to 2 (corresponding to 12 s and 2 min normalizing travel time, respectively) slightly increases the number of highly ranked stops in the city center, a further increase of \(\sigma\) to 20 or 200 min has no noticeable effect.
From local to global
As proven for single-layer networks in Benzi and Klymko (2015) and as confirmed empirically for layer-coupled multiplex networks in Bergermann and Stoll (2021, Sec. 6), all four matrix function-based centrality measures defined in Eqs. (7) and (8) contain degree and eigenvector centrality as limit cases of the parameters \(\alpha\) or \(\beta\). In this subsection, we empirically confirm this behavior for marginal node centralities of our more general multiplex network models.
We illustrate this using the example of marginal node Katz centralities of Düsseldorf in Fig. 10. As discussed before, the admissible parameter range for \(\alpha\) for this measure is given by \((0, 1/\lambda _{\mathrm {max}})\), where \(\lambda _{\mathrm {max}}\) is the largest eigenvalue of the supra-adjacency matrix. In Fig. 10a, \(\alpha =0.01/\lambda _{\mathrm {max}}\) is chosen to be close to the lower end of that interval, which corresponds to being close to degree centrality in which only direct neighbors of each node-layer pair are considered. The other extreme is represented by Fig. 10c in which \(\alpha =0.99/\lambda _{\mathrm {max}}\) is chosen to be close to the upper end of the admissible interval, which corresponds to being close to eigenvector centrality in which nodes are considered central if their closest neighbors are also central.
It can be seen in Fig. 10a that the local case is characterized by a relatively uniform distribution of marginal node centralities into the six centrality categories. Furthermore, relatively central stops are geographically distributed across the whole city. In this situation, each local neighborhood can possess its own central stops, regardless of their relative importance for the whole city. Conversely, eigenvector centrality in Fig. 10c considers the stationary distribution of any initial distribution of walkers on the node-layer pairs of the network. This often leads to localization effects (Martin et al. 2014), which entail a centrality distribution that is almost uniform for all but a few very central stops. The strength of matrix function-based centrality measures now lies in the fact that the choice of \(\alpha\) allows one to interpolate continuously between these two established concepts of centrality measures. In Fig. 10b, the choice of \(\alpha =0.75/\lambda _{\mathrm {max}}\) illustrates this property by being “visually in between” the two extreme cases.
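The interpolation between degree and eigenvector centrality can be reproduced on a small single-layer graph. The sketch below evaluates Katz centrality for the three values of \(\alpha\) used above and compares the extreme cases with degree and eigenvector centrality via correlation coefficients; the example graph (a 4-clique with an attached path) and the helper `katz` are assumptions for illustration.

```python
import numpy as np

# Clique on nodes 0-3 with a path 3 - 4 - 5 - 6 attached.
n = 7
A = np.zeros((n, n))
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6)]
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

lam, vecs = np.linalg.eigh(A)
lam_max = lam[-1]
eigvec = np.abs(vecs[:, -1])  # eigenvector centrality (Perron vector)
degree = A.sum(axis=1)        # degree centrality

def katz(A, alpha):
    """Katz centrality via the linear system (I - alpha A) x = 1."""
    m = A.shape[0]
    return np.linalg.solve(np.eye(m) - alpha * A, np.ones(m))

kc_local = katz(A, 0.01 / lam_max)
kc_mid = katz(A, 0.75 / lam_max)
kc_global = katz(A, 0.99 / lam_max)

corr_local_degree = np.corrcoef(kc_local, degree)[0, 1]
corr_global_eig = np.corrcoef(kc_global, eigvec)[0, 1]
print(corr_local_degree, corr_global_eig)
```

For small \(\alpha\) the Katz vector is approximately \(\varvec{1} + \alpha \varvec{A}\varvec{1}\), i.e., an affine function of the degrees, while for \(\alpha\) close to \(1/\lambda _{\mathrm {max}}\) the contribution of the dominant eigenvector outweighs all others.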
Impact of transfer times
In Eq. (5) we specified our approach to modeling a constant transfer time across all lines and stops of the network by means of the coupling parameter \(\omega\), which is defined by a Gaussian kernel applied to the transfer time \(\Delta t_{\mathrm {transfer}}\). We start by presenting an example of a multiplex network of Chemnitz in which intra-layer edges are weighted with travel times and line frequencies. However, an interesting property more frequently emerges when line frequencies are excluded from the intra-layer weight model. We conclude the results section with a parameter study of the coupling parameter \(\omega\) in the situation without frequencies.
Figure 11 illustrates marginal node total communicabilities of Chemnitz with the constant transfer time varying between 0 and 15 min. Assigning no cost for changing lines in Fig. 11a leads to Chemnitz’s stop “Zentralhaltestelle” being ranked as the most central stop of the city by a large margin. The role of this stop is encoded in its name, which literally translates to “central stop”. The public transport system of Chemnitz is organized such that most of the city’s lines stop at this geographically central stop, i.e., it offers by far the most opportunities to change lines. However, this characteristic loses its significance once a cost for changing lines is introduced. Figure 11b shows that many other stops in and outside of the geographical city center are classified as central when the transfer time is increased to 5 min. Figure 11c illustrates that a further increase from 5 to 15 min (and above) has no significant additional impact. This behavior, however, emerges only for a relatively small normalizing travel time of \(\sigma =1\). For larger parameters \(\sigma\) or other cities with a more uniform distribution of inter-layer degrees this behavior is less pronounced when line frequencies are included in the weight model.
Excluding line frequencies from the weight model leads to a stronger localization of marginal node centralities in the situation \(\Delta t_{\mathrm {transfer}} \rightarrow 0\) across different networks and a larger range of \(\sigma\). The effect of the variation of \(\Delta t_{\mathrm {transfer}}\) and subsequently \(\omega\) on marginal layer and marginal node centralities is illustrated in Fig. 12. Here, we observe an interesting clustering behavior of marginal node centralities in the strong coupling limit in which \(\omega\) approaches its maximum value of 1 corresponding to \(\Delta t_{\mathrm {transfer}}=0\). These clusters are determined by the stops’ inter-layer degrees, i.e., by the number of lines stopping at the corresponding stop. This behavior reflects the accessibility of a larger part of the network when a larger number of lines can be used within a constant total travel time (defined as the sum of intra- and inter-layer travel times).
Another interesting observation in Fig. 12 concerns the peak in most marginal layer and marginal node centralities between \(\omega =0.01\) and \(\omega =0.1\). This phenomenon has been similarly encountered in other applications before (Bergermann and Stoll 2021, Sec. 6.3) and is an interesting question for future research.
Discussion
The preceding subsections gave a detailed account of the differences between the introduced centrality measures and weight models as well as the influence of all involved hyperparameters. Earlier in this paper, we commented on conclusions obtained from public transport network orientations and their relation to the corresponding street network orientations. We now discuss the relation between these two aspects of urban public transport networks.
Figure 13 provides an overview of marginal node Katz centralities for all 36 German cities from Fig. 3 using a common set of hyperparameters, which allows a direct comparison of the results. While we identified a clear pattern in the public transport network orientations across all German cities the centrality plots in Fig. 13 give a mixed picture. Here, we observe various different distributions of central stops both in terms of the relative number of central stops per city as well as their geographical arrangement.
In Fig. 14 we focus on four example cities with rather different centrality distributions, which are not necessarily reflected in the corresponding public transport orientation plots. Figure 14a shows that Bielefeld has only a small number of central stops located at the geographical city center. This center is the origin of various sequences of light blue stops (indicating the second least central category), which seem to spread out equally in all directions and not only in the preferential directions of the corresponding orientation plot. Figure 14b reveals that several central stops are spread out over the city area of Munich. However, the connections between these central stops cannot clearly be linked to the preferential direction of the corresponding orientation plot. In Fig. 14c, d and several other examples we observe highly central sequences of stops lining up along major public transport axes. In some cases, the directions of these axes can be linked to the most pronounced bearings of the corresponding orientation plots, cf. Karlsruhe in Fig. 14d but also Braunschweig, Erfurt, Halle (Saale), Hanover, Münster, and Rostock. This pattern, however, cannot be observed in other examples such as Duisburg in Fig. 14c, Cologne, or Stuttgart.
We conclude that orientation plots of street and public transport networks are capable of revealing interesting high-level properties of urban systems. For a more detailed analysis of the geometrical properties of cities, such as the spatial distribution of major transport axes, additional measures must be taken into account. This manuscript proposes multiplex matrix function-based centralities as one such measure.
Conclusion
We studied two geometrical aspects of urban public transport networks: orientations and centralities. We determined orientations of directed public transport networks of the 36 largest German as well as 18 major European cities and compared them to orientations of undirected street networks. All considered German cities revealed two more or less pronounced orthogonal preferential street network directions, which can often be linked to geographical constraints. We found that most considered German public transport network orientations concentrate around whichever of the two preferential street network directions is closer to the cardinal east-west axis. The same qualitative behavior could only be observed for a small subset of the considered European cities. However, north-south-like preferential public transport directions remained rare.
Furthermore, we formally introduced urban public transport multiplex networks in which nodes correspond to stops and layers to lines and applied multiplex matrix function-based centrality measures in order to identify and rank the most central lines and stops of the considered German cities. These measures generate rankings that are consistent with previous investigations. In addition, they offer the benefit of being able to flexibly choose the desired degree of locality, possessing efficient and scalable numerical implementations, and being applicable to a wide range of problems. The influence of different hyperparameters, which all have meaningful interpretations in terms of the urban science application, was thoroughly studied. Our study showed that matrix function-based centralities are capable of revealing insights into geometrical aspects of urban systems on a more granular level than orientation plots are.
We believe that multiplex models of cities, ideally combining various aspects such as different modes of transportation or additional aspects of urban life, can contribute to a better understanding of urban systems. We hope that the presented methodology can add to urban scientists’ toolkits and that city-specific modeling as well as parameter tuning by domain experts yield further contributions to the economic, environmental, and social challenges lying ahead.
Availability of data and materials
All data sets and codes used in this paper are publicly available. Street network data can be obtained via the OSMnx Python package https://osmnx.readthedocs.io/en/stable/. German GTFS data is available under https://gtfs.de/de/feeds/de_nv/ on a daily basis, and the data set for April 22nd, 2021, which forms the basis of our numerical experiments on German cities, can be downloaded from https://www.tuchemnitz.de/mathematik/wire/pubs/gtfsdata.tar.gz. GTFS data of selected European cities can be downloaded from https://transitfeeds.com/l/60europe. All numerical experiments presented in this paper can be reproduced by the Python implementation, which is publicly available under https://github.com/KBergermann/Urbanmultiplexnetworks.
Notes
Eigenvector centrality is defined by the eigenvector belonging to the largest eigenvalue of a suitable matrix, e.g., the graph’s supra-adjacency matrix, cf. Taylor et al. (2017, 2019, 2021). The unique existence of this eigenvector can, however, only be guaranteed if the assumptions of the Perron–Frobenius theorem are satisfied [Taylor et al. 2021, Thm. 3.7]. This restriction does not apply to matrix function-based centrality measures, cf. e.g. [Bergermann and Stoll 2021, Sec. 6.3] for an example. Note, however, that variants of eigenvector centrality circumventing this shortcoming exist, cf. e.g. Tudisco et al. (2018).
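The uniqueness issue can be illustrated on a toy graph that is not taken from the paper: two disconnected edges, whose adjacency matrix is reducible and has a largest eigenvalue of multiplicity two. In the sketch below, a shifted power iteration (an assumed stand-in for any eigenvector solver) returns different "eigenvector centralities" depending on the start vector, whereas total communicability, approximated here by a truncated Taylor series, is a single well-defined vector. All names and parameter values are illustrative.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def shifted_power_iteration(A, start, num_iters=200):
    """Power iteration on A + I, normalized in the 1-norm.

    The shift removes the oscillation caused by the eigenvalue -1 of this
    bipartite example; it does not change the eigenvectors of A.
    """
    x = list(start)
    for _ in range(num_iters):
        y = [ax + xi for ax, xi in zip(matvec(A, x), x)]
        total = sum(abs(v) for v in y)
        x = [v / total for v in y]
    return x


def total_communicability(A, beta=0.5, num_terms=30):
    """Approximate exp(beta * A) applied to the all-ones vector by a truncated Taylor series."""
    term = [1.0] * len(A)
    result = term[:]
    for k in range(1, num_terms + 1):
        term = [beta / k * t for t in matvec(A, term)]
        result = [r + t for r, t in zip(result, term)]
    return result


# Two disconnected edges 0-1 and 2-3.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
ev_uniform = shifted_power_iteration(A, [1.0, 1.0, 1.0, 1.0])
ev_biased = shifted_power_iteration(A, [3.0, 1.0, 1.0, 1.0])
tc = total_communicability(A, beta=0.5)
print(ev_uniform, ev_biased, tc)
```

Both iterates are valid eigenvectors for the largest eigenvalue, yet they rank the two components differently. The total communicability vector assigns the same score to all four nodes, as the symmetry of the graph suggests.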
Abbreviations
 GTFS: General transit feed specification (public transport data format)
 SC: Subgraph centrality
 SC_{res}: Resolvent-based subgraph centrality
 TC: Total communicability
 KC: Katz centrality
 JC: Joint centrality
 MNC: Marginal node centrality
 MLC: Marginal layer centrality
References
Acuto M, Parnell S, Seto KC (2018) Building a global urban science. Nat Sustain 1(1):2–4. https://doi.org/10.1038/s4189301700139
Agryzkov T, Tortosa L, Vicent JF, Wilson R (2019) A centrality measure for urban networks based on the eigenvector centrality concept. Environ Plan B Urban Anal City Sci 46(4):668–689. https://doi.org/10.1177/2399808317724444
Alessandretti L, Karsai M, Gauvin L (2016) User-based representation of time-resolved multimodal public transportation networks. R Soc Open Sci 3(7):160156. https://doi.org/10.1098/rsos.160156
Aleta A, Meloni S, Moreno Y (2017) A multilayer perspective for the analysis of urban transportation systems. Sci Rep 7(1):1–9. https://doi.org/10.1038/srep44359
Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512. https://doi.org/10.1126/science.286.5439.509
Barthélemy M (2011) Spatial networks. Phys Rep 499(1–3):1–101. https://doi.org/10.1016/j.physrep.2010.11.002
Barthélemy M, Flammini A (2008) Modeling urban street patterns. Phys Rev Lett 100(13):138702. https://doi.org/10.1103/PhysRevLett.100.138702
Barthélemy M, Bordin P, Berestycki H, Gribaudi M (2013) Self-organization versus top-down planning in the evolution of a city. Sci Rep 3(1):1–8. https://doi.org/10.1038/srep02153
Barthélemy M (2016) The structure and dynamics of cities. Cambridge University Press, Cambridge. https://doi.org/10.1017/9781316271377
Bast H, Delling D, Goldberg A, Müller-Hannemann M, Pajor T, Sanders P, Wagner D, Werneck RF (2016) Route planning in transportation networks. In: Algorithm engineering, pp. 19–80. Springer, Switzerland. https://doi.org/10.1007/9783319494876
Batty M (2008) The size, scale, and shape of cities. Science 319(5864):769–771. https://doi.org/10.1126/science.1151419
Benzi M, Boito P (2020) Matrix functions in network analysis. GAMM-Mitteilungen 43(3):202000012. https://doi.org/10.1002/gamm.202000012
Benzi M, Klymko C (2013) Total communicability as a centrality measure. J Complex Netw 1(2):124–149. https://doi.org/10.1093/comnet/cnt007
Benzi M, Klymko C (2015) On the limiting behavior of parameter-dependent network centrality measures. SIAM J Matrix Anal Appl 36(2):686–706. https://doi.org/10.1137/130950550
Bergermann K, Stoll M (2021) Matrix function-based centrality measures for layer-coupled multiplex networks. arXiv:2104.14368
Bergermann K, Stoll M, Volkmer T (2021) Semi-supervised learning for aggregated multilayer graphs using diffuse interface methods and fast matrix-vector products. SIAM J Math Data Sci 3(2):758–785. https://doi.org/10.1137/20M1352028
Boccaletti S, Bianconi G, Criado R, Del Genio CI, Gómez-Gardenes J, Romance M, Sendina-Nadal I, Wang Z, Zanin M (2014) The structure and dynamics of multilayer networks. Phys Rep 544(1):1–122. https://doi.org/10.1016/j.physrep.2014.07.001
Boeing G (2017) OSMnx: new methods for acquiring, constructing, analyzing, and visualizing complex street networks. Comput Environ Urban Syst 65:126–139. https://doi.org/10.1016/j.compenvurbsys.2017.05.004
Boeing G (2019) Urban spatial order: street network orientation, configuration, and entropy. Appl Netw Sci 4(1):1–19. https://doi.org/10.1007/s4110901901891
Bolay JC (2020) Urban planning against poverty: how to think and do better cities in the global south. Springer, Switzerland. https://doi.org/10.1007/9783030284190
Bonacich P (1987) Power and centrality: a family of measures. Am J Sociol 92(5):1170–1182. https://doi.org/10.1086/228631
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30(1–7):107–117. https://doi.org/10.1016/S01697552(98)00110X
Chan SH, Donner RV, Lämmer S (2011) Urban road networks - spatial networks with universal geometric features? Eur Phys J B 84(4):563–577. https://doi.org/10.1140/epjb/e2011108893
Courtat T, Gloaguen C, Douady S (2011) Mathematics and morphogenesis of cities: a geometrical approach. Phys Rev E 83(3):036106. https://doi.org/10.1103/PhysRevE.83.036106
Crucitti P, Latora V, Porta S (2006a) Centrality measures in spatial networks of urban streets. Phys Rev E 73(3):036125. https://doi.org/10.1103/PhysRevE.73.036125
Crucitti P, Latora V, Porta S (2006b) Centrality in networks of urban streets. Chaos Interdiscip J Nonlinear Sci 16(1):015113. https://doi.org/10.1063/1.2150162
Curado M, Tortosa L, Vicent JF, Yeghikyan G (2021) Understanding mobility in Rome by means of a multiplex network with data. J Comput Sci 51:101305. https://doi.org/10.1016/j.jocs.2021.101305
De Domenico M, Solé-Ribalta A, Omodei E, Gómez S, Arenas A (2015) Ranking in interconnected multilayer networks reveals versatile nodes. Nat Commun 6(1):1–6. https://doi.org/10.1038/ncomms7868
Estrada E (2012) The structure of complex networks: theory and applications. Oxford University Press, Inc., Oxford. https://doi.org/10.1093/acprof:oso/9780199591756.001.0001
Estrada E, Higham DJ (2010) Network properties revealed through matrix functions. SIAM Rev 52(4):696–714. https://doi.org/10.1137/090761070
Estrada E, Rodriguez-Velazquez JA (2005) Subgraph centrality in complex networks. Phys Rev E 71(5):056103. https://doi.org/10.1103/PhysRevE.71.056103
Euler L (1741) Solutio problematis ad geometriam situs pertinentis. Comment Acad Sci Petropolitanae, 128–140
Freeman LC (1977) A set of measures of centrality based on betweenness. Sociometry 40(1):35–41. https://doi.org/10.2307/3033543
Freeman LC (1978) Centrality in social networks conceptual clarification. Soc Netw 1(3):215–239. https://doi.org/10.1016/03788733(78)900217
Golub GH, Van Loan CF (2013) Matrix computations, vol 3. JHU Press, Baltimore
Golub GH, Meurant G (1994) Matrices, moments and quadrature. Pitman Res Notes Math Ser 303:105–156
Golub GH, Meurant G (1997) Matrices, moments and quadrature II; How to compute the norm of the error in iterative methods. BIT Numer Math 37(3):687–705. https://doi.org/10.1007/BF02510247
Golub GH, Welsch JH (1969) Calculation of Gauss quadrature rules. Math Comput 23(106):221–230. https://doi.org/10.1090/S0025571869996471
Golub GH, Meurant G (2009) Matrices, moments and quadrature with applications. Princeton University Press, Princeton. https://doi.org/10.1515/9781400833887
Gudmundsson A, Mohajeri N (2013) Entropy and order in urban street networks. Sci Rep 3(1):1–8. https://doi.org/10.1038/srep03324
Hellervik A, Nilsson L, Andersson C (2019) Preferential centrality—a new measure unifying urban activity, attraction and accessibility. Environ Plan B Urban Anal City Sci 46(7):1331–1346. https://doi.org/10.1177/2399808318812888
Higham NJ (2008) Functions of matrices: theory and computation. SIAM, New York. https://doi.org/10.1137/1.9780898717778
Hong J, Tamakloe R, Lee S, Park D (2019) Exploring the topological characteristics of complex public transportation networks: focus on variations in both single and integrated systems in the Seoul metropolitan area. Sustainability 11(19):5404. https://doi.org/10.3390/su11195404
https://wiki.openstreetmap.org/wiki/Public_transport. Accessed 17 July 2021
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18(1):39–43. https://doi.org/10.1007/BF02289026
Kivelä M, Arenas A, Barthélemy M, Gleeson JP, Moreno Y, Porter MA (2014) Multilayer networks. J Complex Netw 2(3):203–271. https://doi.org/10.1093/comnet/cnu016
Kivelä M (2017) Multilayer networks library for python (pymnet). https://github.com/bolozna/Multilayernetworkslibrary
Kleinberg JM (1999) Authoritative sources in a hyperlinked environment. J ACM (JACM) 46(5):604–632. https://doi.org/10.1145/324133.324140
Lanczos C (1950) An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. United States Governm. Press Office Los Angeles, CA, USA. https://doi.org/10.6028/jres.045.026
Martin T, Zhang X, Newman ME (2014) Localization and centrality in networks. Phys Rev E 90(5):052808. https://doi.org/10.1103/PhysRevE.90.052808
Milgram S (1967) The small world problem. Psychol Today 2(1):60–67
Mohajeri N, French JR, Batty M (2013) Evolution and entropy in the organization of urban street patterns. Ann GIS 19(1):1–16. https://doi.org/10.1080/19475683.2012.758175
Nourian P, Rezvani S, Sariyildiz I, van der Hoeven F (2016) Spectral modelling for spatial network analysis. In: Proceedings of the symposium on simulation for architecture and urban design (simAUD 2016). SimAUD
Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab
Pallares-Barbera M, Badia A, Duch J (2011) Cerdá and Barcelona: the need for a new city and service provision. Urbani Izziv 22:122–136. https://doi.org/10.5379/urbaniizziven20112202005
Porta S, Crucitti P, Latora V (2006a) The network analysis of urban streets: a dual approach. Physica A 369(2):853–866. https://doi.org/10.1016/j.physa.2005.12.063
Porta S, Crucitti P, Latora V (2006b) The network analysis of urban streets: a primal approach. Environ Plann B Plann Des 33(5):705–725. https://doi.org/10.1068/b32045
Scheurer J, Porta S (2006) Centrality and connectivity in public transport networks and their significance for transport sustainability in cities. In: World planning schools congress, global planning association education network
Schölkopf B, Smola AJ (2002) Learning with Kernels: support vector machines, regularization, optimization, and beyond. MIT press, Cambridge. https://doi.org/10.7551/mitpress/4175.001.0001
Sharifi A (2019) Resilient urban forms: a review of literature on streets and street networks. Build Environ 147:171–187. https://doi.org/10.1016/j.buildenv.2018.09.040
Stoll M (2020) A literature survey of matrix methods for data science. GAMM-Mitteilungen 43(3):202000013. https://doi.org/10.1002/gamm.202000013
Strano E, Shai S, Dobson S, Barthélemy M (2015) Multiplex networks in metropolitan areas: generic features and local effects. J R Soc Interface 12(111):20150651. https://doi.org/10.1098/rsif.2015.0651
Taylor D, Myers SA, Clauset A, Porter MA, Mucha PJ (2017) Eigenvector-based centrality measures for temporal networks. Multiscale Model Simul 15(1):537–574. https://doi.org/10.1137/16M1066142
Taylor D, Porter MA, Mucha PJ (2019) Supracentrality analysis of temporal networks with directed interlayer coupling. In: Temporal network theory, pp. 325–344. Springer, Switzerland. https://doi.org/10.1007/9783030234959
Taylor D, Porter MA, Mucha PJ (2021) Tunable eigenvector-based centralities for multiplex and temporal networks. Multiscale Model Simul 19(1):113–147. https://doi.org/10.1137/19M1262632
To W (2015) Centrality of an urban rail system. Urban Rail Transit 1(4):249–256. https://doi.org/10.1007/s4086401600313
Tudisco F, Arrigo F, Gautier A (2018) Node and layer eigenvector centralities for multiplex networks. SIAM J Appl Math 78(2):853–876. https://doi.org/10.1137/17M1137668
Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416. https://doi.org/10.1007/s112220079033z
Wang K, Fu X (2017) Research on centrality of urban transport network nodes. In: AIP conference proceedings, vol 1839, p 020181. AIP Publishing LLC. https://doi.org/10.1063/1.4982546
Wang D, Wang H, Zou X (2017) Identifying key nodes in multilayer networks based on tensor decomposition. Chaos Interdiscip J Nonlinear Sci 27(6):063108. https://doi.org/10.1063/1.4985185
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440–442. https://doi.org/10.1038/30918
Wu M, He S, Zhang Y, Chen J, Sun Y, Liu YY, Zhang J, Poor HV (2019) A tensor-based framework for studying eigenvector multicentrality in multilayer networks. Proc Natl Acad Sci 116(31):15407–15413. https://doi.org/10.1073/pnas.1801378116
Zheng Z, Huang Z, Zhang F, Wang P (2018) Understanding coupling dynamics of public transportation networks. EPJ Data Sci 7:1–16. https://doi.org/10.1140/epjds/s1368801801486
Acknowledgements
We thank Peter Bernd Oehme for his support in processing the GTFS data and programming basic Python routines. We thank Geoff Boeing for pointing us in the right direction for reproducing street network orientations using OSMnx.
Funding
Open Access funding enabled and organized by Projekt DEAL. The publication of this article was funded by Chemnitz University of Technology.
Author information
Contributions
MS conceived and initiated the study. KB implemented the methods, conducted the analysis, and took the lead on writing the manuscript. Both authors contributed to the interpretation of the results and the completion of the manuscript. Both authors read and approved the submitted version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
The authors agree to the publication of the present manuscript in Applied Network Science in its current form.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bergermann, K., Stoll, M. Orientations and matrix function-based centralities in multiplex network analysis of urban public transport. Appl Netw Sci 6, 90 (2021). https://doi.org/10.1007/s41109021004299
DOI: https://doi.org/10.1007/s41109021004299
Keywords
 Multiplex networks
 Urban systems
 Public transport
 Network orientation
 Matrix function-based centralities
Mathematics Subject Classification
 05C50
 05C82
 15A16
 65F60
 91D10
 94C15