The structure and behaviour of hierarchical infrastructure networks

Introduction
Infrastructure networks are critical to the functioning of modern societies, with a significant and growing dependence on them for everyday activities and quality of life (Boin and McConnell 2007; Cabinet Office 2010; Sterbenz et al. 2011). Despite this, they are often found to be vulnerable to failures, with numerous reported incidents having led to widespread disruption at large economic cost (Rinaldi et al. 2001; Andersson et al. 2005; Royal Academy of Engineering 2014; OFCOM 2018; OFWAT 2018; National Infrastructure Commission 2020; OFGEM 2020). The vulnerability of critical infrastructure networks has been an important research topic for many years and, despite many efforts to improve the resilience and robustness of these infrastructure systems to disruptions, large scale failures still occur. Better understanding the structure of such modern infrastructure networks is critical to learning their behavioural characteristics and thus ensuring their continued operation during disruptive events.
Real world infrastructure systems can be represented using network models, built fundamentally from a set of nodes (vertices) and edges (links), combining to create a graph or network (Newman 2003b; Boccaletti et al. 2006) with different characteristics dependent on the way these are connected. The desire to gain a greater understanding of real world networks led to the discovery of a power-law in the degree distribution of many real world networks, knowledge which evolved into the creation of more realistic graph models such as the small-world model by Watts and Strogatz (1998) and the scale-free model by Barabasi and Albert (1999). The small-world model is characterised by a low average path length and a high clustering coefficient, metrics which are reliant on the topological structure of the network. Real world networks, however, should be characterised not only on their topology but also on how they function in terms of the passage of data or goods over the network (Luca et al. 2006). This can result in a different set of characterisation methods being required, with the potential for new network organisational features to be identified.
In the decade or so following the introduction of the small-world and scale-free models, much analysis was done comparing these models to real world systems (Bassett and Bullmore 2006), with many studies finding similarities between networks and either model (Amaral et al. 2000; Jeong et al. 2000; Latora and Marchiori 2002; Bassett and Bullmore 2006). Further research, in part due to increased availability of datasets and greater computational power (Boccaletti et al. 2006), has shown that a hierarchical organisation exists in certain networks, with vertical levels of communities of nodes within the hierarchy that are dependent on a set of key parent nodes (Caldarelli et al. 2004; Costa and Silva 2006; Costa et al. 2007; Shekhtman and Havlin 2018). A number of critical infrastructure networks have been regarded as hierarchical, such as road networks (Yerra and Levinson 2005) and other transport systems (Leu et al. 2010), with the latter suggested to be represented by a tree network, a simple hierarchical structure (Ravasz et al. 2002; Costa and Silva 2006), similar to naturally occurring river networks (Dodds and Rothman 2000). Other networks, including language, actor, food web and metabolic systems (Clauset et al. 2008; Ravasz et al. 2002), have all been found to display a hierarchical nature. It has since been shown that a hierarchical network structure can make a network more vulnerable to failures (Jenelius 2009), though an embedded community structure improves resilience at a community level (Shai et al. 2015; Shekhtman and Havlin 2018). Such a structure is found within the hierarchical communities model proposed by Ravasz and Barabasi (2003) to represent the structure of metabolic networks, a hierarchical model with communities of nodes embedded within. A study by Shai et al. (2015) using graphs with a 2-level hierarchy has suggested that the community structure within a network can result in a hierarchical graph behaving like a random graph, though it has also been suggested that infrastructure networks can have a greater number of levels (Yerra and Levinson 2005).
These studies provide insights for particular networks, but no broader analysis of the characteristics of these hierarchical networks, or comparison of such networks against each other or against the original graph models (scale-free and small-world), has been presented. This paper will therefore present a more detailed comparison between these hierarchical networks and non-hierarchical exemplars, including scale-free and small-world graph types, which will facilitate a detailed investigation into not only the characteristics which make these classes of networks different, but also how these affect their robustness to failures. Using a suite of 42 real world spatial infrastructure networks, the presence of hierarchically-structured infrastructure networks will also be explored, along with the trade-offs between hierarchical and non-hierarchical network structures, including the effect on the robustness to perturbations.

Method
Two suites of networks were used to explore the extent of similarities between synthetic graph models and real world infrastructure networks: a synthetic network suite and a suite of real world infrastructure examples. Through the inclusion of synthetic hierarchical models, the characteristics of hierarchical networks are learned and applied to the suite of real world infrastructure networks. The robustness of both suites to perturbations is analysed to provide an insight into the response of hierarchical networks compared to non-hierarchical networks.

Synthetic network suite
To generate a clear understanding of the characteristics of existing and emerging graph models, a suite of eight graph models has been generated, culminating in a total of 6038 graphs. The suite extends from networks generated with random graph models (2000 realisations), through scale-free (1000 realisations) and small-world models (1000 realisations), to hierarchical models (2038 realisations). These have been generated using a combination of previously-developed algorithms to create a range of common graph models including the Erdos-Renyi (ER) model (Erdos and Renyi 1959) for random networks, the Watts-Strogatz (WS) model (Watts and Strogatz 1998) for small-world networks, the Barabasi-Albert (BA) model (Barabasi and Albert 1999) for scale-free networks, and the balanced tree model for Tree networks (Fig. 1a), all available in the employed Python NetworkX library (NetworkX 2020). For hierarchical networks, three further algorithms have been implemented. Firstly, to capture the presence of community structure as found within some hierarchical networks (Ravasz et al. 2002; Yerra and Levinson 2005; Shai et al. 2015; Shekhtman and Havlin 2018), a model devised by Ravasz et al. (2002) and Ravasz and Barabasi (2003) (Fig. 1b) has been implemented. This captures greater complexity and more hierarchical levels than the stochastic block model used by Shekhtman and Havlin (2018) and the model used by Shai et al. (2015), mentioned earlier, which are limited in their number of hierarchical levels.
Fig. 1 a A connected tree network of seven nodes. The vulnerability of the tree graph can be seen if we consider the graph with the top-most node removed, resulting in the graph fragmenting into two components. b A hierarchical community network, with each of the communities connected to the central node in the inner community (Ravasz and Barabasi 2003)
Two further models, variations on the tree model known as the HR and HR+ models, have been developed for this research. Both generate a less structured tree network through the addition of edges to the structure: the HR model adds edges randomly between nodes, whereas the HR+ model only adds the extra edges between nodes in the same or adjacent levels of the hierarchical tree structure. In both cases, new edges can be neither self-loops nor duplicates of existing edges. The number of new edges to add is calculated using Eq. 1, where E is the number of nodes and p a multiplier between 0 and 1.
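The edge-addition step described above can be sketched as follows. This is a minimal illustration rather than the implementation used for the paper: the function name `add_extra_edges`, the rounding of E × p to the nearest integer, and the use of distance from the root as a node's hierarchical level are all assumptions made for the example.

```python
import random

import networkx as nx


def add_extra_edges(tree, p, adjacent_levels_only=False, seed=None):
    """Return a copy of `tree` with round(E * p) extra edges added.

    Following the text, E is taken as the number of nodes. Rounding to the
    nearest integer is an assumption; the source does not specify it.
    adjacent_levels_only=False mimics HR, True mimics HR+.
    """
    rng = random.Random(seed)
    graph = tree.copy()
    nodes = list(graph.nodes)
    # Level of a node = distance from the root (node 0 in nx.balanced_tree).
    level = nx.single_source_shortest_path_length(tree, 0)
    n_add = round(graph.number_of_nodes() * p)

    # Admissible pairs: distinct nodes (no self-loops) with no existing edge.
    candidates = [
        (u, v)
        for i, u in enumerate(nodes)
        for v in nodes[i + 1:]
        if not graph.has_edge(u, v)
        and (not adjacent_levels_only or abs(level[u] - level[v]) <= 1)
    ]
    n_add = min(n_add, len(candidates))
    graph.add_edges_from(rng.sample(candidates, n_add))
    return graph


tree = nx.balanced_tree(r=2, h=3)          # 15 nodes, 14 edges
hr = add_extra_edges(tree, p=0.2, seed=1)  # HR-style: any node pair
hr_plus = add_extra_edges(tree, p=0.2, adjacent_levels_only=True, seed=1)
print(hr.number_of_edges(), hr_plus.number_of_edges())
```

Sampling from an explicit list of admissible pairs guarantees that no self-loops or duplicate edges are created, at the cost of enumerating all node pairs, which is acceptable for graphs of up to a few thousand nodes.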
A limit of 2000 nodes has been set for the node count of each graph realisation (N = 2000), with realisations from 2 nodes to 2000 being generated so a clear variation in graph structure and complexity is seen within the resultant ensemble. For each graph model 1000 realisations are generated where possible. Those graphs with a more defined structure (such as the Tree and HC models) have no random component in their generation, so a limited number of realisations is possible within the set constraints. As a result there are 1000 realisations of each model except for the HC model, which has 7 realisations, and the Tree model, which has 31. Examples of each of the models are shown in Fig. 2, where (a) to (d) are to be referred to as non-hierarchical, and (e) to (h) as hierarchical as they all have a tree structure.
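As a rough illustration of how single realisations of the library-backed models might be produced with NetworkX, consider the sketch below. The parameter values (`p`, `k`, `m`, `r`, `h`) and the fixed seed are illustrative assumptions, not the settings used to build the suite.

```python
import networkx as nx

n = 200     # node count; the suite varies this from 2 to 2000
seed = 42   # fixed only so that the sketch is repeatable

# Parameter values are illustrative assumptions, not the paper's settings.
graphs = {
    "ER": nx.gnp_random_graph(n, p=0.05, seed=seed),          # random
    "WS": nx.watts_strogatz_graph(n, k=4, p=0.1, seed=seed),  # small-world
    "BA": nx.barabasi_albert_graph(n, m=2, seed=seed),        # scale-free
    "TREE": nx.balanced_tree(r=3, h=4),                       # deterministic
}

for name, g in graphs.items():
    print(name, g.number_of_nodes(), g.number_of_edges())
```

The balanced tree takes no seed: its size is fixed by the branching factor `r` and height `h`, which is why only a limited number of distinct Tree realisations fit under a given node limit.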
These synthetic networks are referred to as 'graphs' in the remainder of the paper.

Real world infrastructure networks
A suite of real world infrastructure networks has been employed covering sectors including road, rail, air, electricity, gas and rivers (Table 1). Spatial network models (Fig. 3) have been developed using a range of geographic datasets for each sector with a suite of spatial tools (see Barr et al. (2013) for more details) used to generate topologically-valid networks and correct any errors (overshoots and undershoots at intersections). These real world examples are referred to as 'networks' in the remainder of the paper. Seven road networks have been generated using the Ordnance Survey Meridian 2 vector dataset covering Great Britain, with more detailed subsets for cities/regions including Tyne and Wear, Leeds, and Milton Keynes. Versions have been generated with different road classes; one only includes motorways, and one includes motorways, A-roads, B-roads and minor roads. Open Street Map (Open Street Map 2012) has also been used to generate road networks covering Ireland, with two versions: one with motorways and trunk roads (N = 3521), and a second with primary roads added (N = 4444). All road networks have been created with nodes with a degree of two removed and the edges dissolved, giving a representative topology of the network.
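The removal of degree-two nodes with their edges dissolved can be sketched as below. This is a hedged illustration, not the spatial tooling cited above: the function name `dissolve_degree_two` is hypothetical, and the choice to leave a degree-two node in place when its two neighbours are already directly connected (so that no parallel edge is needed in a simple graph) is an assumption.

```python
import networkx as nx


def dissolve_degree_two(graph):
    """Return a copy of `graph` with pass-through (degree-two) nodes removed.

    Each removed node is replaced by a single edge joining its two
    neighbours. Assumption: a node is kept if its neighbours are already
    adjacent, because a simple Graph cannot hold the parallel edge.
    """
    g = graph.copy()
    changed = True
    while changed:
        changed = False
        for node in list(g.nodes):
            if g.degree(node) != 2:
                continue
            neighbours = list(g.neighbors(node))
            if len(neighbours) != 2:      # e.g. a node with a self-loop
                continue
            u, w = neighbours
            if g.has_edge(u, w):
                continue
            g.remove_node(node)
            g.add_edge(u, w)
            changed = True
    return g


road = nx.path_graph(5)             # 0-1-2-3-4: three pass-through nodes
simplified = dissolve_degree_two(road)
print(sorted(simplified.nodes), list(simplified.edges))
```

Junctions (degree three or more) and dead ends (degree one) are untouched, so the simplified graph keeps the connectivity of the original while dropping nodes that only record geometry.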
$$E_{add} = E \times p \qquad (1)$$
Fig. 2 The topology of the eight graph models in the suite of synthetic networks, each with 15 nodes, except the HC model (g) which has 16, shown using a circular layout
Rail networks have been generated for the UK from the Ordnance Survey Meridian 2 vector dataset, with nine networks created. As well as a national network, smaller light rail networks have been created for suburban systems including the Tyne and Wear Metro (N = 60), Manchester Metrolink (N = 65) and those overseen by Transport for London (TfL) as a composite network (N = 399). Open Street Map data has also been used to generate 7 networks for a range of other rail systems outside of Great Britain, including Ireland (N = 201) and light rail in the cities of Boston (USA) (N = 120) and Paris (France). These have been validated using system maps freely available online from the respective operators.
Six air networks have been generated using data available from OpenFlights (Open Flights 2012), as used in previous studies (Wilkinson et al. 2012; Verma et al. 2014). This set of networks includes those for Europe (N = 643), the UK (N = 48), the USA (N = 601) and North America (N = 889) as well as operator networks, for British Airways (N = 198) and EasyJet (N = 125).
Energy networks are also represented, with data from the National Grid for England and Wales used to build three versions of the network with varying levels of detail, from N = 23, 787 to N = 2218. A synthetically-generated distribution network has also been used (ITRC 2013), to create a transmission and distribution network (N = 170,667). The national gas network for Great Britain (N = 1486) is also included.
Finally, a network representation of the JANET network (Jisc 2015), a network providing high speed digital connections for UK academic institutions, has been created (N = 38).

Quantifying network structure
Network structure has traditionally been characterised from a topological perspective using metrics such as the degree distribution, clustering coefficient and the average shortest path length (Watts and Strogatz 1998; Barabasi and Albert 1999; Newman 2003b). These metrics have also been used as a reference for the subsequent development of graph models such as those for graphs with small-world (Watts and Strogatz 1998) and scale-free (Barabasi and Albert 1999) characteristics. Higher-level metrics, those beyond the traditionally-used measures above, are more relevant to the functioning of infrastructure networks as they may tell us more about a network's behaviour and organisation, potentially helping to characterise real world networks with greater realism (Rozenfeld et al. 2005). For example, statistics about the cycles found in a graph are shown by Rozenfeld et al. (2005) to highlight useful properties of networks with regard to their connectivity which are otherwise missed by traditional metrics, such as the degree distribution. The continued development of graph algorithms alongside greater computational resources also allows more complex metrics to be computed over larger graphs than has previously been possible. These advancements allow for the dynamics of a network to be considered, and not just the topological structure, which is key when considering infrastructure networks (Luca et al. 2006). For the purpose of this research, three higher-level metrics, detailed in the following paragraphs, have been chosen due to their applicability for characterising graphs/networks.
Characterising how a network may function, or how its structure allows movement and flows to pass over it, provides an insight to how it may operate in delivering a service. Betweenness centrality, first introduced by Freeman (1978), is a measure of the proportion of all shortest paths which pass through each node, providing details on how well connected the network might be (Barthélemy 2004). This measure also provides an insight into the way flows of information or goods may pass over the network (Crucitti et al. 2006; Barthelemy 2011). Betweenness centrality is defined by Brandes (2001) as:
$$C_B(v) = \sum_{s,t \in V} \frac{\sigma(s,t|v)}{\sigma(s,t)} \qquad (2)$$
where $V$ is the set of nodes, $v$ is the node of interest, $\sigma(s,t|v)$ is the number of shortest paths between $s$ and $t$ which pass through node $v$, and $\sigma(s,t)$ is the number of shortest paths between $s$ and $t$. The value for each node, between zero and one, is an indicator of the importance of the node in the connectivity of the network, and although the distribution of these values across the network is a useful indicator of how central nodes are to a network (Crucitti et al. 2006), the maximum value $G_{\max C_B}$ (Eq. 3) alone provides an insight into the connectivity of the graph.
$$G_{\max C_B} = \max_{v \in V} C_B(v) \qquad (3)$$
Where the maximum value is high, tending close to one, the network can be expected to be reliant on a single node for the network to remain connected, with a high proportion of the shortest paths passing through it. Where the value is close to zero, this suggests that there is a much more even distribution of shortest paths across the network, indicating a better-connected structure.
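The two extremes described above can be illustrated with NetworkX. This is a minimal sketch; the helper name `max_betweenness` and the two toy graphs are assumptions chosen for the example.

```python
import networkx as nx


def max_betweenness(graph):
    """Largest normalised node betweenness centrality in `graph`."""
    centrality = nx.betweenness_centrality(graph, normalized=True)
    return max(centrality.values())


star = nx.star_graph(10)    # one hub joined to ten leaves
ring = nx.cycle_graph(11)   # same node count, no dominant node

print(max_betweenness(star))   # the hub lies on every leaf-to-leaf path
print(max_betweenness(ring))   # shortest paths are spread evenly
```

The star returns a value of one because every shortest path between two leaves passes through the hub, whereas the ring returns a much lower value shared equally by all nodes.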
Both the number and distribution of cycles have been used previously as higher-level metrics to identify structural characteristics of graphs or networks, allowing new insights to be learned (Watts and Strogatz 1998; Rozenfeld et al. 2005; Costa et al. 2007). A cycle is defined as a path formed of a set of edges which starts and finishes at the same node, but does not otherwise visit the same node or edge more than once (Dolan and Aldous 1993; Caldarelli et al. 2004). As cycles indicate a connected set of nodes, the greater the number of cycles the better connected the graph is likely to be. A tree network, an example of a poorly connected graph, has no cycles (Albert and Barabasi 2002). The length and prevalence of cycles can give an indication of the characteristics of a graph, with a greater frequency of short cycles suggesting a better connected graph, while longer cycles suggest a less well connected topological structure (Lind et al. 2005; Klemm and Stadler 2006). In this paper we adopt the measure of cycle basis, the fundamental set of cycles from which all cycles can be formed (Paton 1969; Diestel and Kühn 2004), to characterise the connectedness of the graph accounting for cycles of all lengths. To accommodate the different cardinality of graphs or networks, the number of cycles in the cycle basis ($CB$) is normalised by the number of nodes ($N$) to give a value for comparison across all graphs ($\overline{CB}$), Eq. 4.
$$\overline{CB} = \frac{CB}{N} \qquad (4)$$
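A minimal sketch of the normalised cycle basis count, using NetworkX's `cycle_basis`, is given below. The helper name `cycle_basis_per_node` and the example graphs are assumptions for illustration.

```python
import networkx as nx


def cycle_basis_per_node(graph):
    """Number of cycles in a cycle basis divided by the number of nodes."""
    return len(nx.cycle_basis(graph)) / graph.number_of_nodes()


tree = nx.balanced_tree(r=2, h=3)   # no cycles at all
ring = nx.cycle_graph(5)            # exactly one cycle
dense = nx.complete_graph(5)        # every pair of nodes connected

for name, g in [("tree", tree), ("ring", ring), ("complete", dense)]:
    print(name, cycle_basis_per_node(g))
```

For a connected graph the size of a cycle basis equals the number of edges minus the number of nodes plus one, so the tree returns zero and the value grows as edges are added.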
The final metric used is the assortativity coefficient, a global measure of the similarity of the degree of neighbouring nodes in a graph, characterising how the network is connected at a neighbourhood level (Newman 2003a). This is defined by Newman (2003a) as
$$r = \frac{\sum_{ij} ij\,(e_{ij} - a_i b_j)}{\sigma_a \sigma_b} \qquad (5)$$
where $e_{ij}$ is the fraction of all edges which join nodes with degree $i$ and degree $j$, $a_i$ and $b_j$ are the fractions of edges that start and end at nodes with degree $i$ and $j$ respectively, and $\sigma_a$ and $\sigma_b$ are the standard deviations of the distributions $a_i$ and $b_j$. A returned value, between −1 and 1, which is closer to one indicates an assortative network where nodes are connected to those of a similar degree, indicative of a regular structure, compared to those near negative one which suggest a hub and spoke structure (Newman 2003a). The ability of the metric to distinguish the characteristics of the underlying structure also allows it to be applied as an indicator of network robustness (Newman 2002; Foster et al. 2010).
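The hub and spoke extreme can be checked with NetworkX's `degree_assortativity_coefficient`. This is a small sketch; the example graphs and their parameters are assumptions.

```python
import networkx as nx

# A pure hub and spoke structure: every edge joins the hub to a leaf.
star = nx.star_graph(10)
r_star = nx.degree_assortativity_coefficient(star)

# A balanced tree also joins higher-degree parents to lower-degree children.
tree = nx.balanced_tree(r=3, h=3)
r_tree = nx.degree_assortativity_coefficient(tree)

print(r_star, r_tree)
```

The star sits at the disassortative limit of negative one, and the balanced tree also returns a negative value, consistent with hierarchical structures linking nodes of dissimilar degree.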
The three metrics described above (the maximum betweenness centrality, the number of cycle basis and the assortativity coefficient) give insights into the characteristics of networks from different perspectives, providing measures of how well connected a graph is, how the structure of the graph affects flows over it and how robust the graph may be to perturbations. It is important to select those metrics which are relevant to the aim of this work (Newman 2003b). This combination of higher-level metrics provides more detail on the structure, characteristics and potential behaviour of a network than is possible through some more traditional measures, such as the aforementioned degree distribution and average shortest path length, and thus we use these three metrics alone for our analysis.
Following the computation of the metrics described, the similarity of the distributions of the values for each graph type was statistically tested using the multivariate transformed divergence statistic, which assesses the amount of overlap between the distributions of the values. Singh (1984) defines this as
$$TD_{ij} = 100\left[1 - \exp\left(-\frac{D_{ij}}{8}\right)\right], \quad D_{ij} = \frac{1}{2}\,\mathrm{tr}\left[(C_i - C_j)(C_j^{-1} - C_i^{-1})\right] + \frac{1}{2}\,\mathrm{tr}\left[(C_i^{-1} + C_j^{-1})(u_i - u_j)(u_i - u_j)^T\right] \qquad (6)$$
where $i$ and $j$ are the two data classes, $C_i$ is the covariance matrix of $i$, $u_i$ the mean vector of $i$, $\mathrm{tr}$ the trace function and $T$ the transposition function. Where TD = 100, the distributions are statistically different; where TD = 0, the distributions are statistically identical. Typically, where TD ≥ 85, the distributions are said to be statistically different (Swain and Davis 1978).
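A minimal NumPy sketch of the transformed divergence, in its commonly used form, is given below. The scaling to a 0 to 100 range is assumed from the values quoted in the text, and the function name and synthetic samples are illustrative only.

```python
import numpy as np


def transformed_divergence(x_i, x_j, scale=100.0):
    """Transformed divergence between two samples (rows = observations).

    Assumption: the commonly used form scale * (1 - exp(-D / 8)), with
    scale = 100 so that values run from 0 (identical) to 100 (separate).
    """
    u_i, u_j = x_i.mean(axis=0), x_j.mean(axis=0)
    c_i = np.cov(x_i, rowvar=False)
    c_j = np.cov(x_j, rowvar=False)
    c_i_inv, c_j_inv = np.linalg.inv(c_i), np.linalg.inv(c_j)
    d = (u_i - u_j).reshape(-1, 1)   # column vector of mean differences

    divergence = 0.5 * np.trace((c_i - c_j) @ (c_j_inv - c_i_inv)) \
        + 0.5 * np.trace((c_i_inv + c_j_inv) @ d @ d.T)
    return scale * (1.0 - np.exp(-divergence / 8.0))


rng = np.random.default_rng(0)
a = rng.normal(size=(200, 3))   # three metric values per graph
b = a + 50.0                    # same spread, far-away mean

print(transformed_divergence(a, a))   # identical distributions
print(transformed_divergence(a, b))   # well-separated distributions
```

Each row of the input would hold the three metric values of one graph, so a pair of graph models is compared by passing their two arrays of metric values.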

Modelling network robustness
To further understand the characteristics of the suite of synthetic and infrastructure networks, their robustness to failure is explored to give further insights into their topological structure. Using common iterative failure methods, such as those employed by Albert et al. (2000) and Holme et al. (2002), where at each iteration a node is selected to be removed, the ability of each synthetic graph type to topologically withstand the perturbations can be compared, along with the responses from the infrastructure networks. Three common node selection methodologies are used (Holme et al. 2002; Luca et al. 2006; Lordan et al. 2014): a random node selection method, where at each iteration a node is randomly selected to be removed; a node degree-based method, where the node with the greatest degree (or one with the joint greatest degree) is removed at each iteration; and a betweenness-based method, where the node with the greatest betweenness centrality is removed at each iteration. See Albert et al. (2000), Holme et al. (2002) or Luca et al. (2006) for more information. For the latter two methodologies the values are recalculated at each iteration to ensure the greatest impact (Luca et al. 2006), and in all three cases the graphs/networks are perturbed until E = 0.
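The three node selection methods can be sketched as a single loop, shown below. This is an illustrative sketch under stated assumptions rather than the code used for the study: the function name `simulate_failure` is hypothetical, and the stopping condition treats E = 0 as the point at which no edges remain.

```python
import random

import networkx as nx


def simulate_failure(graph, strategy="random", seed=None):
    """Remove one node per iteration and track fragmentation.

    Returns the number of connected components after each removal.
    Assumption: the run stops once no edges remain (one reading of E = 0).
    """
    rng = random.Random(seed)
    g = graph.copy()
    components = []
    while g.number_of_edges() > 0:
        if strategy == "random":
            candidates = list(g.nodes)
        else:
            # Scores are recalculated on the current graph every iteration.
            if strategy == "degree":
                score = dict(g.degree())
            elif strategy == "betweenness":
                score = nx.betweenness_centrality(g)
            else:
                raise ValueError(f"unknown strategy: {strategy}")
            top = max(score.values())
            # Ties for the top score are broken at random.
            candidates = [n for n, s in score.items() if s == top]
        g.remove_node(rng.choice(candidates))
        components.append(nx.number_connected_components(g))
    return components


star = nx.star_graph(5)
print(simulate_failure(star, "degree"))   # hub removed first -> [5]
```

Repeating the call with different seeds and averaging the returned series gives an ensemble of the kind used to account for the random choices in each method.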
To explore the characteristics of both the graphs and networks, an ensemble analysis is performed in which five simulations of each failure methodology are run for every graph or network to account for the random component in the failure models: the random selection of a node in the random model, or the random selection among joint highest-rated nodes in the other two models. A total of fifteen simulations are therefore run for each graph or network, with the performance recorded using graph metrics and averaged across the five simulations of each methodology.
The failure of the graph or network is measured with respect to how quickly it becomes disconnected and fragments into subgraphs or communities of nodes. Therefore, the more robust a graph is, the fewer subgraphs would be expected to form. Other measures could be considered, such as the size of the giant component (the largest connected subgraph) (Holme et al. 2002), as an alternative measure of the effect perturbations have on a graph/network. This measure, however, only provides an indicator as to how large the largest component of the network remains, and does not give an indication as to the state of the remaining parts of the network. In an infrastructure context, any connected components outside the giant component could still provide a service to users and thus should not be ignored entirely in an analysis of the resilience of a network. Where there is a high number of subgraphs, it suggests the network has fragmented into many small subgraphs, which may remain of use to geographical areas local to such parts of the graph or network.
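The difference between the two measures can be seen on the seven-node tree. The sketch below is illustrative only, and the helper name `fragmentation_state` is an assumption.

```python
import networkx as nx


def fragmentation_state(graph):
    """Return (number of components, size of the giant component)."""
    parts = list(nx.connected_components(graph))
    return len(parts), max(len(part) for part in parts)


tree = nx.balanced_tree(r=2, h=2)   # seven-node tree, root is node 0
tree.remove_node(0)                 # remove the top-most node

print(fragmentation_state(tree))    # (2, 3)
```

The giant component size of three describes only half of the surviving nodes, whereas the component count of two records that a second, equally sized subgraph remains.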

Graph metric comparisons
The three graph metrics described earlier were calculated for each of the graphs and networks in the study. The results show that there is a distinguishable difference in metric values between the synthetic hierarchical and non-hierarchical graph models (Fig. 4). This was most apparent when the distributions of the assortativity coefficient (AC) were plotted against the maximum betweenness centrality (MBC). Here the single standard deviation ellipses for the distribution of the metric values of each graph model indicate a clear separation between the hierarchical and non-hierarchical models (Fig. 4a). This is also shown statistically, where there is no statistical similarity in the distributions of the AC/MBC metric values for each pair-wise combination of graph models (Table 2). The same pattern and statistical separation between the non-hierarchical and hierarchical models is also shown through the distributions for the other two metric combinations, though with less clarity.
With the AC-MBC distributions the hierarchical models, AC ≤ −0.17 and MBC ≥ 0.25, are distinct from the non-hierarchical, −0.04 ≤ AC ≤ 0.0 and 0.01 ≤ MBC ≤ 0.08, with the TREE and HC models shown to be especially disparate in structure, MBC ≥ 0.5. This relationship can be explained by the number of cycle basis per node (CB) present in the different networks, with cycles tending to be more prevalent in non-hierarchical graphs, CB > 7, than hierarchical graphs, 0 ≤ CB < 1, where a low CB count suggests a greater MBC value (Fig. 4c). Due to the lower number of CB, there are fewer connections and thus a greater dependency on those nodes with connections, generating a greater MBC value. This can be seen with the values returned for the TREE and HC models where CB < 2 and MBC = 0.765. The effect of the extra edges being added to the TREE graph for the HR and HR+ models reduces the CB value, CB < 1, and therefore the resultant MBC value.
The statistical relationship between the hierarchical and non-hierarchical model groups is disparate (Table 2), with the pair-wise relationships between these two groups all returning values over 97, when a value above 75 indicates no similarity in the distributions. A small number of the relationships between graph models, such as that between the hierarchical HR and HR+ models, show similarity for all three metric distributions (39.84, 27.34, 6.53, indicated by † in Table 2), from which we can infer a degree of similarity in the characteristics of the models. This relationship was expected as both models have their origins in the TREE model and the generation algorithms vary only slightly. The other greatest similarity between graph models is found for the non-hierarchical BA-WS models, returning a set of values (14.13, 43.86, 43.98) indicating a clear similarity when considering the MBC-AC metric distribution, but less so for the alternative two distributions. The two models, one with a small-world (WS) structure and the other scale-free (BA), are regarded as generating graphs with different characteristics; however, these results suggest a degree of similarity when using the MBC and AC metrics.
For the same metrics, the suite of infrastructure networks has also been analysed and plotted against the single standard deviation ellipses of the graph models for context and comparison (Fig. 4). For all three metric distributions, the infrastructure networks exhibit a tendency to lie in or near the standard deviation ellipse for the hierarchical graphs. Of the infrastructure networks examined, the river networks exhibit values closest to the graphs with a hierarchical organisation, with the rail networks also showing a likeness to the hierarchical models, especially for the AC-CB (Fig. 4b) and MBC-CB (Fig. 4c) distributions. The energy networks return values throughout the three distributions between those expected of a tree network and those of a random network. The values, however, are more suggestive of a hierarchical structure, being most similar to the HR and HR+ models. Similarly, the road networks exhibit metric characteristics which are most like those of the HR and HR+ models, again suggesting they have a hierarchical structure. The air networks, conversely, exhibit a different set of characteristics, with the AC-MBC distribution being similar to the hierarchical graphs, while appearing as outliers for the AC-CB and MBC-CB distributions and not lying close to any of the graph models or other infrastructure networks analysed. This could be a result of these networks not being constrained by geography, with no limit on the number of connections a single node can have, resulting in highly disproportionate node degrees in comparison to other networks which are not captured by the employed models.

Graph failure behaviour
The greater robustness of the non-hierarchical graph models when compared to the hierarchical graph models is shown in Fig. 5a, where nodes are removed at random sequentially (one after the other) (see Sect. 2.4 for method details). The non-hierarchical models fail following the removal, on average, of 94% of the nodes, whereas the mean for the hierarchical models is 76%, failing 19% quicker. The results from both targeted failure mechanisms (node degree and node betweenness centrality) show they have a similar impact (Fig. 5b, c), with the hierarchical models in both scenarios failing on average after 54% of nodes have been removed and the non-hierarchical models once 80% of nodes have been removed. Through all failure methods the HC graph model exhibits a different behaviour to the other hierarchical models, failing 18% slower for the random strategy and 48% slower for the targeted strategies. This greater robustness to perturbations results in the HC graph model appearing just as robust as the non-hierarchical models, suggesting it possesses a hierarchical structure yet behaves more like the non-hierarchical models.
Fig. 5 Percentage of nodes removed from the realisations of the eight graph models for the random (a), degree (b) and betweenness (c) failure methods for the networks to become disconnected
The behaviour of the graphs generated by each graph model when exposed to the three failure models (Figs. 6, 7 and 8) indicates that the non-hierarchical models (plots (a), (b), (c) and (d)) fragment into multiple components (y-axis) much later than the hierarchical models. Failure for non-hierarchical models generally does not occur until around 50% of the nodes have been removed (where the x-axis shows the % of nodes removed, from 0 to 100), irrespective of the failure model. The random models (plots (a) and (b)) behave similarly under all failure models with regard to when they start to fragment, though the number of components they then fragment into is much higher for the two targeted methods, degree and betweenness (Figs. 7 and 8), increasing from 20 and 40 to ~140 and ~210 for the degree and betweenness failure methods respectively. The behaviour of the other two non-hierarchical models, WS and BA (plots (c) and (d)), shows a more pronounced change, with the graphs not only starting to fragment earlier under the targeted methods, but also fragmenting into more components, from a maximum around 140 for the random failure model to 270 and 350 respectively for the two targeted models.
Fig. 6 The response of the eight graph models to the random node selection failure method. Each response is plotted with the y-axis showing the number of components and the x-axis showing the % of nodes removed from the graph, and thus the most robust graphs would show a small number of components, with any increase shown to the right of plots as x tends away from 0. Plots (a)-(d) show a clear increase in the number of components to the right of the plots, suggesting a robust response, whereas plots (e)-(h) show peaks much closer to the left and centre of the plots, suggesting a less robust response
It is clear that the hierarchical models, plots (e), (f), (g) and (h), are more vulnerable to failures than the non-hierarchical models, as indicated by the number of components increasing much earlier when exposed to any of the three failure models. The tree model (plot (h)) starts to fragment after < 5% of nodes have been removed, with the peak in the number of components appearing after ~40% and ~10% of nodes have been removed for the random and targeted failure models respectively. The tree model graphs also fail more quickly than those of any other model analysed, reaching the point where no components remain in the graphs at an earlier stage. The hierarchical HR/HR+ models exhibit a similar behaviour, also fragmenting as soon as a few nodes are removed, though they are more robust to the failure models, with the peak in the number of components occurring later (45-55% and 15-25% for the random and targeted failure models respectively) and the graphs retaining some components for longer.
Fig. 7 The response of the eight graph models to the degree based node selection failure model. Each response is plotted with the y-axis showing the number of components and the x-axis showing the % of nodes removed from the graph, and thus the most robust graphs would show a small number of components, with any increase shown to the right of the plots as x tends away from 0. Plots (a)-(d) show a clear increase in the number of components to the right of the plots, suggesting a robust response, with a small number of exceptions in (c) and (d) where the targeted analysis results in a small number of the 1000 graphs failing much more quickly. Plots (e), (f) and (h) show peaks much closer to the left of the plots, suggesting a less robust response, and plot (g) shows a unique response pattern, suggesting a lower robustness than in (a)-(d), but a greater robustness than shown in (e), (f) and (h).
Of significance is the behaviour exhibited by the HC graph model (plot (g)), which is very different from that of the other hierarchical models. For the random failure model the HC graphs fragment differently, with all but one simulation showing the model to be more robust, with fewer components forming and no large increase as seen in the other models. For the targeted failure models, however, the graphs instead fragment immediately into a large number of components, but then, unlike all the other models, they do not fail quickly and instead appear to be robust to further failures, which do not cause further fragmentation of the graphs. After reaching a peak in the number of components, the HC graphs do not fail until at least a further 50% of the nodes have been removed, whereas in the other hierarchical models this value is much closer to 20%.
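The node-removal experiments described above can be illustrated with a minimal sketch. It is not the authors' implementation: the graph representation (a dictionary of neighbour sets), the function names, the recomputation of degree after every removal and the two toy graphs (`star`, `ring`) are all assumptions made for illustration, and only the random and degree-based selection methods are shown.

```python
import random
from collections import deque


def count_components(adj):
    """Count connected components of an undirected graph {node: set(neighbours)}."""
    seen = set()
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return components


def remove_node(adj, node):
    """Delete a node and all of its edges in place."""
    for nbr in adj.pop(node):
        adj[nbr].discard(node)


def failure_curve(adj, method="random", seed=0):
    """Remove nodes one at a time; return the component count after each removal.

    method="random" picks a uniformly random surviving node (assumed scheme);
    method="degree" picks the surviving node with the highest current degree.
    """
    rng = random.Random(seed)
    adj = {n: set(nbrs) for n, nbrs in adj.items()}  # work on a copy
    curve = []
    while adj:
        if method == "random":
            target = rng.choice(sorted(adj))
        elif method == "degree":
            target = max(sorted(adj), key=lambda n: len(adj[n]))
        else:
            raise ValueError(f"unknown method: {method}")
        remove_node(adj, target)
        curve.append(count_components(adj))
    return curve


def star(n_leaves):
    """Toy hierarchical graph: one hub (node 0) connected to n_leaves leaves."""
    adj = {0: set(range(1, n_leaves + 1))}
    for leaf in range(1, n_leaves + 1):
        adj[leaf] = {0}
    return adj


def ring(n):
    """Toy non-hierarchical graph: a single cycle of n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


star_curve = failure_curve(star(9), method="degree")
ring_curve = failure_curve(ring(10), method="degree")
print("star:", star_curve)
print("ring:", ring_curve)
```

In this toy case the star graph shatters into isolated leaves as soon as its hub is removed, whereas the ring stays in one piece after the first removal, which mirrors the early versus late peaks in the number of components discussed above.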

Infrastructure network failure
The infrastructure networks are more robust to the random perturbation method (Fig. 9a) than to the targeted methods, node degree and node betweenness centrality (Fig. 9b, c). Across all infrastructure sectors, the networks fail once 63-78% of nodes have been removed for the random method, compared to 38-62% for the targeted methods. For the random method there is limited variation in robustness between the different networks, with the river networks being the least robust, failing after 63% of nodes have been removed, and the mean across the 8 infrastructure sets being 71.1%. In contrast, the air networks were the least robust when the targeted methods were applied, failing after 38% of nodes had been removed, with the next worst performing being the river networks, which failed after an average of 49% of nodes had been removed. The most robust infrastructure was the road networks (national and regional), returning a value of 57.6%, similar to the rail networks (national and regional) at 55.4%.
Fig. 8 The response of the eight graph models to the betweenness based node selection failure model. Each response is plotted with the y-axis showing the number of components and the x-axis showing the % of nodes removed from the graph, and thus the most robust graphs would show a small number of components, with any increase shown to the right of the plots as x tends away from 0. Plots (e), (f) and (h) show a lack of robustness to the node failures, with peaks in component numbers to the very left of the plots, whereas the peaks are much further to the right in plots (a)-(d), indicating a slower rate of failure/fragmentation of the graphs. Plot (g) shows a unique behaviour of immediate fragmentation, but then appears to show some resilience to further node failures.
As with the graph models, we can also analyse not just how quickly the infrastructure networks failed, but also how their structure, with regard to the number of components, changed as the networks were perturbed. The behaviour of each infrastructure network is shown across the three failure models (Figs. 10, 11, 12), once again highlighting a greater vulnerability to the targeted failure models than to the random failure model. Most of the infrastructure networks exhibit a peak in the number of components at around 30.3-49.0% for the random failure model (Fig. 10), though the air networks (plot (a)) exhibit a peak much earlier, after 17.0% of nodes have been removed. These same networks, however, then remain connected and do not completely fail until 79.6% of nodes have been removed, whereas the other infrastructure networks fail after 57.6-76.3% of nodes have been removed. This earlier peak in the number of components is also found for the targeted failure methods, degree (Fig. 11) and betweenness (Fig. 12), suggesting a greater vulnerability to failures in the air networks. The ability of the air networks to avoid complete failure, however, and to retain some connected components until a greater proportion of nodes have been removed than in the other infrastructure networks, indicates a similar behaviour (and hence structure) to the HC graph model. The network fragments into communities, but these are then individually more resilient to failures than the network as a whole.
Fig. 9 Average percentage of nodes removed from infrastructure networks before failing during the random (a), degree (b) and betweenness (c) failure methods.
All infrastructure networks, with the exception of the air networks, exhibit a behaviour closer to the hierarchical models, and in particular, the HR+ model. For the random failure model the peak number of components across the air networks occurs on average when 39.8% of nodes have been removed, compared to 46.9% for the HR+ and 84.7% for the WS/BA models. With the targeted failure models this peak shifts to the left, occurring after 26.0% of nodes have been removed, compared to 21.5% and 63.5% for the HR+ and BA/WS models respectively. This much greater similarity to the hierarchical graph models reflects the results given by the metric approach, highlighting once again the presence of a hierarchical structure in infrastructure networks.
Fig. 10 Failure behaviour of the infrastructure networks under the random failure method. The y-axis shows the number of components and the x-axis shows the % of nodes removed from the graph, with a more robust network showing a smaller peak and towards the left of the plot.
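The peak positions compared in the text above can be read off a component-count trajectory in a few lines. The sketch below assumes a trajectory stored as a list in which entry i is the number of components after i + 1 nodes have been removed; the function names, the tie-breaking rule and the two example trajectories are illustrative assumptions and are not data from the study.

```python
def peak_location(curve):
    """Return (percent_removed_at_peak, peak_value) for a component-count curve.

    curve[i] is the number of components after i + 1 nodes have been removed.
    Ties are resolved in favour of the earliest peak (an assumed convention).
    """
    if not curve:
        raise ValueError("curve must not be empty")
    peak_value = max(curve)
    peak_index = curve.index(peak_value)
    return 100.0 * (peak_index + 1) / len(curve), peak_value


def mean_peak_location(curves):
    """Average peak position (in % of nodes removed) over several curves."""
    positions = [peak_location(c)[0] for c in curves]
    return sum(positions) / len(positions)


# Invented trajectories for two 10-node graphs, for illustration only.
early_peak = [1, 5, 4, 3, 2, 2, 1, 1, 1, 0]
late_peak = [1, 1, 1, 1, 1, 2, 3, 5, 2, 0]

print(peak_location(early_peak))
print(peak_location(late_peak))
print(mean_peak_location([early_peak, late_peak]))
```

Averaging the peak position over a suite of networks, as in `mean_peak_location`, gives a single figure per model or sector that can be compared in the way the percentages above are compared.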

Discussion
The results have clearly indicated that hierarchical networks are distinct from non-hierarchical networks, both with regard to their structural characteristics, as observed through the assortativity coefficient, maximum betweenness centrality and the number of cycles in the cycle basis, and through their response to perturbations. Statistically the hierarchical networks are different, with this being centred around a reliance on a single critical node, or a small set of critical nodes, as denoted by the greater values for the maximum betweenness centrality metric, as well as a smaller cycle basis. These values, along with the assortativity coefficient, distinguish between the networks generated by the four non-hierarchical models and the four hierarchical models. It is also apparent, however, that those networks which are hierarchically structured are also more vulnerable to perturbations, with the hierarchical models failing (on average) 19% and 33% faster for the random and targeted methods respectively, the latter appearing to exacerbate the weakness of the hierarchical structure. This vulnerability stems from a reliance on a single node, or a small subset of nodes, as in the case of the scale-free and small-world networks when compared to the random networks (Albert and Barabasi 2002), which makes the networks vulnerable to the removal of those nodes. This is indicated by the greater maximum betweenness centrality in the hierarchical models, suggesting a significant reliance on a single node for the majority of shortest paths between all nodes in the network. In addition, the small cycle basis reduces the potential for alternative paths/connections across the network once key nodes are removed, causing the network to fragment quickly.
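Two of the metrics discussed above can be made concrete on toy graphs. The sketch below is illustrative only: it computes unnormalised betweenness centrality with Brandes' algorithm for unweighted graphs and takes the size of the cycle basis as E - N + C (edges minus nodes plus connected components). The helper names, the normalisation by the number of node pairs and the two example graphs (a star and a 5-node ring) are assumptions, not the graph models or settings used in the study.

```python
from collections import deque


def count_components(adj):
    """Count connected components of an undirected graph {node: set(neighbours)}."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return components


def betweenness(adj):
    """Unnormalised node betweenness (Brandes' algorithm, unweighted, undirected)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # shortest-path predecessors
        sigma = {v: 0 for v in adj}     # number of shortest paths from source
        dist = {v: -1 for v in adj}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted once from each end.
    return {v: s / 2.0 for v, s in score.items()}


def max_betweenness_fraction(adj):
    """Largest betweenness as a fraction of all node pairs excluding that node."""
    n = len(adj)
    return max(betweenness(adj).values()) / ((n - 1) * (n - 2) / 2.0)


def cycle_basis_size(adj):
    """Number of independent cycles: E - N + C."""
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return edges - len(adj) + count_components(adj)


star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}      # hub and four leaves
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}      # 5-node cycle

print(max_betweenness_fraction(star), cycle_basis_size(star))
print(max_betweenness_fraction(ring), cycle_basis_size(ring))
```

In the star every shortest path between leaves passes through the hub and there are no cycles, so no alternative route exists once the hub is removed; the ring spreads shortest paths evenly and has one independent cycle.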
Significantly, the hierarchical characteristics have been found to be prominent in the suite of 42 infrastructure networks, which exhibit values more closely related to the hierarchical models than to the non-hierarchical models. The infrastructure networks also behave, when perturbed, more similarly to the hierarchical graph models than to the non-hierarchical models, further confirming the greater similarity to the hierarchical models. Some of the infrastructure networks, such as the rivers and the air networks, exhibit a greater similarity to the hierarchical networks, including a lack of robustness to perturbations. This is caused by a reliance on key nodes, such as hub nodes in the case of air networks, which makes these networks extremely vulnerable to the failure of the most connected nodes, though due to a degree of redundancy, as shown by the size of the cycle basis, the networks are less vulnerable to random node failures.
Fig. 12 Failure behaviour of the infrastructure networks under the betweenness failure method. The y-axis shows the number of components and the x-axis shows the % of nodes removed from the graph, with a more robust network showing a smaller peak and towards the left of the plot.
The lack of robustness exhibited across the suite of infrastructure networks clearly indicates a need to improve how infrastructure networks respond to perturbations. Of the suite of graph models analysed, the HC model, one of the four hierarchical models, exhibited a unique response to failures, being more robust than the other hierarchical models. The model has a community-based structure, where communities of nodes are linked hierarchically to form a hierarchical network, but with well-connected communities of nodes embedded within. As a result, when perturbed, the communities within the network become disconnected from each other, though remain connected within themselves. For infrastructure networks that are not dependent on a single core network, this fragmentation behaviour could provide a beneficial structure, allowing them to continue to function at a more local scale. This possible alternative hierarchical structure may not be suitable for all infrastructure systems, but raises the possibility of networks retaining a hierarchical structure at the same time as improving their robustness to perturbations.
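The fragmentation behaviour described above can be sketched on a toy example. The construction below is hypothetical and is not the HC model itself: `community_hierarchy` links fully connected communities to a single root through one gateway node each, and `two_level_tree` is a plain tree of the same size for comparison. All names and sizes are assumptions made for illustration.

```python
from collections import deque


def component_sizes(adj):
    """Sizes of the connected components, largest first."""
    seen, sizes = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        sizes.append(size)
    return sorted(sizes, reverse=True)


def without(adj, removed):
    """Copy of the graph with the given nodes (and their edges) removed."""
    removed = set(removed)
    return {n: nbrs - removed for n, nbrs in adj.items() if n not in removed}


def community_hierarchy(n_communities, community_size):
    """Toy graph: a root linked to one gateway node in each fully connected community."""
    adj = {"root": set()}
    for c in range(n_communities):
        members = [(c, i) for i in range(community_size)]
        for m in members:
            adj[m] = {other for other in members if other != m}
        gateway = members[0]
        adj[gateway].add("root")
        adj["root"].add(gateway)
    return adj


def two_level_tree(n_branches, leaves_per_branch):
    """Toy tree: a root, one hub per branch, and leaves attached to each hub."""
    adj = {"root": set()}
    for b in range(n_branches):
        hub = (b, 0)
        adj[hub] = {"root"}
        adj["root"].add(hub)
        for i in range(1, leaves_per_branch + 1):
            leaf = (b, i)
            adj[leaf] = {hub}
            adj[hub].add(leaf)
    return adj


failed = ["root", (0, 0)]  # the root plus one gateway/hub node
hc_like = component_sizes(without(community_hierarchy(4, 5), failed))
tree = component_sizes(without(two_level_tree(4, 4), failed))
print("community-based:", hc_like)
print("tree:", tree)
```

Both graphs have 21 nodes and both split when the root fails, but the community-based graph keeps each community internally connected even after a gateway is lost, whereas the tree branch that loses its hub breaks into isolated nodes.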
Understanding the properties which influence the robustness of infrastructure networks can help infrastructure planners and managers assess their own networks to determine where interventions may be needed, in the form of a small number of additional links for example that could significantly improve robustness to failures. As the resilience of real world networks becomes even more important under a changing climate, the use of the graph metrics presented here could become a useful tool to analyse such networks and enable the most efficient use of resources.
Further work is required to identify additional fundamental characteristics of hierarchical complex networks and their response to a spectrum of perturbation scenarios. This study has examined some of the characteristics that may help to identify and understand hierarchical networks and their behaviours. More work is required, however, to explore the extent of the role a hierarchical structure has in the response behaviour of both graphs and critical spatial infrastructure networks compared to other structural properties which may also have an effect, such as the community and modular structures explored by some studies (Ash and Newth 2007; Shai et al. 2015) that may influence how a network responds when perturbed. The ability of networks to be robust to failures is critical, though within an infrastructure context it must be recognised that networks do not always function in isolation, but instead are connected to one another, relying on each other to function and so creating dependencies and interdependencies (Rinaldi et al. 2001; Buldyrev et al. 2010; Gao et al. 2011; Reis et al. 2014; Goldbeck et al. 2019). Such links, for example the reliance of railways on electricity for power, and the reverse reliance of the electricity network on railways for the delivery of raw materials, create a new dimension when considering the robustness of networks. Interdependencies have been shown to impact the robustness of synthetic hierarchical networks negatively (Shekhtman and Havlin 2018), and therefore further work is required to understand in more detail the implications of such results for complex hierarchical spatial infrastructure networks, as identified in this study.

Conclusions
The characteristics of hierarchically organised networks have been explored, identifying the key measures which can be used to begin to distinguish between hierarchical and non-hierarchical networks. This has led to the recognition that many real world spatial critical infrastructure networks are hierarchically structured, and as a result exhibit a weak robustness to perturbations, especially those which affect critical nodes within the network (e.g. hub nodes). More significantly, critical infrastructure networks appear to share a greater similarity with the hierarchical models than with the non-hierarchical models, including the inherent lack of robustness to perturbations. This study has also shown, however, that within those hierarchies where a community structure is present, such as in the HC model, there is a resilience whereby these communities are more robust than the network as a whole, allowing some degree of connectivity and potential functionality even when the communities become disconnected from each other through the failure of the hierarchical structure. This improved understanding highlights a significant weakness in the infrastructure networks which we rely upon, while also offering insights into the characteristics which would enable such infrastructure systems to be more robust to node failures.
This research has used a number of novel methods, including the use of both large suites of graphs and real world spatial infrastructure networks in the characterisation of hierarchies and the consequent impact on robustness to perturbations. Building further on this, future work will explore the ability to adapt existing infrastructure networks to improve their ability to withstand perturbations by adopting some of the characteristics found within more robust network models such as the HC model. Further analysis will also explore the robustness of infrastructure networks to explicit geographical hazards, and how the hierarchical structure of such networks affects their ability to continue to function.