 Research
 Open access
 Published:
Robustness of preferentialattachment graphs
Applied Network Science volume 8, Article number: 32 (2023)
Abstract
The widely used characterization of scalefree networks as “robustyetfragile” originates primarily from experiments on instances generated by preferential attachment. According to this characterization, scalefree networks are more robust against random failures but more fragile against targeted attacks when compared to random networks of the same size. Here, we consider a more appropriate baseline by requiring that the random networks match not only the size but also the inherent minimum degree of preferentialattachment networks they are compared with. Under this more equitable condition, we can (1) prove that random networks are almost surely robust against any vertex removal strategy and (2) show through extensive experiments that scalefree networks generated by preferential attachment are not particularly robust against random failures. Finally, we (3) add experiments demonstrating that preferentially attaching to wellconnected vertices does not enhance robustness at all.
Introduction
Although preferential attachment models can generate only a vanishing fraction of all scalefree networks (Petersen et al. 2016), many claims about the class of scalefree networks arise from experiments on preferentialattachment instances. One such claim, originating from Albert et al. (2000), is that scalefree networks have a “robustyetfragile” nature, i.e., compared to random networks of the same size, they are more robust against random failures, where vertices are removed uniformly at random, but more fragile against targeted attacks, where vertices with the highest initial degree are removed first (Doyle et al. 2005).
The common models for preferential attachment, including the one used in the experiments of Albert et al. (2000), yield instances with a constant average degree and a minimum degree of half that value. Because of their constant average degree, however, random networks of the same size are highly likely to contain isolated vertices (Erdős and Rényi 1959). As a first contribution, we prove that the robustness and connectivity of such random networks change significantly if they are required to have a minimum degree of at least k for any constant \(k\ge 3\). With this in mind, it seems more appropriate and natural to compare the robustness of preferentialattachment instances with random networks of the same size and the same minimum degree. To the best of our knowledge, this has not yet been done. We fill this gap with our second contribution through an extensive suite of experiments. Our experiments show, in the affirmative, that scalefree instances generated by preferential attachment are consistently more fragile than sizematching random networks whose minimum degree is at least as large. To put it more bluntly: We find that in an equitable setting, the networks generated by preferential attachment do not exhibit a “robustyetfragile” nature.
As our last contribution, we carry out experiments, which demonstrate that a preference for attachment to wellconnected others does not enhance robustness in any way. Our contributions can be formally summarized as follows:

1
For any constant \(k\ge 3\), almost all graphs with a constant average degree and a minimum degree of at least k are connected and provably robust.

2
Scalefree networks generated by preferential attachment are more fragile than sizematching random graphs whose minimum degree is at least as large.

3
Preferential attachment leads to lower robustness than random attachment.
Preliminaries
We use \(\mathbb {N}\) to denote the set of positive integers. We use \(\lfloor x\rceil\) to denote the integer closest to x, breaking the ties in favor of higher values. More precisely, \(\lfloor x \rceil = \lceil x\rceil\) if \(x\lfloor x\rfloor \ge 0.5\) and \(\lfloor x \rceil =\lfloor x \rfloor\) otherwise. When we say that a statement holds for large enough \(n\in \mathbb {N}\), there exists a constant \(n_0\in \mathbb {N}\) such that the statement holds for all n that are larger than \(n_0\). We say that a sequence of events \(\mathcal {A}_n\) holds almost surely if \(\lim _{n\rightarrow \infty }\textsf{Pr}[\mathcal {A}_n]=1\).
Graphs and degree sequences
In this paper, we consider only simple undirected graphs and use the terms graph and network interchangeably. A graph \(G=(V,E)\) consists of a set of vertices V and a set of edges \(E\subseteq \left( {\begin{array}{c}V\\ 2\end{array}}\right)\). If \(\{u, w\}\in E\), then u and w are said to be adjacent. A graph is called complete if each vertex is adjacent to all other vertices. The degree, \(\deg _G(v)\), of a vertex v is the number of vertices in G adjacent to v. If \(v_1,\cdots ,v_n\) is an ordering of V where \(\deg (v_1)\ge \dots \ge \deg (v_n)\), then \(D(G)=\left( \deg (v_1),\ldots ,\deg (v_n)\right)\) is the degree sequence of G. An integer sequence D is called graphical if there is a simple undirected graph G with \(D=D(G)\). We use \(\Delta (G)\) and \(\delta (G)\) to denote, respectively, the maximum and minimum vertex degree in G. The graph G is called kregular if \(\delta (G)=\Delta (G)=k\).
The subgraph of a graph \(G=(V,E)\) induced by \(V'\subseteq V\) is \(G[V']=(V',E')\), where \(E'=\left\{ \{u,w\}\in Eu,w\in V'\right\}\). Given a positive integer k, the kcore of a graph G is the inclusionmaximal induced subgraph, where all vertex degrees are at least k. The kcore of a graph is unique and can be determined efficiently (Batagelj and Zaveršnik 2011).
The reachability relation is defined as the reflexive and transitive closure of the adjacency relation. The connected components of a graph are its subgraphs induced by the equivalence classes of the reachability relation. A graph is called connected if it consists of a single connected component. The largest connected component, or LCC for short, is the one with the largest number of vertices.
Network robustness
The extent of invariance of a network structural property when elements of the network are removed is referred to as the robustness of that network (Klau and Weiskircher 2005). We focus only on the removal of vertices and consider the number of vertices in the largest connected component as the structural property of interest.
Given a connected graph \(G=(V,E)\) and the sequence of vertices \(B=(b_1, b_2, \dots ,b_T)\) in order of their removal, we can quantify the robustness of G by
This robustness score originally proposed in Hasheminezhad and Brandes (2022) is a generalization of the score used in Schneider et al. (2011), where in the latter score, B is a permutation of V. Note that the above score captures the relative size of the largest connected component and the rate at which it shrinks when the vertices are removed. The most commonly considered vertex removal strategies in the literature are random failures and targeted attacks. In random failures, the vertices are removed uniformly at random. In targeted attacks, the vertices with the highest initial degree are removed first. If the vertex removal strategy is clear from the context, and we accept random variation in vertex selection due to the tiebreaking rules, we can parameterize the robustness score by the fraction \(\beta\) of removed vertices rather than by the precise sequence.
Note that the complete graph has the highest robustness among all nvertex connected graphs, as it remains connected through any vertex removal process. For such a graph, \(R_G(B)\) is given by \(1\frac{B+1}{2n}\), which simplifies to \(1\beta /2+o(1)\) if the fraction of removed vertices is \(\beta\), i.e., when \(B=\beta n\). It follows that \(R_G(\beta )\) is upperbounded by \(1\beta /2+o(1)\) for any connected graph G and any vertex removal strategy (Hasheminezhad et al. 2020).
The (vertex) isoperimetric number h(G) and the conductance \(\Phi (G)\) are invariants closely related to the robustness of a graph \(G=(V, E)\). The former is defined as \(h(G)=\min _{\emptyset \ne S\subset V, S\le \frac{V}{2}}\left\{ \frac{\partial S}{S}\right\}\) where \(\partial S\) is the subset of vertices in \(V{\setminus } S\) that are adjacent to at least one vertex in S. Similarly, \(\Phi (G)\) is defined as \(\Phi (G)=\min _{\emptyset \ne S\subset V, S\le \frac{V}{2}}\left\{ \frac{E(S,V{{\setminus }} S)}{\textrm{vol}(S)}\right\}\) where \(\textrm{vol}(S)=\sum _{v\in S}\deg _G(v)\) and \(E(S,V{{\setminus }} S)\) is the subset of edges in E with one endpoint in S and the other in \(V{\setminus } S\). It is well known that \(h(G)\ge \frac{\delta (G)}{\Delta (G)}\Phi (G)\) (see FACT A.1. in Giakkoupis and Sauerwald 2012 for a brief proof of this).
Network models
The set of simple graphs with n vertices and m edges is denoted by G(n, m), and G(n, m, k) is the subset of graphs in G(n, m) that have a minimum degree of at least k. The models \(\textsf {G}(n,m)\) and \(\textsf {G}(n,m,k)\) consist of the uniform distribution on G(n, m) and G(n, m, k), respectively.
Scalefree networks are those networks in which the fraction of vertices with degree k is roughly proportional to \(k^{\gamma }\) for some \(\gamma > 1\). Since its popularization by Barabási and Albert (1999), preferential attachment has been the most widely used mechanism for generating scalefree networks. Although there are several instantiations of the same general idea, here we adhere to the approach used in the original robustness experiments of Albert et al. (2000). This generative preferentialattachment model \(\textsf {PA}(n,k)\) starts from a complete graph with \(2k+1\) vertices and successively adds \(n(2k+1)\) vertices. Each newly added vertex is made adjacent to k distinct vertices, drawn without replacement from the pool of existing vertices and with a probability that is proportional to their current degree. All graphs generated in this way are connected, have a minimum degree of k, \(m=kn\) edges (an average degree of 2k), and are thus elements of G(n, nk, k).
To assess their relative robustness, preferential attachment graphs are often pitted against random sizematching graphs drawn from \(\textsf {G}(n,m)\). If the number of edges is bounded linearly and away from one, i.e., \(m=kn\) for some constant \(k>1\), random graphs are unlikely to be connected, but they almost surely have a unique giant component (Erdős and Rényi 1959; Molloy and Reed 1998). This is also acknowledged in the robustness experiments of Albert et al. (2000), where rejection sampling is used to find a graph with a large enough largest connected component. So, while size and connectivity are largely kept constant in experiments on network robustness, the minimumdegree property of preferentialattachment graphs has not been considered. In “Theory” section, we show that this can be expected to have a major influence on what can reasonably be considered robust.
Network generation
We construct \(\textsf {PA}(n,k)\) and \(\textsf {G}(n,m)\) graphs using lineartime algorithms (Batagelj and Brandes 2005). In this section, we provide the details of the algorithm we use to sample from \(\textsf {G}(n,m,k)\). To this end, we first introduce a basic approach in “The straightforward approach” section and explain why this method was not utilized. Subsequently, we present a slightly more sophisticated method in “The efficient approach” section, which we use to efficiently draw samples from \(\textsf {G}(n,m,k)\).
The straightforward approach
One way to generate a graph from \(\textsf {G}(n,m)\) is to use a variant of the Bollobás configuration model (Bollobás 1980), also known as the random sequence model. This model generates a sequence \((x_1, x_2, \dots , x_{2m})\) in which the elements are sampled uniformly at random from the set of vertices \(V=\{v_1, \dots , v_n\}\) and the edges consist of \(E=\left\{ \{x_i, x_{i+1}\}i\in \{1,2,\dots , m\}\right\}\). Conditioned on the resulting graph being simple (without duplicate edges and selfloops), the generated graphs are sampled exactly uniformly from G(n, m). This procedure can also be viewed as a balls and bins process, explained below.

1
Let \(e_1, \dots , e_m\) represent the edges to be added. Let \(e_{t,1}, e_{t,2}\) represent the halfedges (also known as stubs) corresponding to \(e_t\), where \(t\in \{1,\dots ,m\}\).

2
Assign halfedges to the n vertices \(v_1,..., v_n\) uniformly at random.

3
If \(e_{t,1}\) is assigned to \(v_i\) and \(e_{t,2}\) to \(v_j\), then add an edge between \(v_i\) and \(v_j\).

4
If the resulting graph is simple, output it; otherwise, restart from step one.
To modify this procedure for sampling from \(\textsf {G}(n,m,k)\), it suffices to modify the second step above by randomly assigning halfedges to the n vertices, with the additional condition that each vertex has at least k halfedges assigned. If \(2m=cn\) and \(c\ge k\ge 1\) for some constants c and k, the probability that the resulting graph is simple in the last step is \(\Omega (1)\) (Bollobás et al. 2000). Therefore, this approach requires O(n) time in expectation to sample from \(\textsf {G}(n,m, k)\). Although this approach is straightforward and theoretically efficient, based on our implementation, it is quite inefficient in practice. This is likely due to the large constants hidden by the asymptotic notation.
The efficient approach
Here, we present a detailed exposition of an algorithm that leverages a theoretical result from Janson and Luczak (2008) to efficiently sample from \(\textsf {G}(n,nk,k)\), where \(k\ge 3\) is some constant. Consider a graph G, which is the kcore of a larger graph drawn from \(\textsf {G}(\tilde{n},\tilde{m})\). The kcore may be either empty or have a distribution \(\textsf {G}(\upsilon , \mu , k)\) with \(\upsilon\) and \(\mu\) denoting the (random) number of vertices and edges in the kcore, respectively (Anastos and Frieze 2020). In case we condition on \(\upsilon =n\) and \(\mu =m\), the distribution of G is equivalent to \(\textsf {G}(n, m, k)\). In Algorithm 1, we exploit the last observation to efficiently sample from \(\textsf {G}(n,nk,k)\) by using rejection sampling and selecting appropriate values of \(\tilde{n}\) and \(\tilde{m}\) based on Theorem 1 (Theorem 1.1 of Janson and Luczak (2008)) such that the event \(\upsilon =n\) and \(\mu =nk\) occurs almost surely. To state Theorem 1, we define \(\psi _k(x):=1e^{x}\sum _{i=1}^{k1}\frac{x^i}{i!}\) and \(c_k:=\min _{x>0}\phi _k(x)\), where \(\phi _k(x):=\frac{x}{\psi _{k1}(x)}\). These functions are essential in characterizing the threshold for the emergence of kcores in random graphs.
Theorem 1
(Janson and Luczak 2008) Let G be a graph drawn from \(\textsf {G}(n,m)\) where \(2\,m=cn\) for some constant c. Furthermore, let \(k\ge 3\) be a constant and assume that n is large enough. If \(c < c_k\), then the kcore of G is almost surely empty. On the other hand, if \(c > c_k\), then the kcore of G almost surely contains \(\psi _{k}(x^*)n\) vertices and \(\frac{1}{2}x^*\psi _{k1}(x^*)n\) edges, where \(x^*\) is the largest among the two unique solutions of \(\phi _k(x)=c\).
Note that we use the bisection method to efficiently attain the value of x on line 1 of Algorithm 1 utilizing the fact that f(x), as defined in Algorithm 1, is monotonically decreasing and has a root in (k, 2k), based on Lemma A1 in Bollobás et al. (2000).
In this section, we proposed Algorithm 1 to sample uniformly from G(n, nk, k) based on Theorem 1 for any constant \(k\ge 3\). However, the algorithm lacks explicit verification of certain conditions of Theorem 1, for instance, whether \(\phi _k(x)=c>c_k\) for \(k\ge 3\) when x is selected as in Algorithm 1. Therefore, we could not establish probabilistic bounds on the algorithm’s running time or theoretically guarantee that the algorithm terminates at all. Nevertheless, for any constant \(k\ge 3\), the algorithm samples exactly uniformly from G(n, nk, k) conditioned on successful termination, as implied by the argument presented at the beginning of the section. Our empirical evaluations of the algorithm demonstrate that its termination time is reasonable for the range of k studied in our experiments.
Theory
Under the condition that the minimum degree is at least k for a constant \(k\ge 3\), we show in Theorem 2 that almost all graphs with a constant average degree are not only connected but also provably robust against any vertex removal strategy.
In the following paragraph, we briefly outline the key intermediary results and how they are used to lead up to Theorem 2.
By using the results of Benjamini et al. (2014), we first demonstrate in Lemma 1 that a random graph conditioned on having a degree sequence of size n, where all elements are between 3 and \(n^{0.02}\), has a conductance that is lowerbounded by a constant \(\alpha >0\). Then we use Lemma 1 to show in Lemma 2 that almost all graphs on n vertices with a constant average degree and a minimum degree of at least \(k\ge 3\) have \(\Omega (1/\log n)\) vertex expansion. A corollary of Lemma 2.2 in Friedman and Krivelevich (2021), stated in this section as Lemma 3, allows us to use this vertex expansion property to establish provable robustness in Theorem 2.
Lemma 1
Given a graphical sequence \(D=(d_1, \dots , d_n)\) with \(\sum _{i=1}^n d_i \in O(n)\) and \(n^{0.02} \ge d_1 \ge d_n \ge 3\), there exists a constant \(\alpha >0\) such that for sufficiently large n, a random simple graph G with degree sequence D almost surely satisfies \(\Phi (G) \ge \alpha\).
Proof
Let \(\mathcal {M}\) be a graph with degree sequence D generated via the random pairing model. This model assigns \(d_i\) dots to each vertex in the set \(\{v_1, v_2, \dots , v_n\}\), and then pairs the dots uniformly at random. An edge is drawn between vertex \(v_i\) and vertex \(v_j\) for each dot corresponding to the ith vertex paired with a dot corresponding to the jth vertex. This can result in multiple edges between two vertices, as well as some vertices having selfloops. However, conditioned on \(\mathcal {M}\) being simple, it is distributed exactly uniformly among all simple graphs with degree sequence D.
Let \(A_\alpha\) denote the event that there exists a constant \(n_0\in \mathbb {N}\) such that \(\Phi (\mathcal {M})<\alpha\) for all \(n\ge n_0\), and let \(\mathcal{B}\) denote the event that \(\mathcal {M}\) is simple. In light of the above observation, it suffices to show \(\textsf{Pr}(\mathcal{A}_\alpha \mathcal{B})\in o(1)\) for some constant \(\alpha >0\). Since \(\sum _{i=1}^n d_i\in O(n)\), we can use Lemma 5.2 in Benjamini et al. (2014) to obtain \(\textsf{Pr}(\mathcal{B})\in \Omega (1)\). Therefore
Under our assumptions, Lemma 5.3 in Benjamini et al. (2014) implies there is a constant \(\vartheta >0\) such that \(\textsf{Pr}(\mathcal {A_\vartheta })\in o(1)\). Given this, the above inequality yields \(\textsf{Pr}(\mathcal {A_\vartheta }\mathcal {B})\in o(1)\). \(\square\)
Lemma 2 states that, for a constant \(k\ge 3\), almost all graphs with constant average degree and a minimum degree of at least k exhibit \(\Omega (1/\log n)\) vertex expansion. To prove Lemma 2, we show that such graphs almost surely satisfy the degree sequence conditions specified in Lemma 1 and have a maximum degree of \(o(\log n)\). We then complete the proof by combining Lemma 1 with the inequality \(h(G)\ge \frac{\delta (G)}{\Delta (G)}\Phi (G)\), which holds for any graph G.
Lemma 2
Let G be a graph drawn from \(\textsf {G}(n,m,k)\), where \(2m=cn\) and \(c\ge k \ge 3\) for some constants c, k. If n is large enough, then almost surely \(h(G)\in \Omega (1/\log n)\).
Proof
Note that \(\delta (G)\ge 3\) and \(h(G)\ge \frac{\delta (G)}{\Delta (G)}\Phi (G)\). Therefore, it is sufficient to show that there is almost surely a constant \(\alpha >0\) such that if n is large enough, then \(\textsf{Pr}\left[ (\Phi (G)<\alpha )\bigcup (\Delta (G)>\log n)\right] \in o(1)\). By applying the union bound, we have
Therefore, it suffices to show that there is almost surely a constant \(\alpha >0\) such that if n is large enough, then \(\textsf{Pr}[\Phi (G)<\alpha ]\in o(1)\) and \(\textsf{Pr}[\Delta (G)> \log n] \in o(1)\).
We first show \(\textsf{Pr}[\Delta (G)> \log n] \in o(1)\). To this end, part (ii) of Lemma 1 in Bollobás et al. (2000) suggests that, under the assumptions of the theorem, it is sufficient to show that the maximum occupancy of a box (or bin) is \(o(\log n)\) in a model called \(\mathcal {O}(n,2m, k)\). In this model, 2m balls are thrown uniformly at random into n bins, conditioned on each bin having at least k balls. Within the proof of Lemma 1 in Bollobás et al. (2000), it is explicitly stated that: The maximum occupancy of any box in \(\mathcal {O}(n,2m, k)\) is \(o(\log n)\) with the probability of the complementary event \(n^{O(\log \log n)}\). This establishes that \(\textsf{Pr}[\Delta (G)> \log n] \in o(1)\), as we wanted to show. Next, we prove \(\textsf{Pr}[\Phi (G)<\alpha ]\in o(1)\).
Let \(\xi _\alpha\) be the event that there exists a constant \(n_0\in \mathbb {N}\) such that \(\Phi (G) <\alpha\) for all \(n \ge n_0\). Consider \(\mathcal {D}=\{D(G):G\in G(n,m,k)\}\) and let \(\widetilde{\mathcal {D}}\) be a subset of degree sequences in \(\mathcal {D}\) where the maximum degree is at most \(\log n\). We can write \(\textsf{Pr}[\xi _\alpha ]\) as
We prove that \(\mathcal {A}_\alpha \in o(1)\) and \(\mathcal {B}_\alpha \in o(1)\), for some constant \(\alpha >0\). Note that
The righthand side of the above inequality is \(\textsf{Pr}[\Delta (G)> \log n]\), which we have shown is asymptotically zero, under our assumptions. Hence, \(\mathcal {A_\alpha }\in o(1)\). Note that
As the assumptions of Lemma 1 are satisfied for all \(d\in \tilde{\mathcal {D}}\), we can use it to show the existence of a constant \(\vartheta >0\) such that the righthand side of the inequality above is asymptotically zero for \(\alpha =\vartheta\). Therefore, we have \(\textsf{Pr}[\xi _\vartheta ]=\mathcal {A}_\vartheta +\mathcal {B}_\vartheta \in o(1)\). \(\square\)
Lemma 3 can be viewed as a direct corollary of Lemma 2.2 in Friedman and Krivelevich (2021), by observing that the notion of \((\frac{n}{2}, \nu )\)expander in Friedman and Krivelevich (2021) is equivalent to the definition of \(\nu\)vertex expander given in this paper. Using Lemma 3 and the expansion properties of graphs drawn from \(\textsf {G}(n,m,k)\), as stated in Lemma 2, we can prove Theorem 2.^{Footnote 1}
Lemma 3
(Friedman and Krivelevich 2021) Let \(G=(V,E)\) be a \(\nu\)expander graph with n vertices, where \(0 < \nu \le 1\). For any \(0<\epsilon \le \frac{\nu ^2}{16}\) and \(V_0 \subseteq V\) with \(V_0 \le \epsilon n\), there exists \(U \subseteq V {{\setminus }} V_0\) such that \(U \ge (1\frac{3\epsilon }{\nu })n\) and G[U] is a \(\frac{\nu }{2}\)expander.
Theorem 2
Let G be a graph drawn from \(\textsf {G}(n,m,k)\), where \(2m=cn\) and \(c\ge k\ge 3\) for some constants c, k. If n is large enough, then almost surely, G is connected and \(R_G(B)= 1o(1)\) for any vertex sequence B that satisfies \(B\in o(n/\log ^2 n)\).
Proof
By Lemma 2, G is almost surely a \(\nu\)expander where \(\nu =\frac{\alpha }{\log n}\) for some constant \(\alpha >0\). Since n is assumed to be large enough, it suffices to show that for \(0<\nu \le 1\), the claims of the theorem hold when G is a \(\nu\)expander with n vertices.
For the sake of contradiction, assume G is not connected; then, it must have at least two connected components. Let A be the component with the smallest number of vertices. Obviously, \(\partial A=\emptyset\) and \(A\le n/2\). However, since G is a \(\nu\)expander with \(\nu >0\) and \(A\le n/2\), we have \(\partial A\ge \nu A>0\), which contradicts \(\partial A=\emptyset\). Given that assuming G is not connected leads to a contradiction, G must be connected.
It remains to show the second part of the theorem. If \(B = 0\), we are done. So, we assume \(B>0\) and let \(V_0\) denote the set of vertices in B. Since n is assumed to be sufficiently large and \(B=V_0\in o(n/\log ^2 n)\), we can assume \(V_0\le \epsilon n\), where \(\epsilon =\frac{\nu ^2}{16}\). Given that all the requirements of Lemma 3 are met, we can apply it to conclude the existence of \(U\subseteq V{\setminus } V_0\) such that G[U] is a \(\frac{1}{2}\nu\)vertex expander and \(U\ge (1\frac{3\epsilon }{\nu })n\). By an argument similar to the one presented in the previous paragraph, we can conclude that G[U] is connected. Therefore, the largest connected component of \(G[V{\setminus } V_0]\) must also contain at least \((1\frac{3\epsilon }{\nu })n\) or equivalently \((1\frac{3\nu }{16})n\) vertices. Given the definition of \(R_G(B)\), this concludes the proof. \(\square\)
We can see from Theorem 2, that a graph drawn from G(n, m, k) almost surely has asymptotically optimal robustness when the fraction of vertices removed is \(o(n/\log ^2 n)\) (e.g., when at most \(n^{0.99}\) vertices are removed).
In studies on the robustness of networks such as Albert et al. (2000), scalefree instances with a constant average degree are compared to sizematching random networks. It is known that such random networks with a constant average degree are almost surely disconnected (Erdős and Rényi 1959; Molloy and Reed 1998). However, our results in this section suggest that such random networks become both connected and provably robust when their minimum degree is constrained. Therefore, the conclusions of Albert et al. (2000) may change significantly in a fairer setting where sizematching random networks also have a minimum degree at least as large as that of scalefree instances compared to them. This observation was one of the main motivations for our experiments in “Experiments” section.
Experiments
The conclusions in Albert et al. (2000) primarily stemmed from experiments on synthetic scalefree networks generated by \(\textsf {PA}(n,k)\). We use the same model of synthetic networks to compare the robustness of preferentialattachment networks with sizematching random networks and random networks of the same size whose minimum degree is at least as large. To this end, we generate 100 networks using the model \(\textsf {PA}(n,k)\) with \(n=10{,}000, k=3\), and then draw an equal number of networks from \(\textsf {G}(n,m)\) and \(\textsf {G}(n,m,k)\), respectively, where \(m=nk\).^{Footnote 2} For each of the 300 networks, we then compute their robustness scores under random failures and targeted attacks, where the fraction of removed vertices is \(\beta \in \{0.05, 0.1, 0.2\}\).
The results shown in Fig. 1 confirm that preferentialattachment networks are “robustyetfragile” when compared to random graphs of the same size, i.e., they are more fragile against targeted attacks but more robust against random failures. However, such networks are more vulnerable against both, targeted attacks and random failures, than sizematching random networks constrained to have at least the same minimum degree. We corroborate this finding in the following sections under varying conditions.
Note that the minimum degree of graphs appears to be a crucial property for their robustness. This is suggested asymptotically by Theorem 2 and is empirically evidenced, since the robustness scores of the graphs drawn from \(\textsf {G}(n,m,k)\) are very close to the theoretical upper bounds indicated by dashed lines. This is further investigated in “Consistency of nearoptimal robustness” section.
Sensitivity to the choice of the fraction of removed vertices
In this section, we evaluate the sensitivity of our observed patterns in Fig. 1 to the choice of the proportion of removed vertices. For this purpose, we repeat the procedure to create Fig. 1, but instead of choosing moderate proportions of removed vertices \(\beta \in \{0.05, 0.1, 0.2\}\), we choose relatively higher proportions by increasing each previously considered proportion by 0.2, i.e., we consider \(\beta \in \{0.25, 0.3, 0.4\}\). The result is shown in Fig. 2, from which we observe that the claimed patterns based on Fig. 1 still hold in general. The main difference is that the robustness of the random networks drawn from \(\textsf {G}(n,m,k)\) becomes less optimal when the proportion of removed vertices increases noticeably.
Sensitivity to the choice of parameters in the PA model
Here, we evaluate the sensitivity of the patterns inferred based on Figs. 1 and 2, to the variation of parameters n and k in the underlying preferential attachment model \(\textsf {PA}(n,k)\). To this end, we vary \(k\in \{3,4,5\}\) for fixed \(n=10{,}000\) and \(n\in \{1000, 10{,}000, 20{,}000\}\) for fixed \(k=3\), all else being equal and precisely as in the setting presented at the beginning of “Experiments” section.
For each pair of n and k considered, we use zscores to compare the networks generated from \(\textsf {PA}(n,k)\) with random networks drawn from \(\textsf {G}(n,m)\) and random networks drawn from \(\textsf {G}(n,m,k)\), where \(m=nk\).^{Footnote 3} The comparison refers to their expected robustness score when \(\beta \in \{0.05, 0.1, 0.2, 0.25, 0.3, 0.4\}\) fraction of vertices are removed under targeted attacks or random failures. The obtained zscores are visualized in Fig. 3. Our results suggest that networks generated by using preferential attachment are consistently more robust against random failures but more fragile against targeted attacks when compared to random networks of the same size. This is underscored by the fact that the points corresponding to the latter networks are located in the fourth quadrants in Fig. 3. However, we note that preferentialattachment networks are always more vulnerable to targeted attacks and random failures when compared to random networks of the same size whose minimum degree is at least as large. This is underscored by the fact that the points corresponding to the latter networks are located in the first quadrants in Fig. 3. We note that our claimed patterns hold for varying n or k, however, the patterns we assert become more significant as n increases and less significant as k increases. In other words, the patterns we discuss are most evident for larger and sparser networks, which are generally of greater relevance.
Sensitivity to the choice of the instances
In Figs. 1 and 2, the cohesion of points belonging to the same network type and their separation from points corresponding to other network types can be observed for any fixed proportion of removed vertices \(\beta \in \{0.05, 0.1, 0.2, 0.25, 0.3, 0.4\}\). This observation suggests that the patterns inferred from these figures do not depend on a particular choice of instances from the underlying network models. To measure this quantitatively, we compute a silhouette score for each fixed \(\beta\) using the Euclidean distance metric by assigning scattered points of each specific type of network to a group.^{Footnote 4} The computed silhouette scores are presented in Table 1.
These scores indicate how well instances of the same network type are clustered and how well these clusters separate based on their robustness scores when the networks contained in each cluster are subjected to targeted attacks or random failures. From the results in Table 1, we can observe that the distinction of the network clusters considered with respect to their robustness is consistently high and is not particularly affected by the choice of specific instances. To assess the sensitivity of these observation to changes in the parameters of the underlying preferential attachment model \(\textsf {PA}(n,k)\), we vary \(k\in \{3,4,5\}\) for a fixed \(n=10{,}000\) and vary \(n\in \{1000, 10{,}000, 20{,}000\}\) for a fixed \(k=3\), all else being the same. We give in Table 2 for each \(\beta \in \{0.05, 0.1, 0.2, 0.25, 0.3, 0.4\}\) the corresponding silhouette scores. As highlighted by the bold values in Table 2, the cohesion within each group and the separation between different groups increase with n and decrease with k when the underlying preferentialattachment networks are instances of \(\textsf {PA}(n,k)\). This implies that the distinction in robustness between the different network types that we consider becomes more apparent when the focus is on larger and sparser networks.
Furthermore, Tables 1 and 2, indicate that for each fixed combination of n and k, the corresponding silhouette scores increase with the fraction of removed vertices \(\beta\). This means that as the fraction of removed vertices increases, the results become less dependent on a particular choice of network instances.
Sensitivity to adaptive targeted attacks
In this section, we investigate whether the patterns observed in Figs. 1 and 2 remain consistent under adaptive targeted attacks. In these attacks, the vertices with the highest current degree are removed first, rather than those with the highest initial degree. Such attacks were proposed in Broder et al. (2000) and have been shown to be more effective in dismantling networks than those based on initial degree (Holme et al. 2002; Wu and Holme 2011). To this end, we consider nine combinations of n and k with \(n\in \{1000, 10{,}000, 20{,}000\}\) and \(k\in \{3,4,5\}\), and draw 100 networks from each of the \(\textsf {PA}(n,k), \textsf {G}(n,m)\), and \(\textsf {G}(n,m,k)\) models, where \(m=nk\).
Figure 4 reports the average robustness score (over 100 instances) for each network model when the \(\beta \in \{0.01,0.02, \dots , 0.99\}\) portion of the vertices are removed in adaptive targeted attacks. We observe that the networks drawn from \(\textsf {G}(n, m)\) and \(\textsf {G}(n,m,k)\) are very similar in terms of their robustness against adaptive targeted attacks. Moreover, we observe that the networks drawn from these two models are noticeably more robust than those generated from \(\textsf {PA}(n,k)\). However, note that this differentiation in robustness does not depend appreciably on n but becomes apparent sooner (i.e., for a lower fraction of vertices removed) when the average degree of the generated graphs is lower (i.e., when k is smaller).
In general, we can conclude that the patterns claimed based on the observations made in Figs. 1 and 2 are consistent under adaptive targeted attacks.
Consistency of nearoptimal robustness
In our experiments, we have used networks drawn from \(\textsf {G}(n,m,k)\), where \(m=nk\). For \(n=10{,}000, k=3\), we have seen in Figs. 1 and 2 that these networks exhibit nearoptimal robustness against nonadaptive targeted attacks based on initial degree and random failures when only a moderate fraction of their vertices are removed. Here, we evaluate the sensitivity of this pattern to adaptive targeted attacks based on the current degree and its sensitivity to variations in n, k, and \(\beta\). To this end, we consider nine combinations of n and k with \(n\in \{1000, 10{,}000, 20{,}000\}\) and \(k\in \{3,4,5\}\). Then we draw 100 networks from \(\textsf {G}(n,m,k)\) and compute their average robustness score normalized to the maximum achievable robustness score when \(\beta \in \{0, 0.01, \dots , 0.99, 1\}\) portion of the vertices are removed under adaptive targeted attacks, nonadaptive targeted attacks, or random failures.^{Footnote 5} The result is shown in Fig. 5.
When not more than \(30\%\) of the vertices are removed by targeted attacks or random failures, Fig. 5 illustrates the consistent nearoptimal robustness of the networks drawn from \(\textsf {G}(n,m,k)\) in the case \(m=nk\) for a constant \(k\ge 3\). We can see that this \(30\%\) threshold does not depend appreciably on n but increases with k. For example, when \(k=5\), we see across different n that the robustness score does not noticeably deviate from its optimal value when no more than \(50\%\) of vertices are removed in targeted attacks or random failures.
From the discussions here, we can conclude that for a constant \(k\ge 3\) and \(m=nk\), the nearoptimal robustness of networks drawn from \(\textsf {G}(n,m,k)\) is consistent as long as the fraction of vertices removed by targeted attacks or random failures is not too large.
As a side observation, we can also see that the difference between adaptive and nonadaptive targeted attacks in dismantling the graphs drawn from \(\textsf {G}(n, nk, k)\) becomes apparent only for higher fractions of removed vertices and becomes more pronounced as k increases, but does not depend appreciably on n.
Figure 5 shows that, for each considered vertex removal strategy, there is a distinct threshold value \(\beta ^{*}\) for the fraction of removed vertices \(\beta\), at which the robustness of networks drawn from \(\textsf {G}(n,nk,k)\) deviates considerably from the optimal value. Upon closer investigation, we realized that this threshold corresponds to the point where the secondlargest connected component attains its maximum size during the vertex removal process. Specifically, for \(\beta <\beta ^*\), the size of the secondlargest connected component generally increases, while it starts to decrease for \(\beta >\beta ^*\). At \(\beta =\beta ^*\), it is noteworthy that the graph’s average degree is strictly less than two, and its kcore is empty for all \(k\ge 3\). This implies that the graph is 2degenerate, meaning the maximum degree in every induced subgraph is at most 2.
It is desirable to establish a lower bound for \(\beta ^*\) that is independent of the chosen vertex removal strategy. To this end, note that the robustness and density of graphs drawn from \(\textsf {G}(n,m,k)\) increase with m for fixed n and k. Therefore, random kregular ones are the least robust and least dense of such graphs where m attains its minimum possible value of \(\frac{1}{2}kn\). For \(k\ge 3\), it has been shown that an optimal attacker must remove at least a fraction \(\frac{k2}{2k2}\) of all vertices to partition a random kregular graph into connected components of sublinear size (see “Bounds on optimal attacks” in Balashov et al. 2019). As a result, for graphs drawn from \(\textsf {G}(n,m,k)\), we can conclude that \(\beta ^*\ge \frac{k2}{2k2}\) when \(k\ge 3\), regardless of the vertex removal strategy employed.
Preferentiality and robustness
In this section, we aim to experimentally investigate how preference in attachment affects the robustness of incrementally constructed networks which inherently have a minimum degree of at least k, such as those generated by the preferential attachment model \(\textsf {PA}(n,k)\). To isolate the effects of preference in attachment, we also explore the similar \(\textsf {UA}(n,k)\) model where newly added vertices connect to k existing vertices selected uniformly at random, without a preference for highly connected vertices. Since both \(\textsf {PA}(n,k)\) and \(\textsf {UA}(n,k)\) networks belong to G(n, m, k) where \(m=nk\), we also compare them with typical members of this class using the \(\textsf {G}(n,m,k)\) model.
In our experiments, we vary \(n\in \{1000, 10{,}000, 20{,}000\}\) for a fixed \(k=3\), and \(k\in \{3,4,5\}\) for a fixed \(n=10{,}000\). For each fixed pair of n and k, we draw 100 instances from the following network models: \(\textsf {UA}(n,k)\), \(\textsf {PA}(n,k)\), and \(\textsf {G}(n,m,k)\); where \(m=nk\). For each fixed pair of n and k, we then compare the robustness of the corresponding generated networks against targeted attacks based on the initial degree and random failures when \(\beta \in \{0.05, 0.1, 0.2, 0.25, 0.3, 0.4\}\) fraction of the vertices are removed. For this purpose, we used zscores as described in “Sensitivity to the choice of parameters in the PA model” section.
The results depicted in Fig. 6 indicate that \(\textsf {UA}(n, k)\) graphs are consistently more robust under both targeted attacks and random failures than \(\textsf {PA}(n, k)\) graphs, but more fragile than \(\textsf {G}(n,m,k)\) graphs where \(m=nk\). We also observe that as n increases, this pattern becomes more pronounced for a fixed fraction of removed vertices \(\beta\), while it diminishes as k increases. Therefore, we can conclude that this trend is particularly applicable to larger and sparser networks. Notably, our findings indicate that a preference for attachment to wellconnected vertices in network growth models does not enhance network robustness, but in fact, affects it adversely.
Conclusions
We have shown that, for any constant \(k\ge 3\), almost all graphs in which the number of edges is linear in the number of vertices (i.e., the average degree is upperbounded by a constant) and the minimum degree is at least k, are connected and provably robust against any vertex removal strategy.
Motivated by this new theoretical result, we have shown experimentally that the dictum “robustyetfragile” is not a fitting characterization of preferentialattachment networks, let alone scalefree networks in general, because it stems from a poorly chosen baseline. It appears that the previously assessed robustness is largely due to their constant minimum degree, rather than their skewed degree distribution.
Furthermore, we have shown that in the context of network growth models, any preference for attachment to wellconnected vertices does not confer an advantage in terms of robustness against random failures or targeted attacks.
Our results show, contrary to the common belief, that preferential attachment graphs are not particularly robust against random failures. What robustness they have is a consequence of a lowerbounded minimum degree, which is guaranteed by the attachment process. The skewed degree distribution obtained from preferential attachment is not strengthening robustness but, in fact, reducing it when compared to uniformly random attachment.
Availability of data and materials
The generated data and software code are publicly available in the GitHub repository: https://github.com/RouzbehHasheminezhad/ANSRobustnessPA
Notes
A weaker version of Theorem 1 in our preliminary paper (Hasheminezhad and Brandes 2023) contained an error in the proof, which arose from the incorrect use of Lemma 5.3 from Benjamini et al. (2014). Although the lemma provides a statement on conductance, it was erroneously used directly to argue about vertex expansion, leading to the possibly incorrect conclusion that for some constant \(\alpha >0\), almost all graphs G drawn from \(\textsf {G}(n,m,k)\) satisfy \(h(G)\ge \alpha\), under the assumptions of the theorem. We have now rectified this issue by using a wellknown inequality that makes a connection between the concepts of vertex expansion and conductance. Although our revised approach can only establish an almost sure vanishing vertex expansion, we could adapt our analysis to provide improved theoretical guarantees compared to the preliminary version of the paper.
When sampling random graphs from \(\textsf {G}(n,m)\), we reject instances with less than 96% of their vertices in the largest connected component and discard all vertices outside the largest connected component. Note that such steps are essential to ensure an equitable setting, where the generated random graphs are initially connected and have roughly the same size as the preferentialattachment graphs compared to them.
Given a group X and a reference group Y, both of size N, with respective means \(\mu _X, \mu _Y\) and standard deviations \(\sigma _X, \sigma _Y\), we compute the corresponding zscore as \(\sqrt{N}(\mu _X\mu _Y)/(\sqrt{\sigma ^2_X+\sigma ^2_Y})\). Its positive and negative values represent a tendency of elements in X to reach values above and below the reference mean \(\mu _Y\), respectively.
Given a set of objects, each assigned to a group, and a distance metric defined between all pairs of these objects, the silhouette score measures the similarity of an object to its group (cohesion) compared to other groups (separation) (Rousseeuw 1987). This score ranges from \(1\) to \(+1\), with higher scores indicating that the data are well clustered and lower scores indicating that the data are poorly clustered.
Note that for any connected graph, the maximum achievable robustness score after removing \(\beta\) portion of the vertices is \(1\beta /2+o(1)\), as given in “Network robustness” section.
References
Albert R, Jeong H, Barabási AL (2000) Error and attack tolerance of complex networks. Nature 406(6794):378–382
Anastos M, Frieze A (2020) Hamilton cycles in random graphs with minimum degree at least 3: An improved analysis. Random Struct Algorithms 57(4):865–878
Balashov N, Cohen R, Haber A, Krivelevich M, Haber S (2019) Optimal shattering of complex networks. Appl Netw Sci 4(1):99
Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Batagelj V, Brandes U (2005) Efficient generation of large random networks. Phys Rev E 71(3):036113
Batagelj V, Zaveršnik M (2011) Fast algorithms for determining (generalized) core groups in social networks. Adv Data Anal Classif 5(2):129–145
Benjamini I, Kozma G, Wormald N (2014) The mixing time of the giant component of a random graph. Random Struct Algorithms 45(3):383–407
Bollobás B (1980) A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. Eur J Comb 1(4):311–316
Bollobás B, Cooper C, Fenner TI, Frieze AM (2000) Edge disjoint Hamilton cycles in sparse random graphs of minimum degree at least \(k\). J Graph Theory 34(1):42–59
Broder A, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R et al (2000) Graph structure in the web. Comput Netw 33(1):309–320
Doyle JC, Alderson DL, Li L, Low S, Roughan M, Shalunov S et al (2005) The robust yet fragile nature of the Internet. Proc Natl Acad Sci 102(41):14497–14502
Erdős P, Rényi A (1959) On random graphs I. Publ Math Debr 6:290–297
Friedman L, Krivelevich M (2021) Cycle lengths in expanding graphs. Combinatorica 41(1):53–74
Giakkoupis G, Sauerwald T (2012) Rumor Spreading and Vertex Expansion. In: Proceedings of the 2012 annual ACMSIAM symposium on discrete algorithms (SODA). Proceedings. Society for Industrial and Applied Mathematics, pp 1623–1641
Hasheminezhad R, Brandes U (2022) Constructing Provably Robust ScaleFree Networks. In: Ribeiro P, Silva F, Mendes JF, Laureano R (eds) Proceedings of the 7th international winter conference on network science (NetSciX 2022), vol. 13197 of lecture notes in computer science. Springer, Cham, pp 126–139
Hasheminezhad R, Brandes U (2023) Robustness of preferentialattachment graphs: shifting the baseline. In: Cherifi H, Mantegna RN, Rocha LM, Cherifi C, Micciche S (eds) Complex networks and their applications XI. Studies in computational intelligence. Springer, Berlin, pp 445–456
Hasheminezhad R, Boudourides M, Brandes U (2020) Scalefree networks need not be fragile. In: Proceedings of the 2020 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM 2020). IEEE, The Hague, pp 332–339
Holme P, Kim BJ, Yoon CN, Han SK (2002) Attack vulnerability of complex networks. Phys Rev E 65(5):056109
Janson S, Luczak MJ (2008) Asymptotic normality of the \(k\)core in random graphs. Ann Appl Probab 18(3):1085–1137
Klau GW, Weiskircher R (2005) Robustness and resilience. In: Brandes U, Erlebach T (eds) Network analysis. Volume 3418 of lecture notes in computer science. Springer, Berlin, pp 417–437
Molloy M, Reed B (1998) The size of the giant component of a random graph with a given degree sequence. Comb Probab Comput 7(3):295–305
Petersen C, Rotbart N, Simonsen JG, WulffNilsen C (2016) Near optimal adjacency labeling schemes for powerlaw graphs. In: Chatzigiannakis I, Mitzenmacher M, Rabani Y, Sangiorgi D (eds) Proceedings of the 43rd international colloquium on automata, languages, and programming (ICALP 2016). vol 55 of Leibniz international proceedings in informatics. Schloss Dagstuhl – LeibnizZentrum für Informatik, pp 133:1–133:15
Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65
Schneider CM, Moreira AA, Andrade JS, Havlin S, Herrmann HJ (2011) Mitigation of malicious attacks on networks. Proc Natl Acad Sci 108(10):3838–3841
Wu ZX, Holme P (2011) Onion structure and network robustness. Phys Rev E 84(2):026106
Acknowledgements
Not applicable.
Funding
Open access funding provided by Swiss Federal Institute of Technology Zurich
Author information
Authors and Affiliations
Contributions
All authors contributed to the research design, the theoretical results, and the writing of the manuscript. RH conducted the experiments and generated the figures. Both authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A preliminary version of this paper has appeared in Hasheminezhad and Brandes (2023).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hasheminezhad, R., Brandes, U. Robustness of preferentialattachment graphs. Appl Netw Sci 8, 32 (2023). https://doi.org/10.1007/s41109023005565
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s41109023005565