Optimal shattering of complex networks

Abstract

We consider optimal attacks or immunization schemes on different models of random graphs. We derive bounds for the minimum number of nodes needed to be removed from a network such that all remaining components are fragments of negligible size. We obtain bounds for different regimes of random regular graphs, Erdős-Rényi random graphs, and scale free networks, some of which are tight. We show that the performance of attacks by degree is bounded away from optimality. Finally, we present a polynomial time attack algorithm and prove its optimal performance in certain cases.

Introduction

One of the most studied questions in complex networks is the resilience of networks under different failure models and attack strategies (Albert et al. 2000; Cohen et al. 2000; Callaway et al. 2000). In particular, one wishes to know the optimal attack strategy that will lead to fragmentation by removal of a minimal fraction of the nodes. This information is important for estimating the vulnerability of network infrastructures, and also for devising optimal immunization strategies for populations and computer networks.

The main methods that have been proposed for targeted attacks on networks via node removal have been based on attack by highest degree (Albert et al. 2000; Callaway et al. 2000; Cohen et al. 2001), and attack by highest betweenness centrality (Magoni 2003). Some methods based on more advanced algorithms for graph partitioning have also been proposed (Paul et al. 2007), and led to improved upper bounds on the minimal fraction of nodes that should be removed to shatter a network.

In recent years, several works studied optimal attacks on networks, presenting sophisticated, highly efficient algorithms for choosing minimal sets of nodes whose removal leads to complete fragmentation of the network. In Braunstein et al. (2016) an efficient dismantling method is presented, using the replica method on the generating function. In Ren et al. (2019) efficient dismantling is considered when costs are attributed to the different nodes. In Morone and Makse (2015) finding sets of influential nodes is considered using the cavity and Extremal Optimization (EO) methods. In Mugisha and Zhou (2016) belief propagation is used to find efficient shattering sets. In Osat et al. (2017) optimal attacks on multiplex networks are studied using simulated annealing methods.

As for measuring the resilience of a network to shocks, there are again quite a few papers. The following is far from being a comprehensive survey. In Cerqueti et al. (2019) a shock is injected on one node and the resilience of the (weighted) network is measured by its ability to absorb this shock. In Chen and Cheng (2015) network robustness is evaluated in terms of the ability to identify the attack prior to network disruption. The effect of node removal on the diameter of the network, a natural parameter, was studied in Ferraro and Iovanella (2018). Both of these studies regard attack by degree as representative of an intentional attack. An important breakthrough was made in Gao et al. (2016), where the dynamics of the system was accounted for, in addition to network topology. One may ask about the resilience of the community structure in a network, and indeed this approach was taken in Ramirez-Marquez et al. (2018), and tested against link removal. For a review of definitions and measures of system resilience we refer the reader to Hosseini et al. (2016).

In this work, we present results for several network classes, giving upper and lower bounds on the size of the minimal set of nodes to be removed in order to shatter (or dismantle) a network. We first survey the method of generating functions and results on random attacks and attack by degree, in order to show that random attack and attack by degree are not asymptotically optimal for any class of generalized (configuration model) random graphs. We then present exact results and bounds on the size of the shattering set for various random graph classes. Eventually, we present a polynomial time algorithm for efficient shattering of random graphs. We show, using exact methods, that for 3-regular graphs our algorithm obtains asymptotically optimal results. The performance of this algorithm for other classes of random graphs remains an open question.

Random attacks

The problem of removing nodes in order to shatter the network into small pieces is of importance both in order to determine the resilience of a network to various attacks and in order to immunize a network (for example, a social network) against the spreading of an epidemic disease. By “shattering” a network, we mean breaking the network into small components, each of size o(n), where n is the number of nodes. That is, breaking the network, by means of node removal, into pieces whose sizes are negligible compared to the original network.

The simplest attack mechanism on a network is removing nodes uniformly at random. This is the standard percolation model. In order to study the effect of this attack on the network, one can employ the generating function method (See, e.g., (Callaway et al. 2000)). We consider the configuration model, where no correlations exist between neighbouring nodes. Given a degree distribution P(k), and probability p(k) for a degree k node to exist (i.e., not to be deleted), one can write the generating function for this distribution as

$$ F_{0}(x)=\sum_{k=0}^{\infty} P(k)p(k)x^{k}\;. $$
(1)

Similarly, one can write the generating function for the reciprocal degree distribution of a node reached by following an edge (i.e., the degree of the node disregarding the edge through which it was reached):

$$ F_{1}(x)=\frac{F_{0}^{\prime}(x)}{\langle k\rangle}=\frac{\sum_{k=0}^{\infty} kP(k)p(k)x^{k-1}}{\langle k\rangle}\;, $$
(2)

where 〈k〉 is the average degree. The generating function for branch sizes reached by following a random edge is given by the recursive equation

$$ H_{1}(x)=1-F_{1}(1)+xF_{1}(H_{1}(x))\;, $$
(3)

where \(F_{1}(1)\) is the probability that a node reached by following a random edge exists, and the generating function for the probability of a node to belong to a component of some finite size is given by

$$ H_{0}(x)=1-F_{0}(1)+xF_{0}(H_{1}(x))\;. $$
(4)

\(H_{0}(1)\) is the normalization of \(H_{0}(x)\). It may happen that \(H_{0}(1)=1\), in which case all components are finite and no giant component exists, or that \(H_{0}(1)<1\), in which case a giant component exists and contains a fraction \(1-H_{0}(1)=F_{0}(1)-F_{0}(u)\) of the nodes, where \(u=H_{1}(1)\) is the solution of the self-consistent equation

$$ u=1-F_{1}(1)+F_{1}(u)\;. $$
(5)

One can observe that a solution with \(u<1\) exists only if \(F_{1}^{\prime }(1)>1\).

For random attacks, p(k)=p:=1−q is independent of k. Using Eq. (5), one can deduce that a giant component exists only if there is a solution u<1, which is the case only if

$$ p>p_{c}:=\frac {1}{\kappa-1}\;, $$
(6)

where \(\kappa :=\frac {\langle k^{2}\rangle }{\langle k\rangle }\) is the ratio of the first two moments of the distribution (Cohen et al. 2000). Specifically, using the moments of the constant and Poisson distributions, respectively, one can deduce that the percolation thresholds for the random d-regular and Erdős-Rényi networks are \(\frac {1}{d-1}\) and \(\frac {1}{\langle k\rangle }\).

Another result that can be deduced using this formalism is that the probability of a (non-deleted) node of degree k to belong to a finite component is the probability that all of the branches emanating from it are finite, i.e., the probability is \(u^{k}\).
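The formalism of Eqs. (1)-(6) can be evaluated numerically. The following is a minimal sketch, assuming a uniform survival probability p(k)=p and a degree distribution given as a finite list; the function names and the truncation of the Poisson distribution are illustrative choices, not part of the original analysis. It computes \(p_{c}\) from Eq. (6), iterates Eq. (5) from u=0 to reach the smallest fixed point, and uses Eq. (4) at x=1, which gives a giant component fraction of \(F_{0}(1)-F_{0}(u)\).

```python
import math

def poisson(mean, kmax=60):
    """Truncated Poisson degree distribution P(0), ..., P(kmax)."""
    probs = [math.exp(-mean)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * mean / k)
    return probs

def critical_occupation(P):
    """p_c = 1 / (kappa - 1) with kappa = <k^2> / <k>, as in Eq. (6)."""
    mean_k = sum(k * Pk for k, Pk in enumerate(P))
    mean_k2 = sum(k * k * Pk for k, Pk in enumerate(P))
    return 1.0 / (mean_k2 / mean_k - 1.0)

def giant_fraction(P, p, iterations=5000):
    """Return (u, S): the fixed point of Eq. (5) and S = F0(1) - F0(u)."""
    mean_k = sum(k * Pk for k, Pk in enumerate(P))

    def F0(x):
        return sum(Pk * p * x ** k for k, Pk in enumerate(P))

    def F1(x):
        return sum(k * Pk * p * x ** (k - 1)
                   for k, Pk in enumerate(P) if k > 0) / mean_k

    u = 0.0  # starting at 0, the iteration increases to the smallest fixed point
    for _ in range(iterations):
        u = 1.0 - F1(1.0) + F1(u)
    return u, F0(1.0) - F0(u)

if __name__ == "__main__":
    P = poisson(4.0)
    print("p_c:", critical_occupation(P))
    for p in (0.2, 0.5, 1.0):
        print(p, giant_fraction(P, p))
```

Plain fixed-point iteration converges slowly close to \(p_{c}\); a root finder would be a better choice there.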

Targeted attack strategies

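Naturally, random attacks are not expected to give optimal results. Indeed, a simple and more effective attack strategy is to start the removal with the high degree nodes, as they play a more substantial role in the network connectivity. Such an attack can be described by a degree-dependent survival probability p(k) in Eq. (1) of the form

$$ p(k)=\left\{ \begin{array}{ll} 1 & k< k_{0}\\ 0 & k>k_{0}\\ \alpha & k=k_{0} \end{array}\right., $$
(7)

for some \(k_{0}\) and α: all nodes of degree above \(k_{0}\) are removed, all nodes of degree below \(k_{0}\) are kept, and a fraction α of the nodes of degree \(k_{0}\) is kept. Solving for the values of \(k_{0}\) and α that give criticality, one can find the critical fraction for removal.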
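The critical cutoff can be computed directly from the condition \(F_{1}^{\prime }(1)=1\). The sketch below assumes that α denotes the fraction of nodes of the cutoff degree that is kept, and that the degree distribution is supplied as a finite list; the function name is hypothetical. It removes degree classes from the top until \(\sum _{k} k(k-1)P(k)p(k)/\langle k\rangle\) drops to 1.

```python
def critical_degree_attack(P):
    """Return (k0, alpha, removed_fraction) at criticality for attack by degree.

    P[k] is the fraction of nodes of degree k. alpha is the fraction of
    degree-k0 nodes that is kept (an assumed reading of Eq. (7)).
    """
    mean_k = sum(k * Pk for k, Pk in enumerate(P))
    # F1'(1) before any removal
    excess = sum(k * (k - 1) * Pk for k, Pk in enumerate(P)) / mean_k
    if excess <= 1.0:
        return None, 1.0, 0.0  # no giant component to begin with
    removed = 0.0
    for k0 in range(len(P) - 1, -1, -1):
        contrib = k0 * (k0 - 1) * P[k0] / mean_k
        if contrib > 0 and excess - contrib <= 1.0:
            # remove only part of this degree class so that F1'(1) = 1
            alpha = 1.0 - (excess - 1.0) / contrib
            removed += (1.0 - alpha) * P[k0]
            return k0, alpha, removed
        excess -= contrib
        removed += P[k0]

if __name__ == "__main__":
    print(critical_degree_attack([0, 0, 0, 1]))          # 3-regular graph
    print(critical_degree_attack([0, 0.5, 0, 0, 0, 0.5]))  # degrees 1 and 5
```

For a 3-regular graph the result coincides with random removal (half of the nodes). For the second, made-up distribution the attack needs a fraction 0.35, compared with \(q_{c}=0.7\) for random removal.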

The attack by degree strategy is, however, suboptimal. This can be seen from the fact that for any finite k, the probability that a removed node of degree k does not even belong to the giant component, so that its removal brings no benefit, is \(u^{k}\). Therefore, a finite fraction of the removed nodes are not included in the giant component to begin with, and thus the method is not even asymptotically optimal. Furthermore, for random regular networks, targeted attack by degree is completely equivalent to random removal.

In order to develop a better attack strategy one may consider methods such as adaptive attack by degree or attack by betweenness centrality. However, these strategies are very hard to analyse.

Bounds on optimal attacks

We define \(c_{f}\) as the minimal fraction of nodes that must be removed before the network is shattered (i.e., becomes fragmented into sublinear components). It is clear that for any network \(c_{f}\leq q_{c}=1-p_{c}\), where \(p_{c}\) is the percolation threshold for random removal of nodes. For a d-regular graph (where each node has degree d) this yields the upper bound

$$c_{f} \leq 1-\frac{1}{d-1}=\frac{d-2}{d-1}. $$

However, one can ask the question: Given complete knowledge of the network, and unlimited computational power, what is the smallest fraction of the nodes that can be removed in order to break the network into small pieces, each of size o(n)? One might be tempted to consider the possibility of shattering the network by removing only a vanishing fraction, o(1), of the nodes. Indeed, in the case of a square grid, for example, which is a regular network with fixed degree 4, one can remove \(n^{1/4}\) equally spaced rows and columns, a total of \(O(n^{3/4})=o(n)\) nodes, and shatter the network into pieces of size \(O(\sqrt {n})\).
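The grid example can be checked on a small instance. This is an illustrative sketch only: the function name, the open (non-periodic) boundary, and the choice of which rows and columns to cut are assumptions. It removes every `spacing`-th row and column of a `side` × `side` grid and measures the largest surviving component with a breadth-first search.

```python
from collections import deque

def shatter_grid(side, spacing):
    """Remove every `spacing`-th row and column; return (removed, largest piece)."""
    def cut(i):
        return i % spacing == spacing - 1

    removed = {(r, c) for r in range(side) for c in range(side)
               if cut(r) or cut(c)}
    seen = set(removed)  # removed nodes are never visited
    largest = 0
    for r in range(side):
        for c in range(side):
            if (r, c) in seen:
                continue
            seen.add((r, c))
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < side and 0 <= nx < side and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            largest = max(largest, size)
    return len(removed), largest

if __name__ == "__main__":
    # n = 81 * 81 = 6561 nodes, n**(1/4) = 9 rows and 9 columns removed
    print(shatter_grid(81, 9))
```

With n=6561 the sketch removes 1377 nodes (about 21% of the grid, a fraction that shrinks like \(n^{-1/4}\) as n grows) and leaves pieces of 64 nodes, below \(\sqrt {n}=81\).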

For some cases, such as random regular networks, this can be shown to be impossible. Indeed, improved bounds were obtained in Edwards and Farr (2001) and Edwards and Farr (2008):

$$\frac{d-2}{2d-2} \leq c_{f} \leq \frac{d-2}{d+1}. $$
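To get a feel for how far apart these bounds are, the following short sketch tabulates them for a few degrees using exact rational arithmetic. The function name is illustrative; the three expressions are the ones quoted above.

```python
from fractions import Fraction

def regular_graph_bounds(d):
    """Return (lower, upper, random_removal) bounds on c_f for a d-regular graph."""
    lower = Fraction(d - 2, 2 * d - 2)       # Edwards and Farr lower bound
    upper = Fraction(d - 2, d + 1)           # Edwards and Farr upper bound
    random_removal = Fraction(d - 2, d - 1)  # 1 - p_c for random removal
    return lower, upper, random_removal

if __name__ == "__main__":
    for d in (3, 4, 6, 10, 100):
        lower, upper, rnd = regular_graph_bounds(d)
        print(d, lower, upper, rnd)
```

For d=3 the lower and upper bounds coincide at 1/4, while random removal requires half of the nodes.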

In this paper we establish better lower and upper bounds on cf, focusing on sparse random graphs. In particular for random regular graphs we show

$$ 1-2\frac{\alpha(G)}{n} \leq c_{f} \leq 1-\frac{\alpha(G)}{n}\;, $$
(8)

where α(G) is the independence number of G. When d is large enough the independence number is known (Frieze and Łuczak 1992) to satisfy \(\alpha (G)\approx 2n\ln d/d\). In this case we obtain

$$c_{f} \approx 1-\frac{\alpha(G)}{n}.$$

We provide matching results for Erdős-Rényi random graphs.

Results

For the analytical results, we mainly exploit structural graph properties such as expansion, domination number and independence and obtain deterministic connections to the shattering number. We then apply known estimations of these parameters for random graphs either directly or via contiguity arguments.

As an example we establish a lower bound on the minimal fraction of nodes that must be removed in order to shatter a regular network into small components of size o(n). Consider shattering a graph into m disjoint clusters \(C_{i}\), \(i=1,\ldots ,m\), by deleting a set S of \(|S|=c_{f}n\) nodes; \(c_{f}\) is therefore the fraction of removed nodes. Notice that \(\sum _{i=1}^{m} |C_{i}|+ |S|=n\). Thus, \(\frac {1}{n}\sum _{i} |C_{i}|+c_{f}=1\). Denote by \(B_{i}\) the nodes on the boundary of \(C_{i}\), i.e., the neighbors of nodes of cluster i outside of the cluster. Since the clusters \(C_{i}\) and \(C_{j}\) are disconnected for all \(i\neq j\), all nodes in \(B_{i}\) must be deleted, for every i. Now, in a random regular graph with high probability each cluster is locally almost tree-like. Therefore, up to an additive constant, \(|B_{i}|=(d-1)|C_{i}|-|C_{i}|=(d-2)|C_{i}|\). Since every node has exactly d neighbors, a node cannot participate in more than d of the \(B_{i}\)'s. Thus \(\frac {1}{d}\sum _{i} |B_{i}|\leq |S|\). Therefore,

$$ n\geq \sum_{i} |C_{i}| +\frac{d-2}{d}\sum_{i} |C_{i}| =\frac{2d-2}{d}\sum_{i} |C_{i}|. $$
(9)

Summarizing the above we get:

$$ c_{f}=1-\frac{1}{n}\sum_{i} |C_{i}|\geq \frac{d-2}{2d-2}. $$
(10)
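Spelled out, the chain of inequalities behind Eqs. (9) and (10) combines \(|S|\geq \frac {1}{d}\sum _{i}|B_{i}|\) with \(|B_{i}|\approx (d-2)|C_{i}|\), dropping additive constants:

$$ n=\sum_{i}|C_{i}|+|S|\;\geq\;\sum_{i}|C_{i}|+\frac{1}{d}\sum_{i}|B_{i}|\;\approx\;\frac{2d-2}{d}\sum_{i}|C_{i}| \quad\Longrightarrow\quad \frac{1}{n}\sum_{i}|C_{i}|\leq\frac{d}{2d-2}, $$

$$ c_{f}=1-\frac{1}{n}\sum_{i}|C_{i}|\;\geq\;1-\frac{d}{2d-2}=\frac{d-2}{2d-2}. $$

For d=3 this gives \(c_{f}\geq \frac {1}{4}\).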

This approach can be shown to give an asymptotically tight solution for d=3 (i.e., it matches the upper bound shown in Edwards and Farr (2001)). However, for large values of d it deviates considerably from the exact value. Indeed, for \(d\to \infty\), Eq. (10) leads to \(c_{f}\geq \frac {1}{2}\), whereas, as will be shown below, \(c_{f}\to 1\).

In order to give a better lower bound on \(c_{f}\) for random regular graphs with large constant degree we observe the following: Random regular graphs are locally tree-like, having a bounded number of short cycles. Therefore, after the network is shattered we expect the remaining components to be trees. A tree can be shattered into isolated vertices by removing at most half of its nodes. Therefore, it can be deduced that the number of nodes remaining after the attack is at most twice the size of the largest independent set. Since removing all but an independent set clearly shatters the graph, we obtain Eq. (8). With some further effort we can improve the lower bound to close the gap and match the upper bound. The same approach may be applied to the Erdős-Rényi model G(N,p=c/N) where we get

$$ c_{f}\approx 1-\frac{\log c}{c}. $$
(11)
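The observation used above, that a tree can be shattered into isolated vertices by removing at most half of its nodes, follows from the fact that a tree is bipartite: removing the smaller colour class leaves an independent set. A minimal sketch, assuming the tree is given as an adjacency dictionary (the function name is illustrative):

```python
from collections import deque

def smaller_colour_class(adj):
    """2-colour a tree (dict node -> neighbours) and return the smaller class."""
    colour = {}
    for root in adj:
        if root in colour:
            continue
        colour[root] = 0
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in colour:
                    colour[w] = 1 - colour[v]  # neighbours get opposite colours
                    queue.append(w)
    class0 = [v for v in adj if colour[v] == 0]
    class1 = [v for v in adj if colour[v] == 1]
    return min(class0, class1, key=len)

if __name__ == "__main__":
    # a path on 7 nodes: 0 - 1 - 2 - 3 - 4 - 5 - 6
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
    print(smaller_colour_class(path))
```

On the 7-node path, the three odd-numbered nodes are removed and the four remaining nodes are isolated.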

We discuss theoretical and practical implications of our results in “Summary” section.

Shattering scale free networks

Targeted attack strategies usually begin by attacking the hubs of scale free networks. This is based on the idea that the hubs are the glue holding the network together, due to their high degree and large number of neighbours. Indeed, one may use this intuition to obtain bounds on the hardness of shattering a scale free network.

Consider a scale free network with degree distribution

$$ P(k)=\frac {k^{-\gamma}}{\zeta(\gamma)}\;, $$
(12)

where \(\zeta (\gamma)=\sum _{k=1}^{\infty } k^{-\gamma }\). This network can be shattered using the following two stage process:

  1. Remove all nodes above degree d. This contains a fraction of \(\sum _{k=d+1}^{\infty } k^{-\gamma }/\zeta (\gamma)\) of the nodes.

  2. Shatter the remaining network, requiring at most \(\frac {d-2}{d+1}\) of the network.

One can choose d as to minimize the sum of these two terms in order to obtain an upper bound on the size of the required shattering set. In fact, a better bound can be obtained by noticing that after the removal of the hubs, the remaining nodes form a random network with degree distribution

$$ \tilde{P}(k)=\sum_{c=k}^{\infty} \frac {c^{-\gamma}}{\zeta(\gamma)}\binom{c}{k}p^{k} (1-p)^{c-k}\;, $$
(13)

with

$$ p=\sum_{k=1}^{d} \frac{kP(k)}{\langle k\rangle} $$
(14)

denoting the probability of an edge to lead to an undeleted node (i.e., the fraction of edges leading to undeleted nodes). If one can bound the size of the shattering set for a random graph with this degree distribution, a better bound on the overall shattering set size can be obtained. In particular, by showing that the remaining graph is close enough to a random graph, in the sense that with high probability all local neighborhoods are almost trees, one may replace the upper bound \(\frac {d-2}{d+1}\) in Point 2 above by 1−α(G)/n.
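The quantities entering the first stage can be evaluated numerically for a given exponent and cutoff. The sketch below is an illustration under stated assumptions: γ>2 so that \(\langle k\rangle =\zeta (\gamma -1)/\zeta (\gamma)\) is finite, and ζ is approximated by a partial sum plus an Euler-Maclaurin tail correction. The function names are hypothetical.

```python
def zeta(s, terms=100_000):
    """Approximate the Riemann zeta function for s > 1."""
    head = sum(k ** -s for k in range(1, terms + 1))
    # tail sum over k > terms, approximated by an integral plus a half-term correction
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s
    return head + tail

def hub_removal(gamma, d):
    """Return (removed_fraction, p) for removing all nodes of degree above d.

    removed_fraction is the fraction of nodes deleted in stage 1, and p is
    the fraction of edge ends leading to undeleted nodes, as in Eq. (14).
    """
    z = zeta(gamma)
    mean_k = zeta(gamma - 1) / z  # finite only for gamma > 2
    kept = sum(k ** -gamma for k in range(1, d + 1)) / z
    p = sum(k * k ** -gamma for k in range(1, d + 1)) / z / mean_k
    return 1.0 - kept, p

if __name__ == "__main__":
    for d in (2, 3, 5, 10):
        print(d, hub_removal(3.0, d))
```

For γ=3 and d=2, for instance, about 6.4% of the nodes are removed while roughly 76% of the edge ends still lead to surviving nodes.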

Algorithmic aspects

Finding an optimal shattering set is NP-hard, but when the input is a random graph the problem becomes tractable. We propose the following algorithm for finding a shattering set in a graph. Below we describe the algorithm and demonstrate its asymptotic optimality for random cubic graphs. Let \(G\in G_{n,3}\) be a random cubic graph. Consider the following algorithm:

Algorithm Shatter, Phase I

  1. Input: a graph G and threshold t

  2. Find a Hamilton cycle H in G

  3. Start from an arbitrary node \(v_{0}\) and advance along H creating a segment.

  4. When visiting a node v, it is incident with two edges on the cycle and a third edge e. If e is the second edge in the segment going backward (in H), delete v.

Each edge that is not on H is seen once going forward and once going backward, and a cubic graph on n nodes has n/2 such edges. Since we delete a node for half of the edges seen going backward, we remove 1/4 of the nodes. This is optimal, as can be seen from Eq. (10). A demonstration of Phase I is given in Fig. 1.

Fig. 1. Phase I of Algorithm Shatter
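Phase I can be sketched in a few lines once a Hamilton cycle is available. The code below is one possible reading of the steps above, not the authors' implementation: it assumes the cycle is given as a list of nodes, that `chord[v]` is the third (non-cycle) neighbour of v, and that an edge is "going backward" at v when its other endpoint was visited earlier. The helper `random_chords` builds a hypothetical test input from a random perfect matching, which may produce a multigraph.

```python
import random

def phase_one(cycle, chord):
    """Walk along the Hamilton cycle and delete every second backward chord end.

    cycle: list of nodes in Hamilton-cycle order, starting at v0.
    chord: dict mapping each node to its neighbour off the cycle.
    """
    position = {v: i for i, v in enumerate(cycle)}
    deleted = []
    backward_in_segment = 0
    for v in cycle:
        if position[chord[v]] < position[v]:  # chord leads to a visited node
            backward_in_segment += 1
            if backward_in_segment == 2:
                deleted.append(v)             # v is removed; a new segment starts
                backward_in_segment = 0
    return deleted

def random_chords(n, seed=0):
    """Hypothetical input: cycle 0..n-1 plus a random perfect matching as chords."""
    nodes = list(range(n))
    random.Random(seed).shuffle(nodes)
    chord = {}
    for a, b in zip(nodes[0::2], nodes[1::2]):
        chord[a], chord[b] = b, a
    return list(range(n)), chord

if __name__ == "__main__":
    cycle, chord = random_chords(1000)
    print(len(phase_one(cycle, chord)) / len(cycle))
```

Every chord is backward at exactly one of its endpoints, so for n divisible by 4 this reading of the rule deletes exactly n/4 nodes regardless of the matching.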

When Phase I is complete we are left with a tree T of segments of H. These segments are with high probability of length o(n) and are unicyclic. Shattering T can be easily done by removing a center vertex of the tree, leaving at least two subtrees with at most half of the original number of segments in each. We continue in this manner until the graph is shattered, that is until every connected component is of size smaller than a predefined threshold. Summarizing we get:

Algorithm Shatter, Phase II

  1. Let T be the tree of segments remaining after Phase I

  2. Until the maximal size of a connected component is smaller than t

    (a) Find a center v in the largest connected component

    (b) Remove v

The vertices removed in Phase I and Phase II together form the shattering set.
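Phase II can be sketched as repeated removal of a central vertex. The sketch below makes two assumptions that go beyond the text: it works on individual vertices of a forest rather than on segments, and it interprets "center" as a centroid, i.e., a vertex whose removal leaves no piece larger than half of the component. Function names are illustrative, and components are recomputed in every round for simplicity, so this version does not run in linear time.

```python
def components(adj, alive):
    """Connected components of the subgraph induced by the set `alive`."""
    seen, comps = set(), []
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w in alive and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def centroid(adj, comp):
    """Vertex of the tree `comp` minimizing the largest piece left after removal."""
    root = next(iter(comp))
    parent, order = {root: None}, [root]
    for v in order:  # breadth-first order; the list grows during iteration
        for w in adj[v]:
            if w in comp and w not in parent:
                parent[w] = v
                order.append(w)
    size = {v: 1 for v in order}
    heaviest = {v: 0 for v in order}  # largest child subtree of each vertex
    for v in reversed(order):
        if parent[v] is not None:
            size[parent[v]] += size[v]
            heaviest[parent[v]] = max(heaviest[parent[v]], size[v])
    total = len(order)
    return min(order, key=lambda v: max(total - size[v], heaviest[v]))

def shatter_tree(adj, t):
    """Remove centroids until every component has fewer than t vertices."""
    alive, removed = set(adj), []
    while alive:
        largest = max(components(adj, alive), key=len)
        if len(largest) < t:
            break
        v = centroid(adj, largest)
        alive.discard(v)
        removed.append(v)
    return removed

if __name__ == "__main__":
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 15] for i in range(15)}
    print(sorted(shatter_tree(path, 4)))
```

On a 15-node path with t=4, the sketch removes nodes 7, 3 and 11, leaving four pieces of three nodes each.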

The two phases are demonstrated in the following figures:

Random cubic graphs are known to be Hamiltonian (Robinson and Wormald 1992) and using a variant of ideas from the proofs from Frieze et al. (1996) we could get an algorithm finding such a cycle with high probability in time \(O(n^{7/2})\). The rest of the algorithm (in both phases) runs in linear time. Preliminary results using a Branch and Bound approach suggest we may be able to reduce the running time of the Hamilton cycle finding module.

Notice that unlike removal by degree or by betweenness, which are local algorithms, the first phase of Shatter finds a global structure in the network. This is our main insight: we take advantage of the network randomness in order to find a global structure, then we use this structure to achieve high performance. In particular we get an asymptotically optimal constant for random cubic graphs (Fig. 2). While, as stated below, we cannot guarantee optimality in all cases, we still believe this approach is preferable to local strategies, either passive or adaptive. This belief stems from the global nature of the problem, for which it is natural to suggest a global solution, and our attempt to optimize over a set rather than greedily point by point. Indeed, even when considering shattering to components of size one (i.e., finding an independent set), greedy algorithms are known to deliver poor results (in terms of approximation factor) when applied to random graphs (Grimmett and McDiarmid 1975).

Fig. 2. Comparison of the performance of Algorithm Shatter vs Random Removal (which, in this case, is also equivalent to Attack by Degree), for a random 3-regular graph with \(10^{5}\) nodes

For d-regular graphs with d>3 the same algorithm can be applied: tour along the Hamiltonian cycle, and remove the node at which a second edge going backward is observed. However, finding a Hamiltonian cycle in a d-regular graph with d>3 induces non-trivial correlations between the edges, and thus the performance of the algorithm is hard to evaluate. It is expected, however, that this algorithm will only retain an O(1/d) fraction of the nodes, which is suboptimal.

For Erdős-Rényi graphs G(n,p) one may use a similar algorithm, with the following performance improvements:

  1. Every node of degree two can be replaced with an edge connecting its two neighbours. This is true, since long chains of length O(n) occur with negligible probability in G(n,p).

  2. Once the above step is performed, one can consider only the giant 3-core of the network, and perform Algorithm Shatter on it.
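The second step above relies on extracting a 3-core, which can be done by repeatedly peeling off nodes of degree below 3. The following sketch covers only this peeling step. The contraction of degree-two nodes from the first step is omitted, and the function name and the adjacency-dictionary input format are assumptions.

```python
def k_core(adj, k=3):
    """Return the node set of the k-core of a graph given as dict node -> set."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already peeled via another neighbour
        alive.discard(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < k:
                    stack.append(w)
    return alive

if __name__ == "__main__":
    # a complete graph on {0, 1, 2, 3} with a pendant path 0 - 4 - 5
    graph = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2},
             4: {0, 5}, 5: {4}}
    print(sorted(k_core(graph)))
```

In the example the pendant path is peeled away and only the complete graph on four nodes remains. The result may consist of several components, so the giant one still has to be selected.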

Summary

We provide bounds on the performance of optimal attack strategies on random networks, show that local strategies fail to achieve optimality even if used in an adaptive manner, and demonstrate an algorithm using global structure with optimal performance in certain situations. Our work draws the limits of feasibility for this well studied problem and shows that in some cases these limits are practically achievable. Besides their immediate value, our results may have broader implications. First, Algorithm Shatter is applicable whenever a long path or cycle may be found efficiently, e.g., when the network is based on a topological structure. Moreover, we see these results as evidence for the “global solution to global problem” approach, and hope they will help in promoting this idea.

Availability of data and material

Not applicable.

Acknowledgements

Not applicable.

Funding

MK was partially supported by USA-Israel BSF grant 2014361, and by ISF grant 1261/17. This work was supported by the BIU Center for Research in Applied Cryptography and Cyber Security in conjunction with the Israel National Directorate in the Prime Minister’s office.

Author information

Contributions

All authors conducted the research. RC and SH wrote the manuscript. AH designed, coded and ran the simulations. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Simi Haber.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Balashov, N., Cohen, R., Haber, A. et al. Optimal shattering of complex networks. Appl Netw Sci 4, 99 (2019). https://doi.org/10.1007/s41109-019-0205-5

  • DOI: https://doi.org/10.1007/s41109-019-0205-5
