Optimal shattering of complex networks

We consider optimal attacks or immunization schemes on different models of random graphs. We derive bounds for the minimum number of nodes needed to be removed from a network such that all remaining components are fragments of negligible size. We obtain bounds for different regimes of random regular graphs, Erdős-Rényi random graphs, and scale-free networks, some of which are tight. We show that the performance of attacks by degree is bounded away from optimality. Finally, we present a polynomial time attack algorithm and prove its optimal performance in certain cases.


Introduction
One of the most studied questions in complex networks is the resilience of networks under different failure models and attack strategies [1,2,3]. In particular, one wishes to know the optimal attack strategy that will lead to fragmentation by removal of a minimal fraction of the nodes. This information is important for estimating the vulnerability of network infrastructures, and also for devising optimal immunization strategies for populations and computer networks.
The main methods that have been proposed for targeted attacks on networks via node removal have been based on attack by highest degree [1,3,4] and attack by highest betweenness centrality [5]. Some methods based on more advanced algorithms for graph partitioning have also been proposed [6], and have led to improved upper bounds on the minimal fraction of nodes that should be removed to shatter a network.
In recent years, several works studied optimal attacks on networks, presenting sophisticated, highly efficient algorithms for choosing minimal sets of nodes whose removal leads to complete fragmentation of the network. In [7] an efficient dismantling method is presented, using the replica method on the generating function. In [8] efficient dismantling is considered when costs are attributed to the different nodes. In [9] finding sets of influential nodes is considered using the cavity and Extremal Optimization (EO) methods. In [10] belief propagation is used to find efficient shattering sets. In [11] optimal attacks on multiplex networks are studied using simulated annealing methods.
As for measuring the resilience of a network to shocks, there are again quite a few papers. The following is far from being a comprehensive survey. In [12] a shock is injected on one node and the resilience of the (weighted) network is measured by its ability to absorb this shock. In [13] network robustness is evaluated in terms of the ability to identify the attack prior to network disruption. The effect of node removal on the diameter of the network, a natural parameter, was studied in [14]. Both of these studies regard attack by degree as representative of an intentional attack. An important breakthrough was made in [15], where the dynamics of the system was accounted for, in addition to network topology. One may ask about the resilience of the community structure in a network, and indeed this approach was taken in [16], and tested against link removal. For a review of definitions and measures of system resilience we refer the reader to [17].
In this work, we present results for several network classes, giving upper and lower bounds on the size of the minimal set of nodes to be removed in order to shatter (or dismantle) a network. We first survey the method of generating functions and results on random attacks and attack by degree, in order to show that random attack and attack by degree are not asymptotically optimal for any class of generalized (configuration model) random graphs. We then present exact results and bounds on the size of the shattering set for various random graph classes. Finally, we present a polynomial time algorithm for efficient shattering of random graphs. We show, using exact methods, that for 3-regular graphs our algorithm obtains asymptotically optimal results. The performance of this algorithm for other classes of random graphs remains an open question.

Random Attacks
The problem of removing nodes in order to shatter the network into small pieces is of importance both in order to determine the resilience of a network to various attacks and in order to immunize a network (for example, a social network) against the spreading of an epidemic disease. By "shattering" a network, we mean breaking the network into small components, each of size $o(n)$, where $n$ is the number of nodes. That is, breaking the network, by means of node removal, into pieces whose sizes are negligible compared to the original network.
The simplest attack mechanism on a network is removing nodes uniformly at random. This is the standard percolation model. In order to study the effect of this attack on the network, one can employ the generating function method (see, e.g., [3]). We consider the configuration model, where no correlations exist between neighbouring nodes. Given a degree distribution $P(k)$ and a probability $p(k)$ for a node of degree $k$ to exist (i.e., not to be deleted), one can write the generating function for this distribution as
$$F_0(x) = \sum_{k} P(k)\, p(k)\, x^{k}.$$
Similarly, one can write the generating function for the excess degree distribution of a node reached by following an edge (i.e., the degree of the node disregarding the edge through which it was reached):
$$F_1(x) = \frac{1}{\langle k \rangle} \sum_{k} k\, P(k)\, p(k)\, x^{k-1},$$
where $\langle k \rangle$ is the average degree. The generating function for branch sizes reached by following a random edge is given by the recursive equation
$$H_1(x) = 1 - F_1(1) + x\, F_1(H_1(x)),$$
where $F_1(1)$ is the probability that a node reached by following a random edge exists, and the generating function for the probability of a node to belong to a component of some finite size is given by $H_0(x) = 1 - F_0(1) + x\, F_0(H_1(x))$. It may happen that $H_0(1) = 1$, in which case all components are finite and no giant component exists, or that $H_0(1) < 1$, in which case a giant component exists and contains a fraction
$$S = 1 - H_0(1) = F_0(1) - F_0(u)$$
of the nodes, where $u = H_1(1)$ is the solution of the self-consistent equation
$$u = 1 - F_1(1) + F_1(u). \qquad (5)$$
One can observe that a solution $u < 1$ exists only if $F_1'(1) > 1$. For random attacks, $p(k) = p := 1 - q$ is independent of $k$, and using Eq. (5) one can deduce that the criterion for the existence of a giant component is that there exists a solution $u < 1$, which is the case only if
$$p > p_c = \frac{1}{\kappa - 1},$$
where $\kappa := \langle k^2 \rangle / \langle k \rangle$ is the ratio of the first two moments of the distribution [2]. Specifically, using the moments of the constant and Poisson distributions, respectively, one can deduce that the percolation thresholds for the random $d$-regular and Erdős-Rényi networks are $1/(d-1)$ and $1/\langle k \rangle$. Another result that can be deduced using this formalism is that the probability of a (nondeleted) node of degree $k$ to belong to a finite component is the probability that all of the branches emanating from it are finite, i.e., the probability is $u^k$.
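The percolation criterion and the giant component size can be checked numerically. The following is a minimal sketch, assuming a uniform survival probability $p$ and a degree distribution given as a finite list; the function names (`poisson_pmf`, `critical_p`, `giant_fraction`) are illustrative and not taken from the paper, and the self-consistent equation for $u$ is solved by plain fixed-point iteration, which converges slowly very close to the threshold.

```python
import math

def poisson_pmf(mean, kmax=60):
    """Truncated Poisson degree distribution P[0..kmax] (illustrative helper)."""
    return [math.exp(-mean) * mean**k / math.factorial(k) for k in range(kmax + 1)]

def critical_p(P):
    """Random-removal threshold p_c = 1 / (kappa - 1), kappa = <k^2>/<k>."""
    m1 = sum(k * pk for k, pk in enumerate(P))
    m2 = sum(k * k * pk for k, pk in enumerate(P))
    return 1.0 / (m2 / m1 - 1.0)

def giant_fraction(P, p, tol=1e-12, max_iter=100000):
    """Fraction of all nodes in the giant component when each node survives
    independently with probability p (configuration model assumption)."""
    mean = sum(k * pk for k, pk in enumerate(P))

    def F0(x):
        return p * sum(pk * x**k for k, pk in enumerate(P))

    def F1(x):
        return p * sum(k * pk * x**(k - 1) for k, pk in enumerate(P) if k > 0) / mean

    # Iterate u -> 1 - F1(1) + F1(u) from u = 0; the map is increasing,
    # so the iteration approaches the smallest fixed point in [0, 1].
    u = 0.0
    for _ in range(max_iter):
        u_next = 1.0 - F1(1.0) + F1(u)
        if abs(u_next - u) < tol:
            u = u_next
            break
        u = u_next
    return F0(1.0) - F0(u)

if __name__ == "__main__":
    P = poisson_pmf(4.0)
    print("p_c for Poisson, mean degree 4:", critical_p(P))
    print("giant fraction at p = 0.5:", giant_fraction(P, 0.5))
```

For a Poisson distribution with mean degree 4 the threshold evaluates to $1/\langle k \rangle = 0.25$, and for the 3-regular case (`P = [0, 0, 0, 1]`) to $1/(d-1) = 0.5$, in agreement with the thresholds quoted above.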

Targeted Attack Strategies
Naturally, random attacks are not expected to give optimal results. Indeed, a simple and more effective attack strategy is to start the removal with the high-degree nodes, as they play a more substantial role in the network connectivity. This attack can be analysed with the same formalism, using a survival function of the form
$$p(k) = \begin{cases} 1, & k < k_0, \\ \alpha, & k = k_0, \\ 0, & k > k_0, \end{cases}$$
for some $k_0$ and $\alpha$. Solving for the values of $k_0$ and $\alpha$ that give criticality, one can find the critical fraction for removal. The attack by degree strategy is, however, suboptimal. This can be seen from the fact that for any finite $k$, the probability that a removed node of degree $k$ does not even belong to the giant component, so that its removal is of no benefit, is $u^k$. Therefore, a finite fraction of the removed nodes are not included in the giant component to begin with, and thus the method is not even asymptotically optimal. Furthermore, for random regular networks, targeted attack by degree is completely equivalent to random removal.
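The critical fraction for attack by degree can be computed directly from the criticality condition $F_1'(1) = 1$. The sketch below assumes a cutoff-type survival probability (all nodes of degree below $k_0$ survive, a fraction $\alpha$ of the degree-$k_0$ nodes survive, and all higher degrees are removed); this form, and the names `degree_attack_threshold` and `poisson_pmf`, are assumptions made for illustration rather than the paper's own code.

```python
import math

def poisson_pmf(mean, kmax=60):
    """Truncated Poisson degree distribution P[0..kmax] (illustrative helper)."""
    return [math.exp(-mean) * mean**k / math.factorial(k) for k in range(kmax + 1)]

def degree_attack_threshold(P):
    """Return (k0, alpha, removed_fraction) at criticality for attack by degree.

    Degrees are added from the lowest upwards until the branching factor
    sum_k k (k - 1) P(k) p(k) / <k> reaches 1; the last degree class is
    kept only partially (fraction alpha).
    """
    mean = sum(k * pk for k, pk in enumerate(P))
    branching = 0.0   # contribution of fully kept degree classes
    kept = 0.0        # fraction of nodes in fully kept degree classes
    for k, pk in enumerate(P):
        term = k * (k - 1) * pk / mean
        if term > 0 and branching + term >= 1.0:
            alpha = (1.0 - branching) / term
            return k, alpha, 1.0 - (kept + alpha * pk)
        branching += term
        kept += pk
    # The intact network is already subcritical: nothing needs to be removed.
    return None, 1.0, 0.0

if __name__ == "__main__":
    print(degree_attack_threshold(poisson_pmf(4.0)))   # Poisson, mean degree 4
    print(degree_attack_threshold([0, 0, 0, 1.0]))     # 3-regular
```

For the 3-regular input the sketch returns a removed fraction of $1/2 = 1 - 1/(d-1)$, the same as for random removal, which illustrates the equivalence of the two attacks on regular graphs.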
In order to develop a better attack strategy one may consider methods such as adaptive attack by degree or attack by betweenness centrality. However, these strategies are very hard to analyse.

Bounds on optimal attacks
We define $c_f$ as the minimal fraction of nodes that are to be removed before the network is shattered (i.e., becomes fragmented into sublinear components). It is clear that for any network $c_f \le q_c = 1 - p_c$, where $p_c$ is the percolation threshold for random removal of nodes. For a $d$-regular graph (where each node has degree $d$) this yields the upper bound
$$c_f \le 1 - \frac{1}{d-1} = \frac{d-2}{d-1}.$$
However, one can ask the question: given complete knowledge of the network, and unlimited computational power, what is the smallest fraction of the nodes that can be removed in order to break the network into small pieces, each of size $o(n)$? One might be tempted to consider the possibility of shattering the network by removing only a vanishing fraction, $o(1)$, of the nodes. Indeed, in the case of a square grid, for example, which is a regular network with fixed degree 4, one can remove $n^{1/4}$ equally spaced rows and columns, i.e. $O(n^{3/4}) = o(n)$ nodes, and shatter the network into pieces of size $O(\sqrt{n})$. For some cases, such as random regular networks, this can be shown to be impossible. Improved bounds were obtained in [18,19], including the upper bound $c_f \le (d-2)/(d+1)$. In this paper we establish better lower and upper bounds on $c_f$, focusing on sparse random graphs. In particular, for random regular graphs we show
$$1 - \frac{2\alpha(G)}{n} \le c_f \le 1 - \frac{\alpha(G)}{n}, \qquad (8)$$
where $\alpha(G)$ is the independence number of $G$. When $d$ is large enough the independence number is known [20] to satisfy $\alpha(G) \approx 2n \ln d / d$. In this case we obtain
$$1 - \frac{4 \ln d}{d} \lesssim c_f \lesssim 1 - \frac{2 \ln d}{d}.$$
We provide matching results for Erdős-Rényi random graphs.

Results
For the analytical results, we mainly exploit structural graph properties such as expansion, domination number and independence and obtain deterministic connections to the shattering number. We then apply known estimations of these parameters for random graphs either directly or via contiguity arguments.
As an example we establish a lower bound on the minimal fraction of nodes needed to be removed in order to shatter a regular network into small components of size $o(n)$. Consider shattering a graph into $m$ disjoint clusters $C_i$, $i = 1, \ldots, m$, by deleting a set $S$ of $|S| = c_f n$ nodes; $c_f$ is therefore the fraction of removed nodes. Notice that $\sum_{i=1}^{m} |C_i| + |S| = n$. Thus, $\frac{1}{n} \sum_i |C_i| + c_f = 1$. Denote by $B_i$ the nodes on the boundary of $C_i$, i.e., the neighbors of nodes of cluster $i$ that lie outside of the cluster. Since distinct clusters are disconnected from one another, all boundary nodes of every cluster must be removed; that is, for every $i$, all nodes in $B_i$ are deleted. Now, in a random regular graph, with high probability each cluster is locally almost tree-like. Therefore, up to an additive constant, $|B_i| = (d-2)|C_i|$. Since every node has exactly $d$ neighbors, a node cannot participate in more than $d$ of the $B_i$s, so that $\sum_i |B_i| \le d|S|$.
Summarizing the above we get $(d-2)(1 - c_f)\, n = \sum_i |B_i| \le d\, c_f\, n$, and hence
$$c_f \ge \frac{d-2}{2(d-1)}. \qquad (10)$$
This approach can be shown to give an asymptotically tight solution for $d = 3$ (i.e., it matches the upper bound shown in [18]). However, for large values of $d$ it deviates considerably from the exact value. Indeed, for $d \to \infty$, Eq. (10) leads to $c_f \ge 1/2$, whereas, as will be shown below, $c_f \to 1$.
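To see how the bounds for $d$-regular graphs compare, one can tabulate them for a few degrees. This is a small sketch under stated assumptions: the lower bound $(d-2)/(2(d-1))$ is the one that follows from the boundary-counting argument, the upper bound $(d-2)/(d+1)$ is the bound for graphs of maximum degree $d$ used elsewhere in the text, and $(d-2)/(d-1)$ is the random-removal bound; the function names are hypothetical.

```python
from fractions import Fraction

def random_removal_upper(d):
    """Upper bound from random removal: 1 - 1/(d - 1)."""
    return Fraction(d - 2, d - 1)

def max_degree_upper(d):
    """Upper bound (d - 2)/(d + 1) for graphs of maximum degree d."""
    return Fraction(d - 2, d + 1)

def boundary_lower(d):
    """Lower bound (d - 2)/(2 (d - 1)) from the boundary-counting argument."""
    return Fraction(d - 2, 2 * (d - 1))

if __name__ == "__main__":
    print("d  lower  upper(max degree)  upper(random)")
    for d in (3, 4, 10, 100):
        print(d,
              float(boundary_lower(d)),
              float(max_degree_upper(d)),
              float(random_removal_upper(d)))
```

For $d = 3$ the lower bound and the maximum-degree upper bound coincide at $1/4$, while for large $d$ the lower bound saturates at $1/2$ and the gap to the upper bounds widens.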
In order to give a better lower bound on $c_f$ for random regular graphs with large constant degree, we observe the following: random regular graphs are locally tree-like, having a bounded number of short cycles. Therefore, after the network is shattered we expect the remaining components to be trees. A tree can be shattered into isolated vertices by removing at most half of its nodes. Therefore, it can be deduced that the number of nodes remaining after the attack is at most twice the size of the largest independent set. Since removing all but an independent set clearly shatters the graph, we obtain Equation (8). With some further effort we can improve the lower bound to close the gap and match the upper bound. The same approach may be applied to the Erdős-Rényi model $G(N, p = c/N)$, where we obtain matching results. We discuss theoretical and practical implications of our results in Section 8.

Shattering Scale Free Networks
Targeted attack strategies usually begin by attacking the hubs of scale-free networks. This is based on the idea that the hubs are the glue holding the network together, due to their high degree and large number of neighbours. Indeed, one may use this intuition to obtain bounds on the hardness of shattering a scale-free network. Consider a scale-free network with degree distribution $P(k) = k^{-\gamma} / \zeta(\gamma)$, where $\zeta(\gamma) = \sum_{k=1}^{\infty} k^{-\gamma}$. This network can be shattered using the following two-stage process:
1. Remove all nodes above degree $d$. This contains a fraction of $\sum_{k=d+1}^{\infty} k^{-\gamma} / \zeta(\gamma)$ of the nodes.
2. Shatter the remaining network, requiring at most $(d-2)/(d+1)$ of the network.
One can choose $d$ so as to minimize the sum of these two terms in order to obtain an upper bound on the size of the required shattering set. In fact, a better bound can be obtained by noticing that after the removal of the hubs, the remaining nodes form a random network with degree distribution
$$\tilde{P}(k) = \frac{1}{\sum_{k'=1}^{d} P(k')} \sum_{k'=k}^{d} P(k') \binom{k'}{k} \tilde{p}^{\,k} (1 - \tilde{p})^{k'-k}, \qquad \tilde{p} = \frac{\sum_{k=1}^{d} k P(k)}{\sum_{k=1}^{\infty} k P(k)},$$
with $\tilde{p}$ denoting the probability of an edge to lead to an undeleted node (i.e. the fraction of edges leading to undeleted nodes). If one can bound the size of the shattering set for a random graph with this degree distribution, a better bound on the overall shattering set size can be obtained. In particular, by showing that the remaining graph is close enough to a random graph, in the sense of having all local neighborhoods as almost trees with high probability, one may replace the upper bound $(d-2)/(d+1)$ in Point 2 above by $1 - \alpha(G)/n$.
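The simple two-stage bound (hub removal followed by the $(d-2)/(d+1)$ bound) is easy to evaluate numerically. The following sketch assumes $P(k) = k^{-\gamma}/\zeta(\gamma)$ with $\gamma > 1$, adds the two terms exactly as described in the two-stage process, and restricts the cutoff to $d \ge 2$ so that the second term is non-negative; the zeta function is approximated by a partial sum with an integral tail correction, and all function names are illustrative.

```python
def zeta(gamma, terms=100_000):
    """Approximate Riemann zeta for gamma > 1: partial sum plus an
    integral estimate of the remaining tail."""
    partial = sum(k ** -gamma for k in range(1, terms + 1))
    return partial + (terms + 0.5) ** (1.0 - gamma) / (gamma - 1.0)

def hub_fraction(gamma, d, z):
    """Fraction of nodes with degree above d for P(k) = k^-gamma / zeta(gamma)."""
    head = sum(k ** -gamma for k in range(1, d + 1))
    return (z - head) / z

def two_stage_bound(gamma, d_max=50):
    """Minimise hub_fraction(d) + (d - 2)/(d + 1) over cutoffs 2 <= d <= d_max."""
    z = zeta(gamma)
    candidates = {
        d: hub_fraction(gamma, d, z) + (d - 2) / (d + 1)
        for d in range(2, d_max + 1)
    }
    best_d = min(candidates, key=candidates.get)
    return best_d, candidates[best_d]

if __name__ == "__main__":
    for gamma in (2.2, 2.5, 3.0):
        print(gamma, two_stage_bound(gamma))
```

The search range `d_max` is an arbitrary choice for the sketch; since the second term grows towards 1 with $d$, large cutoffs are never competitive for this particular bound.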

Algorithmic aspects
Finding an optimal shattering set is NP-hard, but when the input is a random graph the problem becomes tractable. We propose the following algorithm for finding a shattering set in a graph. Below we describe the algorithm and demonstrate its asymptotic optimality for random cubic graphs. Let $G \sim G_{n,3}$ be a random cubic graph. Consider the following algorithm:
Algorithm Shatter, Phase I
1. Input: a graph $G$ and a threshold $t$.
2. Find a Hamilton cycle $H$ in $G$.
3. Start from an arbitrary node $v_0$ and advance along $H$, creating a segment.
4. When visiting a node $v$, it is incident with two edges on the cycle and a third edge $e$. If $e$ is the second edge in the segment going backward (in $H$), delete $v$ and start a new segment.
Each non-cycle edge is seen once going forward and once going backward. Since there are $n/2$ such edges and we delete a node for every second edge seen going backward, we remove $n/4$ nodes, i.e. exactly 1/4 of the nodes. This matches the lower bound of Eq. (10) for $d = 3$ and is therefore optimal. A demonstration of Phase I is given in Figure 1.
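Phase I can be sketched in a few lines once a Hamilton cycle is available. The sketch below sidesteps the cycle-finding step by building the test graph as a Hamilton cycle plus a uniformly random perfect matching of chords; this construction is an assumption used as a stand-in for a random cubic graph with a known Hamilton cycle (it may produce a chord parallel to a cycle edge, which is harmless here). The names `random_cubic_with_cycle` and `shatter_phase_one` are hypothetical.

```python
import random

def random_cubic_with_cycle(n, seed=0):
    """Hamilton cycle 0, 1, ..., n-1 plus a random perfect matching of chords.

    Returns (order, chord): the cycle order and a dict mapping each node to
    the other endpoint of its third (non-cycle) edge. n must be even.
    """
    rng = random.Random(seed)
    nodes = list(range(n))
    rng.shuffle(nodes)
    chord = {}
    for a, b in zip(nodes[0::2], nodes[1::2]):
        chord[a], chord[b] = b, a
    return list(range(n)), chord

def shatter_phase_one(order, chord):
    """Walk along the cycle; delete a node when its chord is the second
    backward-going chord met in the current segment, then start a new segment."""
    position = {v: i for i, v in enumerate(order)}
    removed = set()
    backward_seen = 0
    for i, v in enumerate(order):
        if position[chord[v]] < i:      # other endpoint already visited
            backward_seen += 1
            if backward_seen == 2:
                removed.add(v)
                backward_seen = 0       # the next node opens a new segment
    return removed

if __name__ == "__main__":
    n = 4000
    order, chord = random_cubic_with_cycle(n, seed=1)
    removed = shatter_phase_one(order, chord)
    print("removed fraction:", len(removed) / n)
```

A node is deleted only when its chord points backward, so the earlier endpoint of a chord is never deleted; every backward chord therefore leads to a surviving node, and the count of deletions is exactly half the number of chords.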
When Phase I is complete we are left with a tree $T$ of segments of $H$. These segments are with high probability of length $o(n)$ and are unicyclic. Shattering $T$ can be easily done by removing a center vertex of the tree, leaving at least two subtrees with at most half of the original number of segments in each. We continue in this manner until the graph is shattered, that is, until every connected component is of size smaller than a predefined threshold. Summarizing we get:
Algorithm Shatter, Phase II
1. Let $T$ be the tree of segments remaining after Phase I.
2. Until the maximal size of a connected component is smaller than $t$: remove a center vertex of the largest remaining tree.
Random cubic graphs are known to be Hamiltonian [21], and using a variant of ideas from the proofs of [22] we could get an algorithm finding such a cycle with high probability in time $O(n^{7/2})$. The rest of the algorithm (in both phases) runs in linear time. Preliminary results using a Branch and Bound approach suggest we may be able to reduce the running time of the Hamilton cycle finding module.
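The recursive splitting used in Phase II can be illustrated on an ordinary tree. This is a minimal sketch, assuming the input is a forest given as an adjacency dictionary and interpreting "center vertex" as a centroid (a node whose removal leaves pieces of at most half the size); it recomputes components after each removal for clarity, so it is not the linear-time version, and the function names are illustrative.

```python
def components(adj, alive):
    """Connected components of the subgraph induced by the set `alive`."""
    seen, comps = set(), []
    for s in alive:
        if s in seen:
            continue
        seen.add(s)
        stack, comp = [s], []
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w in alive and w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def centroid(adj, comp):
    """Node of a tree component whose removal leaves pieces of size <= len(comp) // 2."""
    members = set(comp)
    root = comp[0]
    parent = {root: None}
    order = [root]
    for v in order:                      # breadth-first order; list grows while iterating
        for w in adj[v]:
            if w in members and w not in parent:
                parent[w] = v
                order.append(w)
    size = {v: 1 for v in comp}
    for v in reversed(order):            # accumulate subtree sizes bottom-up
        if parent[v] is not None:
            size[parent[v]] += size[v]
    total = len(comp)
    for v in order:
        below = [size[w] for w in adj[v] if parent.get(w) == v]
        if max([total - size[v]] + below) <= total // 2:
            return v

def shatter_tree(adj, t):
    """Remove centroids of the largest component until all components are smaller than t."""
    alive, removed = set(adj), []
    while True:
        comps = components(adj, alive)
        if not comps or max(len(c) for c in comps) < t:
            return removed
        c = centroid(adj, max(comps, key=len))
        alive.discard(c)
        removed.append(c)

if __name__ == "__main__":
    n = 64
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
    print(len(shatter_tree(path, 8)), "nodes removed from a 64-node path")
```

Because each removal at least halves the component it acts on, the number of removals needed to reach component size below $t$ is small compared to $n$ whenever $t$ grows with $n$.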
Notice that unlike removal by degree or by betweenness, which are local algorithms, the first phase of Shatter finds a global structure in the network. This is our main insight: we take advantage of the network randomness in order to find a global structure, then we use this structure to achieve high performance. In particular we get an asymptotically optimal constant for random cubic graphs (Fig. 2). While, as stated below, we cannot guarantee optimality in all cases, we still believe this approach is preferable to local strategies, either passive or adaptive. This belief stems from the global nature of the problem, for which it is natural to suggest a global solution, and our attempt to optimize over a set rather than greedily point by point. Indeed, even when considering shattering to components of size one (i.e. finding an independent set), greedy algorithms are known to deliver poor results (in terms of approximation factor) when applied to random graphs [23].
Figure 2: Comparison of the performance of Algorithm Shatter vs Random Removal (which, in this case, is also equivalent to Attack by Degree), for a random 3-regular graph with $10^5$ nodes. The plot shows the relative giant component size for the two shattering strategies.

For $d$-regular graphs with $d > 3$ the same algorithm can be applied: tour along the Hamiltonian cycle, and remove the node at which a second edge going backward is observed. However, finding a Hamiltonian cycle in a $d$-regular graph with $d > 3$ induces non-trivial correlations between the edges, and thus the performance of the algorithm is hard to evaluate. It is expected, however, that this algorithm will only retain an $O(1/k)$ fraction of the nodes, which is suboptimal. For Erdős-Rényi graphs $G(n, p)$ one may use a similar algorithm, with the following performance improvements:
1. Every node of degree two can be replaced with an edge connecting its two neighbours. This is possible since long chains of length $O(n)$ occur with negligible probability in $G(n, p)$.
2. Once the above step is performed, one can consider only the giant 3-core of the network, and perform Algorithm Shatter on it.
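The second preprocessing step for Erdős-Rényi graphs, restricting attention to the 3-core, can be sketched with standard iterative peeling. The sketch assumes a simple undirected graph given as an adjacency dictionary and does not perform the degree-two contraction of the first step; the function name `k_core` is illustrative, and the routine returns the whole 3-core rather than only its giant component.

```python
def k_core(adj, k=3):
    """Return the node set of the k-core: repeatedly peel nodes of degree < k."""
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    alive = set(adj)
    queue = [v for v in adj if degree[v] < k]
    while queue:
        v = queue.pop()
        if v not in alive:
            continue                     # already peeled via another entry
        alive.discard(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < k:
                    queue.append(w)
    return alive

if __name__ == "__main__":
    # A 4-clique on nodes 0..3 with a pendant path 0 - 4 - 5 attached.
    adj = {
        0: [1, 2, 3, 4],
        1: [0, 2, 3],
        2: [0, 1, 3],
        3: [0, 1, 2],
        4: [0, 5],
        5: [4],
    }
    print(sorted(k_core(adj, 3)))
```

In the small example the pendant path is peeled away and only the 4-clique survives, since every clique node keeps degree 3 inside the core.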

Summary
We provide bounds on the performance of optimal attack strategies on random networks, show that local strategies fail to achieve optimality even if used in an adaptive manner, and demonstrate an algorithm that uses global structure and achieves optimal performance in certain situations. Our work draws the limits of feasibility for this well-studied problem and shows that in some cases these limits are practically achievable. Besides their immediate value, our results may have broader implications. First, Algorithm Shatter is applicable whenever a long path or cycle may be found efficiently, e.g. when the network is based on a topological structure. Moreover, we see these results as evidence for the "global solution to global problem" approach, and hope they will help in promoting this idea.