Empirical framework for identification of the most harmful malicious attacks on a smart grid

The aim of this paper is the identification of the most harmful malicious attacks in a smart grid with basis on the removal of buses in a particular sequence. For that, we define the Electrical Most Damaging Element (EMDE) and the Iterated Centrality Measure (ICM). The EMDE is the element that leads to the largest unsatisfied load increase after removed, in the current state of the smart grid. The ICM is a meaningful scaled centrality for iterated attacks. Attack strategies such as the IEMDE (Iterated Electrical Most Damaging Element) and the Iterated Most Central Element (IMCE) are proposed as references for evaluating the impact of failure sequences by comparison. For each fault strategy approach, the vulnerability curves as well as a scalability analysis are presented. It is demonstrated that the IEMDE approximated the N-k-ε\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$N-k-\varepsilon$$\end{document} algorithm, but with reduced computational expense. Furthermore, the IMCE approach provided an efficient fault profile close to the performance of the IEMDE. Although this framework is applied in this paper to failures in buses, it can similarly be applied to other elements. Future research will be focused in applying these concepts to transmission lines.

; such measures can take into account the structure of the power grid, its power flow and impedance electrical properties. The impact of top buses removal from a power system with basis on centrality was studied in , comparing topological centralities with electrical centralities. It was demonstrated that the impact of removing buses according to topological centralities is lower than the impact of doing it with basis on electrical centralities. In addition, when comparing electrical betweenness and closeness, the removal of nodes according to closeness caused a higher impact on the path length, while the removal according to betweenness produced a significant impact on the load supply capacity and the connectivity. The topological structure and the robustness of power grids were studied in Arianos et al. (2009), where the authors introduced the concept of net-ability and generalized the geodesic distance concept for power grids. It was evidenced that the influence of failures in lines according to netability corresponds with the DC power flow calculation of overload. Additionally, research evaluating the impact of intentional attacks on power systems and how to mitigate it has been motivated by the threat of terrorism. Possible malicious attacks can be made through electromagnetic (Dehbaoui et al. 2009), informatics-based (Hawrylak et al. 2012), and physical means (Liu et al. 2013;David 2014); the targets of such attacks may also include different equipment of the power system in the areas of transmission, generation, control, monitoring and communications (National Research Council 2002). The location of the most of transmission and generation equipment represents a risk since they must be outdoors, accessible to malicious attacks (Agarwal et al. 2010;. Although the current progress in the operation and design of power grids allows the improvement of their efficiency and profitability, it also increases their complexity and stress due to the incorporation of modern technologies and energy sources (National Research Council 2002;Kinney et al. 2005). Hence, it is important that the design of modern power systems includes considerations for reducing vulnerability, correctly addressing all the security concerns. Furthermore, the power grid must comply with requisites of adaptability, intelligence and robustness in front of ill-intentioned attacks (NIST 2010). That scenario, will not be achievable without the creation and improvement of tools to model and analyze the strategies for prevention and recovery from possible threats to the power grid (National Research Council 2002).
It has been evidenced that electric power systems are robust under traditional failures, but they may be vulnerable in front of targeted attacks (Salmeron et al. 2004;Duman et al. 2017). A systematic attack on susceptible areas of the electric power system may produce a cascade failure and a possible long-term blackout, if the traditional structure of the power grid is considered (National Research Council 2002). Researchers have addressed the impact of malicious attacks in the power system with different methods. For instance, using optimization to maximize the load shedding in a power grid (Salmeron et al. 2004;Arroyo and Galiana 2005), to identify the groups of elements that can cause a blackout (Chen et al. 2014(Chen et al. , 2012, or to determine the expansion planning under deliberate attacks (Arroyo et al. 2010;Davarikia et al. 2020).
There are diverse approaches applied to determine the effects of ill-intentioned attacks over the power systems, some of them are: examination of historical records (Farrell Page 3 of 21 Albarakati andBikdash Applied Network Science (2022) 7:13 et al. 2004), fault tree analysis (Volkanovski et al. 2009), applications of game-theory to simulate possible attacks (Holmgren et al. 2007;Bompard et al. 2009Bompard et al. , 2008Jian et al. 2013;Piccinelli et al. 2017, Yuan andZeng 2020), and identification of vulnerable elements and areas using complex network theory (Panigrahi 2017;Adebayo et al. 2018). Interdependent structures, such as the cyber network that monitors and controls the power system has been considered for vulnerability analysis as well (Guo et al. 2017;Vellaithurai et al. 2015;Zhang et al. 2019;Meyur 2020). Moreover, simulations that evaluate different defense strategies have been proposed (Rose 2007;Wang 2017;Ouyang et al. 2017). Nevertheless, there is research field is still open for the development of a framework to analyze the impact of attacks and consequent failures in the power system. The research presented in this paper focuses on identifying deliberate attack sequences consisting of buses or transmission lines removals that produce the largest electrical damage measured as the extent of brownout damage. Identifying such sequences will be crucial for suggesting strategies to increase the robustness of the smart grid in front of such attacks.
Particular attention is paid to solutions that are computational efficient for a given state, because an accurate estimation of vulnerability is not independent of the current power flow and generation state of the grid. Efficiency is important because the number of possible and relevant grid states is very large. The contributions are the following: 1 Development of a framework based on the concept of relation between electrical damage and physical damage. This is an important tool for settling some key matters such as: (a) the most harmful attacks; (b) the most predictive centrality measure; (c) the most reliable physical damage measure; and (d) the vulnerability level of a grid compared to other. 2 Introduction and application of various malicious attack algorithms, namely the Iterated Electrical Most Damage Elements (IEMDE) and the Iterated Most Central Elements (IMCE). 3 Efficient identification of the most malicious attacks according to the framework of electrical damage versus physical damage. 4 Identification of the fastest methods for quantification of the most harmful attacks in terms of computational complexity.

Background
In this research, the estimation of unsatisfied load after each fault has been carried out using the simulation tool Matpower (Zimmerman et al. 2015). This is a high credibility package oriented to research and education based on MATLAB (http:// www. mathw orks. com/) for the solution of power flow and optimal power flow problems (with flexible options and different algorithms) among other functions. The normalized ULTotal Unsatisfied LoadUnsatisfied Load (UL) represents the proportion of power demand not met by the available generation. This is similar to energy not-supplied (Hashemi-Dezaki et al. 2015) or load shedding measures (Correa and Yusta 2013). The UL for a particular fault profile, assuming the removal of the first k buses as part of the fault evolution, is calculated as where β denotes the vector of ordered buses of particular fault profile, in MATLAB notation 1 : k is the vector of bus indices from 1 to k, PSSatisfied Power LoadPS(n i ) denotes the satisfied power load of island n i formed after the removal of buses, and PDTotal Power DemandPD is the power demand of the whole system. A total blackout corresponds to an maximum level of unsatisfied load UL(N ; β) = 1, and may result after removing a limited set of elements, for instance removing all generators.
The fault profiles can be classified into two types: natural faults, due for instance to hurricanes, and malicious attacks. There is a variety of such attacks depending on the means (cyber vs. physical), extent geographically (local vs. global), and random versus targets. For random attacks, the elements are removed according to a typically uniform probability distribution. For targeted malicious attacks, it is assumed that the attackers might have information about the power grid such as the topological or electrical structure, electric features and system limitations. It is logical to assume that if the attackers have sufficient information about the power grid, the attacks could have a larger impact, as they could determine critical points of the network.
Attackers might access the information of topological structure through particular companies [e.g., Platts (Platts 2014)], and might estimate electric characteristics of components such as impedance using standard values and typical calculations. However, the tolerances of the system are hardly available and are difficult to be clearly known by attackers (Kinney et al. 2005;Wang and Rong 2009;Wang et al. 2011;Zhu et al. 2014). Therefore, attack strategies can be classified into those having access to the tolerance of the system and not having access to such tolerance. While the unknown system tolerance strategies are based on degree, load, risk of failure and load distribution vector, the known system tolerance strategies are based on percentage of failures and exhaustive search approach (Kinney et al. 2005;Wang and Rong 2009;Wang et al. 2011).
A popular centrality-based attack is the Remove Most Central Element First (RMCE-FRemove Most Central Element FirstRMCEF) fault profile, where the attacker is assumed to have knowledge about a centrality score of the power grid's elements. In the RMCEF attack, centrality scores are computed using one of the standard techniques such as those in Eqs. (2)-(4) based on a a weighted or unweighted adjacency matrix that represents the structure of the grid. The buses are sorted according to their centrality scores from high to low and afterward they are removed according to such order.
There are 3 popular definitions of centrality: where j a ij represents the sum of the weights from the links connected to node i, i j a ij is the sum all the elements of the adjacency matrix A, u j is the j th element of the eigenvector of A corresponding to the largest eigenvalue max , σ jk is the number of shortest paths between nodes j and k, and σ jk (i) is the number of these shortest paths between j and k, passing through node i.
For a selected centrality, we denote with c(i) the centrality calculated for node i, normalized to obtain i c(i) = 1 . In this paper centralities are based in Power Traffic Matrix (PTM) which is the weighted adjacency matrix with weights that represent the active power flow on each link of the power grid. Other measures have been proposed in Cuadra et al. (2015).

Framework
The discussion about the methods to assess the vulnerability of power systems has been extensive in the last decade. Important part of the research has been dedicated to study the physical damage on power grids, by accounting it through different metrics related to their physical structure such as degree clustering coefficient (Albert et al. 2004), average path length (Albert et al. 2004), degree centrality ) and size of attack (Brummitt et al. 2012). On the other hand, several research works have studied the electrical damage, using metrics such as loss of power ( The distinction between physical damage measures and electrical damage measures is essential to settle criteria for vulnerability assessment in power systems. The measures of physical damage intend to characterize the size of the attacks by accounting the elements or the structural connections affected, while the electrical damage measures are related to the effect over the electrical performance of the power grid. The concept of vulnerability in essence attempts to measure the degradation of the performance depending on the size of the attack. For instance, an exceptionally robust grid can withstand severe physical damage presenting very low degradation in electrical performance. In this sense, the proposed framework integrates physical damage and electrical damage measures with the aim of clearly and unambiguously defining: (a) the most harmful attack; (b) the most predictive centrality measure; (c) the most reliable measure of physical damage; and (d) the vulnerability level of a grid compared to other. Figure 1 is presented as an illustration of this approach. To define the curve C 0 for a particular power system, the procedure consists in designing an attack sequence by selecting a measure of physical damage, a fault profile, and a measure of electrical damage. Once the attack sequence is designed, the electrical damage is measured using an simulation. For example, we can select NOE as physical damage measure, UL as electrical damage measure and an attack profile based in RMCEF, then use an empirical simulation of DC power flow or cascading failures to determine the electrical damage measures for the attack profile. In this manner, different power grids, damage measures and fault profiles according to centrality measures can be compared using a vulnerability curve. In Fig. 1, the vulnerability curve C 1 is higher than the baseline curve C 0 , depending on the whole attack design, we can obtain relevant information from these curves: (a) Assuming that C 1 is the a vulnerability curve for grid A, and C 0 is determined for grid B with the same attack design, it can be concluded that the grid B is less vulnerable than the grid A, because a similar physical damage produces a lower electrical damage in the grid of B.
(b) If we consider that C 1 was obtained with a different fault profile than C 0 , it implies that this attack profile is more harmful.
(c) Considering that C 2 or C 3 was obtained like C 0 , but with a different measure of physical damage implies that such measure is potentially more unreliable, because it appears that more intense attacks (according to physical damage) are required to obtain an equivalent electrical damage.
(d) And, if we assume that C 1 in the same way that C 0 , but the electrical damage is measured through a different empirical simulation, then it is evident that such simulation exposes more vulnerabilities in the power grid than the one applied for calculating C 0 .
In general, the increment on the VPM is a quantitative measure of the degradation of robustness of the grid.

Proposed attack profile
Two different types of malicious attack profiles are introduced and described in this section.

Iterated attack based on the most central element (IMCE)
The IMCE is introduced as an attack in which the element with the highest centrality in the current grid is attacked and removed. The main feature is that the centrality score is recalculated after the removal of an element. The idea under this attack profile is that the centrality measures change after the removal of the most central element (MCE), thus the second most central element in the initial ranking may not be the most central once the MCE is removed. Then the vector of centralities must be recomputed to obtain the new ranking of elements according to centrality. For this matter, we define a sequence of grids as Ŵ 0 , Ŵ 1 , Ŵ 2 , . . . , Ŵ N where Ŵ i is the resulting grid from the removal of the MCE in the grids Ŵ i−1 . Also, ζ j (Ŵ i ) is the centrality score of the element j in the grid Ŵ i , and e i represents the index of the MCE of such grid, as follows Furthermore, the value of centrality is denoted as z i , as follows And the subsequent grid, in the corresponding sequence, is defined as It is noted that ζ j (Ŵ i ) is not normalized, thus it is necessary to determine a normalized centrality vector, given by This is called the Iterated Centrality Measure (ICM), considered the most meaningful scaled centrality generated by the IMCE, In the case that a collapse happens at an iteration i = ℓ , or in case that the attack ends at i = ℓ , the centrality of the remaining elements is equal to the so far unassigned centrality. That is, For a better physical interpretation of the proposed centrality Z i , it is wanted that it is a monotonically decreasing sequence. In this sense, it can be demonstrated that the 1-norm of the sequence is equal to one by construction, i.e., �Z� 1 = j Z j = 1 . This means that the sum of centralities is equal to one, and they are almost monotonically decreasing without the need of applying any sorting. Algorithm 1 presents a pseudocode that describes the proposed approach to find the removed elements, the unsatisfied load, and the corresponding ICM. In order to exemplify the IMCE previously described, the IEEE 14-bus test power system is employed. In terms of the degree centrality, the bus 2 is the most central, with a normalized degree score equal to 0.25. That bus is removed, and the modified grid Ŵ 1 that is shown in Fig. 2 is obtained (the black dashed lined represents the links that are removed as consequence of removing bus 2). Then, the degree centrality score is recalculated for such grid, resulting that the most central bus is the bus 5, with a normalized degree score equal to 0.20. Therefore, it corresponds to remove such bus, leading to the new grid Ŵ 2 , as shown in Fig. 2 (the red dashed lines represent the links that are removed when bus 5 is attacked). In Table 1 the initial sequences of buses removed under IMCE using different centrality measures are presented for the IEEE 118-bus system and the IEEE 300-bus system, and it is noted that the sequences are different for different centrality measures.  · · · · · · · · · · · · · · · Page 9 of 21 Albarakati and Bikdash Applied Network Science (2022) 7:13 Furthermore, we suppose for instance, that collapse occurs after removing two buses. Then, and the next Z scores are determined as Therefore, we obtain where it is noted that Z i are not sorted in this equation. Figure 3 shows the logarithm of the ICM for the IEEE test power grids of 118 and 300 buses, where it is observed to be monotonically decreasing. (11)

Iterated attack based on the electrical most damaging element (IEMDE)
The IEMDE is defined in this work as an attack that is based in the removal of the element that generates the largest raise in the UL of the power grid in the current state. The iterated attack generates a sequence of grids in which the electrical most damaging element (EMDE) is removed, such sequence is Ŵ 0 , Ŵ 1 , Ŵ 2 , . . . , Ŵ N , where Ŵ i is the resulting grid after the removal of the EMDE of Ŵ i . That means, and the iterated EMD centrality is In the same way that for the IMCE, we observe that �z� 1 = j z j = 1 . Figure 4 shows how these centrality scores are almost monotonically decreasing for the IEEE 118-bus and 300-bus test power systems. Algorithm 2 illustrates the proposed approach for computing the IEMDE using pseudo-code. Also, as an example of the application of this approach, we employed the IEEE 14-bus test grid. When the Algorithm 2 is applied, we obtain the sequence of EMDE's [3, 4, 5, 1], which are the buses to remove until the grid collapses. The grid is shown in Fig. 5, and the links removed for the IEMDE are denoted with dashed lines.

Fig. 4 Proposed iterated EMD centrality
Page 11 of 21 Albarakati and Bikdash Applied Network Science (2022) 7:13 In addition, the IEMDE malicious fault profile was applied to the IEEE 118-bus and the IEEE 300-bus systems, and the initial sequence of EMDE's are shown in Table 2. It is  · · · · · · · · · Page 12 of 21 Albarakati and Bikdash Applied Network Science (2022) 7:13 noted that the removal of each of these elements leads to an important increment in the UL, and the EMDE ( Ŵ 0 ) produces the highest z. The framework proposed here has been though for attacks on buses, lines, or combination of both. To illustrate this, Table 3 shows the transmission lines that generate the highest damage for the IMDE applied to the IEEE 300-bus system. It is noted that the first EMDE does not present the largest value of z , in fact z 1 = 0.0375 < z 7 = 0.0585, which matches with our intuition. In this sense, it is expected that the grid is initially strong and able to tolerate the failure of one element, but as the grid losses elements it becomes more vulnerable. That explains why, in the sixth stage of the IMDE with the removal of the corresponding EMDE a higher impact is obtained.
The attack profiles IMCE and IEMDE proposed here are somewhat similar to the one considered in Zhu et al. (2014) with an approach on sequential cascading failures. In that work, all possible attack sequences are considered and the Sequential Attack Graph is constructed from them. Nevertheless, one limitation of such proposal is the huge amount of cascading failures to consider, for example, the analysis of the IEEE 30-bus system requires the simulation of about 24,000 cascading failures.

Comparing various attack profiles
In the following, the Vulnerability Prediction Measure (VPM) is revised for the fault profiles with the algorithms described in Section Propsed Attack Profiles. Furthermore, several attack strategies are compared using this approach.

Analyzing the vulnerability curve and VPM of the fault strategies
The vulnerability curves resulting of the IMCE attack applied to the different test power grids and for every centrality measure selected are presented in Fig. 6. It is noted that the curves for eigenvector centrality and degree centrality are very close to each other and both present higher values of unsatisfied load than the betweenness centrality. This agrees with VPM scores shown in Table 4, which are very similar for the eigenvector and · · · · · · · · · Page 13 of 21 Albarakati and Bikdash Applied Network Science (2022) 7:13 degree centrality. This also implies that the IMCE attack under such centralities is more damaging than under the betweenness, in fact the removal of approximately a third of the elements leads to a complete blackout. For the IEMDE attack profile, the vulnerability curves corresponding to the test systems is shown in Fig. 7. These curves are steeper compared to the ones from IMCE attack, indicating the severity of the IEMDE, and this is confirmed with the VPM scores in Table 5, which are higher than the ones in Table 5. The IEMDE profile in the studied power grids leads to a blackout after the removal of less than the 20% of the buses.
The IEMDE, compared to other attacks in this research, is the most severe, except for fault sequences identified by the N − k − ǫ algorithm. Therefore, this attack profile (IEMDE) can be employed as a reference for comparison in order to test the harm intesity of different attack profiles.

Comparison of attack profiles based on the vulnerability curve and the VPM
This section presents a comparison between the attack profiles proposed in this work (IMCE and IEMDE), the Remove Most Central Element First (RMCEF) attack and the worst case random (WCR) attack. Here, the random attacks strategy is implemented through a Monte Carlo simulation consisting in the removal of a permutation of buses on each realization and the measurement of the corresponding unsatisfied load. The VPM is calculated for each random sequence (permutation) after the removal of every bus, and the WCR denotes the scenario with the highest VPM of all the realizations (400 realizations were performed). Several observations are made from the vulnerability curves corresponding to the aforementioned attack strategies (Fig. 8): (a) IEMDE attack produces the steepest vulnerability curve in comparison with RMCEF, WCR and IMCE, (b) IEMDE attack vulnerability curve is smoother than the rest, (c) IMCE curve is close to the IEMDE curve, and (d) for larger grids (IEEE 300-bus), the vulnerability curves are more predictable and have less abrupt variations.
In Table 6 the VPM scores for the different attack strategies simulated are presented. The attack profiles that consider centrality measures are based in PTM centralities and may not hold for other centralities. The results show that the VPM for the IMCE attack with basis on PTM degree centrality is similar to this score with basis on PTM eigenvector centrality for both IEEE systems studied. Therefore, given such similarity, it is preferable to use the degree centrality as it is less complex to calculate.
The results for the PTM degree centrality RMCEF show VPM scores of 0.891 and 0.854 for IEEE-118 and IEEE-300 respectively. These are not far from the VPM scores obtained for the IMCE (0.900 and 0.905 respectively) and for the IEMDE (0.958 and 0.954 respectively). Then, the RMCEF fault profile with degree centrality provides good  understanding about the most damaging elements on the system. Nevertheless, the scores obtained with RMCEF under PTM betweenness and eigenvector are considerably lower and do not bring such information.
In addition, the VPM scores corresponding to the IEMDE attack profile for the IEEE-118 bus and the IEEE-300 systems are comparable. This happens also in the scores for the IMCE attack with PTM degree and the fraction of removed elements (FOE).

Alignment with the IEMDE sequence
It is important to introduce a procedure to compare two attack profiles to determine the most damaging sequence. In this section, we propose the Attack Sequence Misalignment (ASM) measure computation as a method to find the similarity between two  different attack profiles. Considering two sequences of elements removed, the difference in the location of an element in both sequences is determined. Such difference is averaged over the top elements in what is defined here as the primary sequence. The IEMDE sequence is the primary sequence of the most elements. This is needed because of the electrical collapse after the most damaging elements are removed. Figure 9 exemplifies this concept. Note that for IEEE-118 Bus the ASM shows 5.90 for the IEMDE as a primary sequence , and IMCE D as a secondary sequence. The calculation of ASM was performed for attack profiles IMCE and RMCEF with different centralities leading to Table 7. Results show that the most similar sequence compared with the IEMDE is the IMCE with degree centrality followed by the IMCE with eigenvector centrality. While the IMCE and RMCEF fault profiles according to betweenness centrality presented the worst ASM.

Computational cost of attacks
In the subsequent, the time complexity of the proposed attack strategies is discussed. The least complex algorithm from the ones studied in this research is the RMCEF as it is not iterative to determine the elements to remove. The proposed algorithms and  the N − k − ǫ approach are shown in Fig. 10 in order of time complexity, from the lower to the highest. Newton-Raphson power flow calculation is one of the most costly operations performed. Regarding the time complexity of the Newton-Raphson solution for the power flow, it has been demonstrated that for a sufficiently close initial point, it converges with a quadratic rate. The execution time and the convergence of the AC power flow with Newton-Raphson depends on different aspects such as the number of buses of the grid, its structure and the initial values selected for the unknown variables. For a fully connected system with N buses it converges approximately in O (N 3 log N ) . Nevertheless, the solution of AC power flow with Matpower uses the sparsity of power systems to improve such complexity to O(N).
If the time complexity of the proposed fault strategies is evaluated in function of the power flow calculation as an elemental operation, then RMCEF entails only one power flow computation and one centrality computation, thus its complexity is O(N). It is noted also, that IMCE requires the calculation of N times the power flows, and N times the centralities of the system. Furthermore, the IEMDE complexity is O(N 2 /2) as it requires the calculation of power flows N 2 times. And the Monte-Carlo simulation of random failure sequences for n realizations will require the computation of n power flows. Comparing the results of execution time from the proposed attack strategies with the results replicated from Chen et al. (2014) with the application of the N − k − ε strategy, it is noted that the proposed algorithms present lower execution times in terms of scaling. This is shown in Fig. 11, where the N − k − ε algorithm executes in O(N 3 ) approximately, which is more time expensive than the proposed algorithms.

Conclusions
This research presents an empirical framework for identifying and analyzing malicious attacks according to the impact in the power system. Concepts such as the ICM(z), the EMDE, and the ASM, which help to define and compare the most harmful attacks, are introduced. Moreover, the attack strategies IMCE and IEMDE are proposed for evaluating the impact of failure sequences by comparison.
The main contribution of this research was the identification of the malicious attack strategies that are more harmful for a smart grid by the removal of a sequence of its buses. As a result of the comparison of different attack profiles using the IEEE 118 bus and 300 bus test systems and the proposed framework, it is demonstrated that the IEMDE attack is the most harmful attack strategy, in terms of the VPM. This attack strategy presented a higher VPM than the WCR, the RMCEF attack and the IMCE attack sequences, and represents an approximation to the the N − k − ε attack strategy with a lower computational effort.
In addition, it is shown that the IMCE attack strategy with degree and eigenvector PTM centralities are the most similar to the IEMDE fault profile. This means that such attack strategies can be useful for predicting harmful attacks with a reduced computational complexity. Although this approach is applied here to failures in buses, it can be also implemented to failures in different elements of the power grid. Future research will be focused in applying these concepts to transmission lines.