Engineering Structural Robustness in Power Grid Networks Susceptible to Coherent Swing Instability

Networked power grid systems are susceptible to a phenomenon known as Coherent Swing Instability (CSI), in which a subset of machines in the grid loses synchrony with the rest of the network. We develop network-level evaluation metrics to (i) identify community substructures in the power grid network, (ii) determine weak points in the network that are particularly sensitive to CSI, and (iii) produce an engineering approach for adding transmission lines to reduce the incidence of CSI in existing networks, or for designing new power grid networks that are robust to CSI by construction. For simulations on a reduced model of the American Northeast power grid, where a block of buses representing the New England region exhibits a strong propensity for CSI, we show that modifying the network's connectivity structure can markedly improve the grid's resilience to CSI. Our analysis provides a versatile diagnostic tool for evaluating the efficacy of adding lines to a power grid that is known to be prone to CSI. This is a particularly relevant problem in large-scale power systems, where improving stability and robustness to interruptions by increasing overall network connectivity is not feasible due to financial and infrastructural constraints.


I. INTRODUCTION
Disruptions of power grid systems can have a severe, negative impact on performance and lead to Coherent Swing Instability (CSI) [1]-[3], whereby a subset of machines in the grid loses synchrony with the rest of the network, thus shutting the entire network down and leading to unacceptable blackouts. To address network stability, real-time monitoring strategies for the power grid have been developed [4]-[6], with event location strategies gaining increasing attention as a means of localizing pernicious events [7]-[9]. These strategies aim to provide system-wide awareness of events such as faults and other disturbances, taking advantage of the increasing coverage of wide area measurement systems (WAMS) technology, which enables the implementation of wide-area emergency and restorative control applications [7]-[9]. Even though these strategies have achieved positive results, additional technical challenges arise as modern WAMS-generated data become high dimensional and more distributed across large areas of the system. We propose an additional technique for power grid network robustification. Specifically, we show that the robustness of the network can be diagnosed from its connectivity (graph) structure. Moreover, the power grid can be made significantly more robust to disturbances with proper engineering of the network design. Such considerations are critical when planning future power system deployments or upgrading current networks in order to circumvent susceptibility to CSI.
Modern WAMS-generated data are engineered to create situational awareness by combining and analyzing high dimensional and distributed information provided by several regional control systems [4], while understanding the compromise between the reduction of data dimensionality and the communication capacity [5]. These distributed wide-area monitoring approaches offer advantages over their centralized counterparts in terms of financial resources, communication, latency, reliability, and security [10]. Additionally, they allow the use of distributed processing strategies to alleviate the burden and complexity of processing the mass of information in a centralized fashion [10], [11]. Wide-area data fusion architectures that can be applied to hierarchical and distributed systems have started to emerge in order to surmount these challenges. Applications ranging from mode-meter algorithms [12]-[14] to coordinated adaptive control [15], [16] are only some examples of methodologies that have begun to migrate toward configurations more suitable for wide-area monitoring.
Despite the technological advances in understanding WAMS-generated data and its implications, these approaches still do not address the fundamental issue of why the power grid, or parts of the power grid, are so susceptible to CSI in the first place. Specifically, there is significant diversity in the onset of CSI from disturbances of nodes in the power grid network. For this work, we consider the Northeastern Power Coordinating Council (NPCC) system illustrated in Fig. 1, a 48-machine, 140-bus test system of the northeastern United States in which some nodes are highly tolerant to disruptions, while other nodes lead to CSI with the briefest of interruptions and/or disturbances. As we will show in our network simulations, this diversity of unstable responses is due to the network architecture itself, with certain nodes being highly susceptible to CSI under disruption, and others being quite robust to disruption. This suggests that engineering the network connectivity structure is critical for overall robustness of the network. We develop a principled way to not only evaluate the robustness due to each node in the power grid, but also show how the network architecture itself can be made more robust by engineering nodal connections between highly unstable nodes and nodes that are highly robust. Such a strategy can be used not only to design new power grid architectures, but also to inform investment strategies for existing networks so as to make them more robust to instabilities.
The paper is organized as follows. In Sec. II, the model used for simulating the power grid dynamics and disruptions is detailed. Section III characterizes the network communities that result from the architecture of the northeastern power grid of the United States. Special emphasis is placed on detecting the community structures induced in the network. The sensitivity of the network to CSI is exhaustively explored in Sec. IV, with a ranking of the network nodes and their susceptibility to CSI given. Engineering robustness through the network architecture is explored in Sec. V. A brief summary of our findings and an outlook of our method for the engineering of power grid systems is given in Sec. VI.

II. NUMERICAL SIMULATIONS OF POWER GRIDS
In this section, we highlight the numerical simulation architecture used to evaluate power grid systems and their connectivity structure. Importantly, a prescription of the disturbances applied to the network to induce CSI is considered in order to evaluate the network robustness in a principled way.

A. The Power System Toolbox
Power grid simulations for this study were produced using Power System Toolbox (PST), a Matlab software package originally developed by Kwok W. Cheung and Joe Chow of Rensselaer Polytechnic Institute [17]. Supplied with both the topological structure of a power network and the specific electromechanical parameters of the grid's generators, nodes, and lines, PST performs dynamic simulations according to the protocols outlined in what follows.
1) Load Flow Solution: Taking as input the real powers of the system's generators and the active and reactive power of its loads, PST solves the nonlinear algebraic network equations which specify a steady-state solution for power flow through the network. In this state, all bus voltages are constant (up to standard AC potential oscillations). The simulations in this study initialize the system in this steady state before applying a perturbation.
Fig. 2. The system is in a steady state until a fault is applied at 0.1 s. The fault is cleared in two stages, at 0.15 and 0.2 s. After the fault has been cleared, the system has been perturbed from its steady state so it continues to evolve dynamically.
2) Fault Application: The dynamic portion of the simulation is initiated by applying a transient three-phase fault to a single line in the network, as though the line were brought into momentary contact with a grounding object such as a tree. As soon as the system is perturbed from its steady state, PST begins stepping forward in time using a predictor-corrector scheme in which bus states are updated according to generator and excitation system models specified in the network configuration input. Figure 2 illustrates the temporal evolution of the bus states after application of a fault.
3) Fault Clearing: When a fault occurs, power system protection equipment acts to isolate the disturbance. If the fault is transient, the line can be reconnected after a short time. PST treats this as a two-step process, clearing the fault first at the near end and then at the remote end of the line. These two time intervals, labeled in this paper as $\tau_1$ and $\tau_2$, are supplied as input parameters in simulations. Figure 2 illustrates the two time scales, $\tau_1$ and $\tau_2$, for a simulation of the northeastern power grid system. Our interest is in the dynamics following the fault application and fault clearing time scales.

4) System Response: When the fault is cleared, the grid recovers its full original network structure. The fault has perturbed it from its initial steady-state configuration, so dynamic evolution continues. We continue the simulation for a long time (relative to the fault duration) and analyze the network's response to the disturbance. CSI is often induced by a disturbance event (fault application) from which the power grid system does not recover to its original stable (steady-state) behavior.
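The fault timeline described in steps 2) through 4) can be summarized in a short sketch. The function name and event labels below are illustrative assumptions and are not part of PST, which receives these times through its own input format. The example values reproduce the timeline quoted for Fig. 2 (fault at 0.1 s, clearing at 0.15 s and 0.2 s).

```python
def fault_schedule(t_fault, tau1, tau2):
    """Return the (time, event) switching sequence for one transient fault.

    tau1: time from fault application to near-end clearing.
    tau2: time from near-end clearing to remote-end clearing.
    The function and event labels are hypothetical, not PST identifiers.
    """
    if tau1 < 0 or tau2 < 0:
        raise ValueError("fault durations must be non-negative")
    t_near = t_fault + tau1      # near end of the line is cleared
    t_remote = t_near + tau2     # remote end is cleared, line fully restored
    return [
        (0.0, "steady state"),
        (t_fault, "fault applied"),
        (t_near, "near end cleared"),
        (t_remote, "remote end cleared"),
    ]


# Timeline quoted for Fig. 2: tau1 = tau2 = 0.05 s.
schedule = fault_schedule(t_fault=0.1, tau1=0.05, tau2=0.05)
for time, event in schedule:
    print(f"{time:.2f} s: {event}")
```

Expressing the remote-end clearing time as an offset from the near-end clearing time mirrors the way the two durations are defined in the text, so a sweep over fault severity only needs to vary the two offsets.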

B. The NPCC 140-Bus System
Simulations in this study are carried out on the NPCC 140-bus test system, which is a reduced model based on the power grid of the North American northeast (Fig. 1). This network was chosen because its machine parameters are representative of those in a major real-world grid and its graph structure is sufficiently large and complex that it gives rise to coherent dynamics at the subnetwork level.

III. DISCOVERING COMMUNITY STRUCTURE IN THE GRID
Simulations of the northeastern power grid are sufficient to illustrate many of the key features of power grid networks and their induced CSI. By perturbing the various nodes of this specific network, we can characterize the instability structures, and their commonalities, induced in the power grid dynamics.

A. Identifying Coherent Swing Instabilities
The focus of this specific study is the phenomenon of CSI, in which a subgroup of buses which are strongly coupled to one another, but only weakly coupled to other nodes, collectively lose synchronicity with the remainder of the network. Real-world power systems implement controllers to damp the oscillations of relative rotor angles which give rise to CSI, but no such safeguards are implemented in the PST simulation toolbox. The onset of CSI therefore is manifested as a clear qualitative transition in the dynamics of a subset of the buses. Specifically, a group of machines will begin to oscillate with linearly increasing frequency while the remainder of the network continues to evolve with dynamics on a slower and roughly constant time scale (Fig. 3). Although the PST model does not accurately model the behavior of the unstable network, since local controllers would activate to dampen growing oscillations, it does highlight the lack of robustness of the network to the intrinsic dynamics induced by the disturbances. By engineering a more robust system, the intrinsic dynamics itself acts to stabilize the system. This aspect of engineering a power grid network is considered in Sec. V.

Fig. 4. A systematic exploration of the network locations of unstable dynamics as a function of the line where the initial fault was applied. Each column represents a different simulation with a different fault location, with yellow pixels denoting which buses exhibited instability during that run. The group of buses which tend to lose synchrony with the network together are plotted in blue, with the lines that set off this CSI in red. (Axes: Fault Location, Unstable Buses.)
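The qualitative transition described above can be detected automatically from simulated rotor-angle trajectories. The sketch below is one possible criterion and is not taken from PST or from this study: it flags a machine as unstable when its final rotor angle departs from the network median by more than a threshold. The threshold of π radians, the use of the median as a reference, and the toy trajectories are all assumptions chosen for illustration.

```python
import math
from statistics import median


def unstable_machines(angles, threshold=math.pi):
    """Flag machines whose final rotor angle leaves the bulk of the network.

    angles: dict mapping a machine label to its rotor-angle time series (rad).
    threshold: assumed separation (rad) beyond which a machine counts as
        having lost synchrony; this criterion is hypothetical.
    The median reference assumes the unstable group is a minority of machines.
    """
    final = {machine: series[-1] for machine, series in angles.items()}
    reference = median(final.values())
    return sorted(m for m, a in final.items() if abs(a - reference) > threshold)


# Toy trajectories: three machines swing with bounded amplitude, two drift away
# with a quadratically growing angle (i.e. a linearly increasing frequency).
times = [0.1 * k for k in range(101)]
toy_angles = {
    "G1": [0.2 * math.sin(t) for t in times],
    "G2": [0.2 * math.sin(t + 0.5) for t in times],
    "G3": [0.2 * math.sin(t + 1.0) for t in times],
    "G4": [0.5 * t * t for t in times],
    "G5": [0.5 * t * t + 0.1 for t in times],
}
flagged = unstable_machines(toy_angles)
print(flagged)
```

Counting the flagged machines for each run gives the per-simulation instability count used in the later sensitivity analysis.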

B. Community Detection
We begin by investigating the incidence of CSI by systematically applying faults to each line of the network in succession and tracking which buses (if any) exhibit unstable dynamics as a result. The results, plotted in Fig. 4, immediately suggest that the majority of instabilities take place in a particular subgroup of buses. This is particularly evident because the numbering scheme for network nodes and edges tends to number geographically neighboring elements sequentially. Thus the eastern portion of the grid is already fairly localized on the plot. The community structure could, however, be identified just as easily for an arbitrary numbering scheme using a standard community detection algorithm [18], [19]. We thus identify the New England subnetwork, i.e. the greater Boston region, with nodes colored in blue in Fig. 4. These buses are strongly coupled to one another but comparatively weakly coupled to the rest of the network. Indeed, our community detection immediately identifies and distinguishes between the greater Boston region and the rest of the northeastern region, with the Boston region being particularly susceptible to CSI (compare the network given in Fig. 4 with its geographical map in Fig. 1).
Fig. 5. Each box represents a simulation with fault times $(\tau_1, \tau_2)$, with its color denoting the number of generator buses which exhibited unstable dynamics. A set of simulations like this was carried out for each line in the network. Fig. 5a shows results for faults at line #4, which is in the New England subnetwork (colored red in Fig. 4). Fig. 5b shows results for faults at line #172, elsewhere in the network (colored black in Fig. 4). Note: the color scale is produced by enumerating unstable generator buses (just those marked by squares in Fig. 4), so a count of 9 corresponds to the whole New England subnetwork.
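To illustrate how a strongly coupled subgroup can be recovered from connectivity alone, the following sketch applies greedy modularity agglomeration to a toy graph. This is a generic textbook technique and not necessarily the algorithm of [18], [19]. The six-node graph (two triangles joined by a single tie line) is a hypothetical stand-in for a tightly coupled subnetwork attached to a larger grid.

```python
from itertools import combinations


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {node: i for i, group in enumerate(communities) for node in group}
    intra = [0] * len(communities)    # edges with both ends in community i
    degree = [0] * len(communities)   # summed node degree of community i
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(e / m - (d / (2 * m)) ** 2 for e, d in zip(intra, degree))


def greedy_communities(edges):
    """Merge the pair of communities that most increases Q until none does."""
    nodes = sorted({node for edge in edges for node in edge})
    communities = [{node} for node in nodes]
    while len(communities) > 1:
        q_now = modularity(edges, communities)
        best_gain, best_pair = 0.0, None
        for i, j in combinations(range(len(communities)), 2):
            trial = [c for k, c in enumerate(communities) if k not in (i, j)]
            trial.append(communities[i] | communities[j])
            gain = modularity(edges, trial) - q_now
            if gain > best_gain + 1e-12:   # tolerance guards against float ties
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:              # no merge improves Q: stop
            break
        i, j = best_pair
        merged = communities[i] | communities[j]
        communities = [c for k, c in enumerate(communities) if k not in (i, j)]
        communities.append(merged)
    return sorted(sorted(group) for group in communities)


# Hypothetical toy grid: two triangles joined by the single tie line (3, 4).
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
groups = greedy_communities(toy_edges)
print(groups)
```

The quadratic search over community pairs is only practical for small graphs; it is used here to keep the example short and dependency-free.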

IV. THE SENSITIVITY OF NETWORK CONNECTIONS
Given the diversity of dynamics observed for disruptions of the network, our analysis aims to quantify how sensitive each node in the power grid is to faults. By varying the fault severity, we can rank the nodes by their susceptibility to CSI.

A. Varying Fault Severity
To identify the lines where a system fault is most likely to generate CSI, we measure responses to faults of varying intensity. Faults in PST simulations are parameterized by two time durations: $\tau_1$ from the application of the fault to the clearing of the near end, and $\tau_2$ from the clearing of the near end to that of the remote end. Generally speaking, a longer fault duration drives the system farther from its initial steady-state configuration, increasing the likelihood of instability. Indeed, the parametrization of fault intensity through the $(\tau_1, \tau_2)$ parameter space allows us to characterize the robustness of each node.

B. Ranking Lines by Sensitivity
Working in $(\tau_1, \tau_2)$ parameter space, we identify a domain which captures the onset of instability for most fault locations. This gives a range of values broad enough so that the lowest values of $\tau_1$, $\tau_2$ yield fully stable dynamics, while the highest values of $\tau_1$, $\tau_2$ lead to instability at many fault locations. For each fault location, we repeat simulations over a mesh in this domain of parameter space. The performance of each run is quantified by determining the number of buses which go unstable.
The result of this analysis gives an "instability frontier" in the space of $(\tau_1, \tau_2)$. As visualized in Fig. 5, the more sensitive lines of the network (e.g. Fig. 5a) have a frontier which extends farther down toward the bottom-left corner, whereas more robust lines (e.g. Fig. 5b) are fully stable until comparatively high values of $(\tau_1, \tau_2)$. Moreover, the most sensitive lines tend to trigger the CSI in the New England subnetwork identified in Section III. As soon as the fault severity reaches some threshold intensity, the whole community goes unstable coherently.
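As a minimal sketch of how such a frontier could be extracted from a mesh of simulation results, the function below reports, for each value of the first fault duration, the smallest second duration at which any bus goes unstable. The function name, the mesh values, and the counts are hypothetical and chosen only to show the staircase shape of a frontier; they are not results from this study.

```python
def instability_frontier(tau1_values, tau2_values, counts):
    """Locate the onset of instability along each row of a fault-time mesh.

    counts[i][j] is the number of unstable buses observed for the fault times
    (tau1_values[i], tau2_values[j]). For each tau1, return the smallest tau2
    with a nonzero count, or None if that row is stable over the whole mesh.
    """
    frontier = []
    for tau1, row in zip(tau1_values, counts):
        onset = next((tau2 for tau2, n in zip(tau2_values, row) if n > 0), None)
        frontier.append((tau1, onset))
    return frontier


# Hypothetical 3 x 3 mesh (seconds) and unstable-bus counts for one fault location.
tau1_values = [0.05, 0.10, 0.15]
tau2_values = [0.05, 0.10, 0.15]
toy_counts = [
    [0, 0, 9],
    [0, 9, 9],
    [9, 9, 9],
]
frontier = instability_frontier(tau1_values, tau2_values, toy_counts)
print(frontier)
```

A more sensitive line would show onsets at smaller durations, which corresponds to a frontier reaching farther toward the bottom-left corner of the mesh.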
Having performed exhaustive simulations for all possible fault locations, we rank lines according to sensitivity by averaging the number of unstable nodes obtained over all $(\tau_1, \tau_2)$ parameter combinations. The numerical values obtained are of course sensitive to our choice of the fault-time domain, but they serve as a functional metric of comparison between different lines tested on this domain. Figure 6 illustrates the results of this process, with the 20 most sensitive lines highlighted in red. These are the lines whose performance we seek to improve through the remainder of this study. The network modifications we consider in the next section belong to a combinatorially large space, so it is necessary to restrict the scope of our analysis wherever possible to avoid a prohibitively large number of simulations. Restricting simulations to these "worst offender" fault locations allows us to constrain the parameter space while still treating the cases which are of greatest practical concern with respect to grid stability.
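The ranking step amounts to a mean over the fault-time mesh followed by a sort. A minimal sketch is given below; the function names and the unstable-node counts are hypothetical, and line 50 is an invented placeholder (lines 4 and 172 are used only because they appear in Fig. 5).

```python
from statistics import mean


def sensitivity_score(counts):
    """Average number of unstable nodes over all fault-time combinations."""
    return mean(n for row in counts for n in row)


def rank_lines(results, top=20):
    """Sort fault locations from most to least sensitive and keep the top few.

    results: dict mapping a line number to its mesh of unstable-node counts.
    """
    scores = {line: sensitivity_score(counts) for line, counts in results.items()}
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top]


# Hypothetical 2 x 2 meshes of unstable-node counts for three fault locations.
toy_results = {
    172: [[0, 0], [0, 9]],
    4: [[0, 9], [9, 9]],
    50: [[0, 0], [0, 0]],
}
ranking = rank_lines(toy_results)
for line, score in ranking:
    print(f"line {line}: mean unstable nodes = {score}")
```

Because the score is a plain average over the chosen mesh, it is only meaningful for comparing lines tested on the same fault-time domain, as noted above.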

V. ENGINEERING NETWORK STRUCTURE TO REDUCE CSI
Although the specific criteria which lead a subgroup of buses to coherently desynchronize in a given network are not generally well understood, it is clear that the observed phenomenon of community structure is intimately related to the connectivity configuration of the network. Grids are prone to CSI when they contain a subnetwork which is relatively weakly coupled to its surrounding nodes. Thus a naive approach to engineering network stability would be to simply add connections between nodes inside and outside this instability-prone community. The results we present in this section not only support this intuition, but also show that not all inter-community line additions yield significant improvements to stability. As such, the full simulation-based approach implemented here is necessary to determine which inter-community connections contribute the most to the grid's structural robustness.

A. Network Modification Protocol
The approach for assessing how the addition of a transmission line affects the incidence of CSI is as follows:
1) A line connecting the two chosen buses (with resistance and reactance specifications taken to be the median of those of the existing lines) is inserted into the PST network specifications.
2) For each of the most sensitive lines in the unmodified network (i.e. those highlighted in Fig. 6), simulations are carried out for all $(\tau_1, \tau_2)$ fault-time combinations in the domain identified in Section IV-B to obtain an instability frontier (denoted in Fig. 7 by a black dotted line).
3) Overall performance for each fault location is again obtained by averaging the number of unstable generators over all fault-time combinations. These values are then averaged over all tested fault locations to obtain a single plot of the modified network's susceptibility to CSI (shown in the right column of Fig. 7).
4) The result of this process is compared to that of the unmodified network to obtain a ratio measuring the stability improvement afforded by the added connection.
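The bookkeeping in this protocol can be sketched as follows. The function names and all numerical values are hypothetical. The orientation of the ratio (modified score divided by baseline score, so that values below 1 indicate improvement) is an assumption, since the text specifies only that a ratio is formed.

```python
from statistics import mean, median


def median_line_parameters(lines):
    """Median resistance and reactance of the existing lines (step 1).

    lines: list of (resistance, reactance) pairs; units are whatever the
    network specification uses.
    """
    return median(r for r, _ in lines), median(x for _, x in lines)


def network_score(results):
    """Mean unstable-generator count per fault location, averaged over locations."""
    per_line = [mean(n for row in counts for n in row) for counts in results.values()]
    return mean(per_line)


def improvement_ratio(baseline, modified):
    """Assumed convention: modified / baseline, so values below 1 mean fewer instabilities."""
    base = network_score(baseline)
    if base == 0:
        raise ValueError("baseline network shows no instability on this mesh")
    return network_score(modified) / base


# Hypothetical line data and 2 x 2 count meshes for two sensitive fault locations.
existing_lines = [(0.01, 0.1), (0.02, 0.3), (0.03, 0.2)]
new_r, new_x = median_line_parameters(existing_lines)

baseline = {4: [[0, 9], [9, 9]], 7: [[0, 0], [9, 9]]}
modified = {4: [[0, 0], [0, 9]], 7: [[0, 0], [0, 0]]}
ratio = improvement_ratio(baseline, modified)
print(new_r, new_x, ratio)
```

Evaluating both networks on the same fault locations and the same fault-time mesh is what makes the ratio a fair comparison between candidate line additions.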
The set of possible single-line additions to the network is combinatorially large ($N_{\text{additions}} = N_{\text{bus}}(N_{\text{bus}} - 1) - N_{\text{line}} = 19367$ in the case of the NPCC-140 system). The number of simulations necessary to carry out the above steps for a single network modification is such that it is not computationally feasible to test all possible cases. Instead, we use a judiciously chosen subset of available modifications as a proof of concept, with the understanding that any practical implementation of this process for a real power grid would likely have a highly constrained set of possible line additions due to logistical considerations.
Fig. 8. The NPCC 140-bus system with colored lines representing candidates for additions to the network. Their coloring denotes the extent to which they improved network stability relative to the original network. Green lines afforded the greatest improvement, while red lines left performance largely unchanged. The nodes of the New England subnetwork are colored blue for reference.

B. Network Modification Results
The candidate lines tested using the above protocol are pictured in Fig. 8, colored according to their performance relative to the original network configuration. We observe that all lines which exhibited significant improvement to stability are between the New England subnetwork and the rest of the grid (i.e. between blue and black nodes). Intra-community connections tested (blue → blue or black → black) yielded little to no improvement.
For the line candidates which do connect the two communities, the criteria for significant robustification are not obvious. Nonetheless, even by testing a very small subset of all possible inter-community connections, we have successfully identified a number of lines which offer a marked improvement. This shows that our approach to evaluating robustness of the network can effectively identify potential connections capable of robustifying the power grid network.

VI. CONCLUSION
This paper proposes a computational framework for the analysis of the network-level dynamics and stability of power system disturbances. Our analysis is critical for understanding how the network architecture itself can lead to subgraphs (communities) that are highly susceptible to CSI. By systematically parametrizing disturbances according to the temporal parameters $\tau_1$ (from the application of the fault to the clearing of the near end) and $\tau_2$ (from the clearing of the near end to that of the remote end), we can assess the effect of each node on the overall stability of the power grid network.
We specifically develop evaluation metrics to (i) identify community substructures in the power grid network, (ii) determine weak points in the network that are particularly sensitive to faults, and (iii) produce an engineering approach for the addition of transmission lines to maximally reduce the incidence of CSI. For our example of the Northeast power grid, we identify a strong dependence of line sensitivity on the New England subnetwork. We show how modifying the network's connectivity structure can robustify the network to CSI. The space of possible connectivity changes is combinatorially large, so we restrict modifications to a tractably small subset of single-line insertions to the original network. We find that the addition of a line can markedly improve the grid's resilience to CSI, but the success is highly dependent on location within the network.
Our analysis provides a versatile diagnostic for the efficacy of adding a particular line to a power grid which is known to be prone to CSI. This is a particularly relevant problem in large-scale power systems, where improving stability by increasing overall network connectivity is not feasible due to financial and infrastructural constraints. Our approach focuses principally on network topology, so its results are fairly robust to small variations in the model parameters used in simulation. This makes it a strong candidate for use in analyzing real-world power systems, as connectivity structure is a characteristic which can be perfectly reproduced in the translation from physical system to simulation model.