A common model used to describe social learning and influence in networks is called the DeGroot model (Golub and Jackson 2010; Huckfeldt et al. 2014; DeGroot 1974). This model simply states that a node’s opinions over a certain topic will evolve as a weighted average of their neighbors’ opinions, so that the dynamics of the opinion vector *x* take the form:

$$ \dot{x}=Ax $$

(1)

Where \(\dot {x}\) refers to the time derivative of the opinion vector *x*, and *A* is an *n*×*n* weighted adjacency matrix with *A*_{ij}>0 if node *j* influences node *i*, and *A*_{ij}=0 otherwise. A key feature of this type of system is that it is controllable. When under control of an outside party, the dynamics of the system become:

$$ \dot{x}=Ax + Bu $$

(2)

Here, the *n*×*m* matrix *B* represents a control schematic with *m* external controllers, so that *B*_{ij}=1 if node *i* receives control input from controller *j*, and *B*_{ij}=0 otherwise. Once the schematic is known, a vector *u* of control inputs is sent to the nodes. This is where the key distinction between controllability and corruption arises. In a corrupt network, nodes receive a monetary transfer in exchange for altering their stated position or opinion on a given topic and influencing their neighbors in the network. The value of control input to node *i* is given by a function *J*_{i} of the expected control input received by node *i* and its “corruptibility” parameter *η*_{i}. Alternatively, *J*_{i} can be seen as the price that the node charges in order to allow control input signals to alter its state. Thus, to a node in the network, corruptibility *η*_{i} and control capacity \(B_{i}=\sum _{k=1}^{m} \mathbb {E}[B_{ik}]\) are complements. Specifically, *J*_{i} is given by:

$$ J_{i} = \eta_{i} \sum_{k=1}^{m} B_{ik} f(u_{k}) $$

(3)

Where \(f:\mathbb {R}\longrightarrow \mathbb {R}^{+}\) is some increasing function. Nodes either do not know the value of *u*_{k} or hold uniform priors over it, since they are not aware, ahead of time, of the state to which the outside party will want to drive them. In other words, to maximize their incoming utility from corruption, nodes can only alter their position in the network to change the probability *B*_{i} that they are selected as a driver node; their incentive to do so varies with their corruptibility *η*_{i}.
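As a minimal sketch of equation (3), the value of control input can be computed directly from a control schematic. The matrix `B`, the corruptibility values `eta`, the inputs `u`, and the choice \(f=\exp\) (one increasing, positive function among many) are all illustrative assumptions, not values taken from the paper.

```python
import math

def control_value(eta_i, B_row, u, f=math.exp):
    """J_i = eta_i * sum_k B_ik * f(u_k) for a single node i."""
    return eta_i * sum(b * f(u_k) for b, u_k in zip(B_row, u))

# Hypothetical schematic: 3 nodes, 2 controllers.
B = [[1, 0],   # node 0 receives input from controller 0
     [0, 0],   # node 1 receives no control input
     [0, 1]]   # node 2 receives input from controller 1
eta = [0.5, 2.0, 1.5]  # assumed corruptibility parameters
u = [0.0, 0.0]         # assumed control inputs, so f(u_k) = exp(0) = 1

J = [control_value(eta[i], B[i], u) for i in range(len(B))]
```

In this toy case node 1 has the highest corruptibility but earns nothing, because it is not a driver node; only a change in its network position could alter that.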

Finally, it is assumed that corrupting parties who wish to control the network will select a minimum driver set so that *m*=*N*_{D}, and that the prior expectations of the agents over which minimum driver set will be selected by an external corrupting party are uniform. This means that the set of corrupt actors which are chosen to control the network will be of the minimum size possible. It also guarantees that the probability \(B_{i} = \sum _{k=1}^{N_{D}} \mathbb {E}[B_{ik}]\) that a node is selected as a driver node is equivalent to the *control capacity* (Jia and Barabási 2013) of that node.

Since exactly one of the *B*_{ik} will take the value 1 if the node is selected as a driver (with probability *B*_{i}) and they will all take the value 0 otherwise, it follows that the sum of nodes’ control capacities is always equal to *N*_{D}, the size of the minimum driver set. Thus, as nodes change their linking patterns to raise their own control capacity, *N*_{D} will increase, all else constant. Therefore this assumption on preferences (specifically, that nodes can only alter their utility from control input through the probability that they are selected as a driver) guarantees that a rise in the perception of corruption in a community should be coupled with a decline in the controllability of the social networks that are formed within it. This follows from the result of Jia and Barabási (2013) that control capacity depends only on a node’s in-degree and is independent of its out-degree: by removing incoming influence, an individual can increase their own control capacity without affecting that of their neighbors, effectively raising the size of the minimum driver set.
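The accounting identity above (control capacities sum to *N*_{D}) can be illustrated on a toy network by brute force. The sketch below is not the algorithm of Jia and Barabási (2013); it simply enumerates every maximum matching of a very small directed graph and assumes, as in the text, a uniform distribution over the distinct minimum driver sets. The function names and the three-node example are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def minimum_driver_sets(n, edges):
    """All minimum driver sets of a small directed network (brute force).

    edges: (source, target) pairs, meaning source influences target.
    A matching is a set of edges sharing no source and no target; the
    drivers are the nodes left without a matched incoming edge.
    """
    for size in range(min(n, len(edges)), -1, -1):
        driver_sets = set()
        for subset in combinations(edges, size):
            sources = {s for s, _ in subset}
            targets = {t for _, t in subset}
            if len(sources) == size and len(targets) == size:
                driver_sets.add(frozenset(range(n)) - targets)
        if driver_sets:
            if driver_sets == {frozenset()}:
                # Perfect matching: one input is still needed, at any node.
                return {frozenset([i]) for i in range(n)}
            return driver_sets

def control_capacities(n, edges):
    """Share of minimum driver sets containing each node (uniform prior)."""
    sets_ = minimum_driver_sets(n, edges)
    return [Fraction(sum(i in s for s in sets_), len(sets_)) for i in range(n)]

# Hypothetical chain 0 -> 1 -> 2, then node 2 drops its incoming link.
before = control_capacities(3, [(0, 1), (1, 2)])
after = control_capacities(3, [(0, 1)])
N_D_before, N_D_after = sum(before), sum(after)
```

In this example node 2 raises its own control capacity from 0 to 1 by cutting its incoming link, the capacities of nodes 0 and 1 are unchanged, and the minimum driver set grows from one node to two.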

A sufficient, but not necessary, condition for this result is for the control input values to be drawn from a random distribution, such as the uniform distribution. The only truly necessary condition, however, is that the magnitudes of realized control input values are conditionally independent of a node’s incoming linkage patterns, so that individuals can only change their expected utility from incoming input by changing their control capacity.

To summarize then, control capacity helps to uncover the hierarchy which is buried in the complexity of all network topologies. It states that, if you wish to control a network, these are the nodes which you *must* control. Naturally, if there is an increase in one node’s control capacity, ceteris paribus, then it must be the case that the cardinality of the minimum node set required to control the network has increased, thus making the network system as a whole more difficult to control. It follows that when multiple nodes compete to increase their control capacity, the network will become more difficult to control. This novel effect is what this paper terms “hierarchical congestion”.

The assumption that corrupting parties choose from a uniform distribution over minimum driver sets (or, equivalently, that nodes have uniform prior expectations over which minimum driver set will be chosen) means that node *i* is equally likely to be assigned to each of the *m* controllers indexed by *k*, and that these probabilities sum to the control capacity of the node. Mathematically, \(\forall k,l \in [1,m]\subset \mathbb {Z},\; \mathbb {E}[B_{ik}] = \mathbb {E}[B_{il}] = \frac {1}{m} B_{i}\) where *B*_{i} represents the control capacity of node *i*.

Thus the above assumptions are sufficient to guarantee that increases in the frequency of corruption should be coupled with increases in the relative size of *N*_{D}, making controllability of the network more difficult. This means that as the network becomes more corrupt (nodes expect a higher demand for control of the network and thus expect a larger payment in exchange for control), the network actually becomes *harder* to control (as measured by the size of the minimum driver set) due to rent-seeking at the individual level. This is the key result which will be tested in this paper.

The size of the minimum driver set required for exact controllability of the network can be found using the following method derived in Yuan et al. (2013). First, in order for the network to be fully controllable under the control scheme *B*, it must be the case that

$$ \text{rank}(cI_{n}-A,B)=n $$

(4)

For any constant *c*. This condition, called the Popov–Belevitch–Hautus (PBH) rank condition, is equivalent to the classic Kalman condition which states that the controllability matrix \(C=[B, AB, A^{2} B,\dots, A^{n-1}B]\) must be of full rank. Intuitively, these conditions hold when a signal, input to the driver nodes specified by *B*, can reach every node in the network. Using the PBH formulation (4), the minimum number of nodes which must be controlled in order to fully control the network can be found by taking the maximum geometric multiplicity *μ*(*λ*_{i}) over the eigenvalues *λ*_{i} of the adjacency matrix *A*.

$$ N_{D} = \max_{i} {\mu (\lambda_{i})} $$

(5)

This is proven by Yuan et al. (2013), and can be seen by observing that the first term in (4), *cI*_{n}−*A*, is the matrix whose determinant forms the characteristic polynomial of the adjacency matrix *A*. Thus its minimum rank will occur when *c* equals the eigenvalue with the largest geometric multiplicity, \(c=\arg \max _{\lambda _{i}} \mu (\lambda _{i})\).
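The rank conditions above can be checked exactly on a small example. The following is a sketch under stated assumptions rather than the procedure of Yuan et al. (2013): it uses rational arithmetic, and it takes the eigenvalue as given instead of computing it. The example network is a hypothetical directed star in which node 0 influences nodes 1, 2 and 3; its adjacency matrix is strictly lower triangular, so its only eigenvalue is 0.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def pbh_matrix(A, B, c):
    """The n x (n + m) block matrix [c*I - A, B] from condition (4)."""
    n = len(A)
    return [[(c if i == j else 0) - A[i][j] for j in range(n)] + list(B[i])
            for i in range(n)]

def geometric_multiplicity(A, lam):
    """mu(lam) = n - rank(lam*I - A)."""
    n = len(A)
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

def kalman_matrix(A, B):
    """The controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = len(A)
    blocks, current = [], [list(row) for row in B]
    for _ in range(n):
        blocks.append(current)
        current = [[sum(A[i][k] * current[k][j] for k in range(n))
                    for j in range(len(current[0]))] for i in range(n)]
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

# A[i][j] = 1 if node j influences node i (hypothetical star, hub = node 0).
A = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
N_D = geometric_multiplicity(A, 0)  # the only eigenvalue is 0

B_three = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]  # drivers 0, 1, 2
B_two = [[1, 0], [0, 1], [0, 0], [0, 0]]                # drivers 0, 1
```

Here `N_D` evaluates to 3. With three drivers both the PBH matrix at *c*=0 and the Kalman matrix have rank 4, while with two drivers both drop to rank 3. Checking (4) only at eigenvalues of *A* is enough, since for any other *c* the block *cI*_{n}−*A* already has full rank.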

We normalize the measure *N*_{D} by dividing it by the size of the network, in order to achieve a regressor which is independent of the size of the network. The result is the normalized controllability indicator \(n_{D}=\frac {N_{D}}{n}\). Another issue arises with using this metric to compare political networks, and this is the existence of multicameral legislatures in certain countries. Since political networks within a country should be expected to have similarly controllable features, the adjusted controllability measure is proposed as:

$$ \bar{n}_{D} = \frac{\sum_{c} N_{D}^{c}}{\sum_{c} n_{c}} $$

(6)

This is simply the weighted average of controllability across different chambers *c* of the legislature, each of size *n*_{c}. In other words, this adjusted controllability indicator shows what proportion of individuals in these networks must be controlled in order to have full control over the entire legislature.
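A short worked example of (6), with made-up chamber sizes and driver counts, shows how the adjusted indicator differs from a simple mean of the per-chamber values. The numbers below are purely illustrative.

```python
def adjusted_controllability(chambers):
    """chambers: list of (N_D, n) pairs, one pair per chamber."""
    total_drivers = sum(n_d for n_d, _ in chambers)
    total_nodes = sum(n for _, n in chambers)
    return total_drivers / total_nodes

# Hypothetical bicameral legislature: (N_D, n) for each chamber.
chambers = [(30, 100), (10, 50)]

n_bar = adjusted_controllability(chambers)  # (30 + 10) / (100 + 50)
simple_mean = sum(n_d / n for n_d, n in chambers) / len(chambers)
```

The adjusted measure is 40/150, roughly 0.267, whereas the unweighted mean of 0.30 and 0.20 is 0.25; the larger chamber carries proportionally more weight.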

### Formation

The goal of this paper is to show that concerns of individual nodes for their control capacity can have an emergent effect on the structure and controllability of the entire network. In this section, we examine the nodes’ ability to alter controllability using the simple heuristics of lowering their in-degree and avoiding triadic closure. The benefit of the analytical strategy proposed in this paper is that it allows for nodes to be decoupled from their individual linking and formation strategies and focuses instead on emergent indicators. It is important, however, to show that it is possible, at some level, for the individual preferences of nodes to have aggregate effects on the controllability of a network.

In order to accomplish this, we will introduce a simple model of social network formation games which can be microfounded and tied to node-level preferences using the method of Mele (2017), and see how it allows for nodes to heuristically manipulate their connectivity in order to increase their control capacity. In this model, nodes have utility:

$$\begin{array}{*{20}l} U_{i}(A;\theta) &= \sum_{j=1}^{n} A_{ij}l(\theta_{l}) + \sum_{j=1}^{n} A_{ij}A_{ji}r(\theta_{r}) \\&+ \sum_{j=1}^{n} A_{ij}\sum_{k=1\atop k\neq i,j}^{n} A_{jk}v(\theta_{v}) + \sum_{j=1}^{n} A_{ij}\sum_{k=1 \atop k\neq i,j}^{n} A_{ki}w(\theta_{w}) \end{array} $$

(7)

Where *l*, *r*, *v*, and *w* are bounded and real valued functions of their parameters. These functions represent direct linking benefits, reciprocity, indirect link benefits and popularity effects, respectively. Note that we have simplified the original model by assuming that nodes are homogenous in attributes, a decision which is motivated by both a desire for simplicity in the simulated environment as well as by the observation of Pósfai et al. (2013) that community structure has little to no effect on the controllability of the network. Under the assumptions laid out in Mele (2017)^{Footnote 3} the game is a potential game (Monderer and Shapley 1996) and there exists a potential function:

$$ Q(A; \theta)=\sum_{i=1}^{n} \sum_{j=1}^{n} A_{i j} l_{i j}\left(\theta_{l}\right)+\sum_{i=1}^{n} \sum_{j>i}^{n} A_{i j} A_{j i} r_{i j}\left(\theta_{r} \right)+\sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1 \atop k \neq i,j}^{n} A_{i j} A_{j k} v_{i k}\left(\theta_{v}\right) $$

(8)

With the property that a unilateral deviation in strategy has the same effect on an individual node’s utility as it does on this potential function. To briefly summarize, the assumptions required for this to be true involve the symmetry of *r* across players and that of *v* with *w*. Most of the strength of this assumption is eliminated by our assumption of homogeneity in nodal attributes. Due to the existence of this potential function, and under the further assumptions of a random meeting process that is uncorrelated with link existence, and the bounded rationality of agents modeled as an idiosyncratic shock to preferences following a Gumbel distribution, the formation game evolves as a Markov chain and has the property that it will converge to a unique stationary distribution

$$ \pi(A ; \theta)=\frac{\exp [Q(A; \theta)]}{\sum_{\alpha \in \mathcal{A}} \exp [Q(\alpha ; \theta)]} $$

(9)

If the potential function *Q* is linear in parameters (and nodes are homogenous in attributes), this distribution can be written as

$$ \pi(A; \theta)=\frac{\exp [\theta't(A)]}{\sum_{\alpha \in \mathcal{A}} \exp [\theta't(\alpha)]} $$

(10)
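For very small networks, both the potential property and the stationary distribution can be verified by direct enumeration. The sketch below makes several assumptions that are not in the original: payoffs are constants equal to their parameters (so *l*=*θ*_{l}, *r*=*θ*_{r}, *v*=*w*=*θ*_{v}), every sum in (7) runs over *j*, self-loops are excluded, and the parameter values are arbitrary. The statistics *t*(*A*) are read off the three terms of the potential function: link count, mutual pairs, and directed two-paths.

```python
import math
from itertools import product

def utility(A, i, theta):
    """U_i from (7) with constant payoffs and w = v (an assumption)."""
    th_l, th_r, th_v = theta
    n, total = len(A), 0
    for j in range(n):
        if j != i and A[i][j]:
            others = [k for k in range(n) if k not in (i, j)]
            total += th_l + th_r * A[j][i]
            total += th_v * sum(A[j][k] + A[k][i] for k in others)
    return total

def statistics(A):
    """t(A): links, mutual pairs, two-paths i -> j -> k with k != i, j."""
    n = len(A)
    links = sum(A[i][j] for i in range(n) for j in range(n) if i != j)
    mutual = sum(A[i][j] * A[j][i] for i in range(n) for j in range(i + 1, n))
    two_paths = sum(A[i][j] * A[j][k] for i in range(n) for j in range(n)
                    for k in range(n) if len({i, j, k}) == 3)
    return links, mutual, two_paths

def potential(A, theta):
    """Q(A; theta) = theta' t(A)."""
    return sum(th * t for th, t in zip(theta, statistics(A)))

def all_networks(n):
    """Every directed network on n nodes without self-loops."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product((0, 1), repeat=len(pairs)):
        A = [[0] * n for _ in range(n)]
        for (i, j), b in zip(pairs, bits):
            A[i][j] = b
        yield A

def toggled(A, i, j):
    """Copy of A in which node i has flipped its link to j."""
    D = [row[:] for row in A]
    D[i][j] = 1 - D[i][j]
    return D

def stationary_distribution(n, theta):
    """pi(A; theta) from (10), normalized by brute-force enumeration."""
    nets = list(all_networks(n))
    weights = [math.exp(potential(A, theta)) for A in nets]
    Z = sum(weights)
    return nets, [w / Z for w in weights]

theta_int = (-2, 1, 3)  # hypothetical integer parameters, for exact checks
potential_property_holds = all(
    potential(toggled(A, i, j), theta_int) - potential(A, theta_int)
    == utility(toggled(A, i, j), i, theta_int) - utility(A, i, theta_int)
    for A in all_networks(3) for i in range(3) for j in range(3) if i != j)

nets, probs = stationary_distribution(3, (-1.0, 0.5, 0.1))
```

With three nodes there are 2^6 = 64 candidate networks, so the normalizing constant is trivial to compute. The number of networks grows as 2^{n(n−1)}, which is why simulation methods are needed for networks of realistic size.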

This distribution belongs to the family of exponential random graph models (ERGM) (Desmarais and Cranmer 2012) and can be estimated or simulated using the ergm package in R (Hunter et al. 2008; R Core Team 2019). Note that under linearity in parameters, the three terms in the potential function give link frequency, number of mutual links, and triadic closure, respectively. The parameters associated with these network features in the ERGM regression are *θ*_{l}, *θ*_{r}, and *θ*_{v}. Issues with asymptotics of the model for large network estimation and methods for Markov-Chain Monte-Carlo simulation of the intractable normalizing constant in this model are covered in Mele (2017). For each simulated network, structural controllability is computed using the method of Liu et al. (2011). This method is used, as opposed to the exact controllability of Yuan et al. (2013), because the networks drawn using this simple version of the ERGM are unweighted, which could make them harder to control than real-valued weighted networks. In the networks which will be analyzed in the next section, entries are real-valued, so their exact controllability should be equal to the structural controllability. Indeed, it is shown by Liu et al. (2011) that the space of networks for which structural controllability is not equal to exact controllability is of measure zero and occurs in networks which are not sufficiently asymmetric or non-normal^{Footnote 4}. Structural controllability is calculated by computing a maximum bipartite matching of the network and counting the number of unmatched nodes. For the purposes of the simulation this is done using Octave (Eaton et al. 2017).
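The matching step can be sketched in a few lines. The paper's simulations use Octave; the Python version below is only an illustration of the idea, using a standard augmenting-path search for a maximum bipartite matching in which each node appears once as a potential source and once as a potential target. The function name and the test networks are hypothetical.

```python
def minimum_driver_count(n, edges):
    """Return (N_D, unmatched nodes) for a directed network.

    edges: (source, target) pairs, meaning source influences target.
    N_D = max(n - size of a maximum matching, 1), following the
    structural controllability approach of Liu et al. (2011).
    """
    successors = [[] for _ in range(n)]
    for s, t in edges:
        successors[s].append(t)
    matched_source = [None] * n  # matched_source[t]: source matched to t

    def try_augment(s, visited):
        # Search for an augmenting path that starts at source s.
        for t in successors[s]:
            if t in visited:
                continue
            visited.add(t)
            if matched_source[t] is None or try_augment(matched_source[t], visited):
                matched_source[t] = s
                return True
        return False

    matching_size = sum(try_augment(s, set()) for s in range(n))
    unmatched = [t for t in range(n) if matched_source[t] is None]
    return max(n - matching_size, 1), unmatched

star = minimum_driver_count(4, [(0, 1), (0, 2), (0, 3)])
chain = minimum_driver_count(4, [(0, 1), (1, 2), (2, 3)])
cycle = minimum_driver_count(3, [(0, 1), (1, 2), (2, 0)])
```

A directed star needs three drivers, since the hub can be matched to only one leaf, whereas a chain needs a single driver at its head. The cycle has a perfect matching and no unmatched nodes, so the count is set to one by convention.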