Open Access

The global dynamical complexity of the human brain network

Applied Network Science 2016 1:16

DOI: 10.1007/s41109-016-0018-8

Received: 16 July 2016

Accepted: 24 November 2016

Published: 30 December 2016

Abstract

How much information do large brain networks integrate as a whole over the sum of their parts? Can the dynamical complexity of such networks be globally quantified in an information-theoretic way and be meaningfully coupled to brain function? Recently, measures of dynamical complexity such as integrated information have been proposed. However, problems related to the normalization and the Bell number of partitions associated with these measures make these approaches computationally infeasible for large-scale brain networks. Our goal in this work is to address this problem. Our formulation of network integrated information is based on the Kullback-Leibler divergence between the multivariate distribution on the set of network states and the corresponding factorized distribution over its parts. We find that implementing the maximum information partition optimizes computations. These methods are well-suited for large networks with linear stochastic dynamics. We compute the integrated information for both the system’s attractor states and the non-stationary dynamical states of the network. We then apply this formalism to brain networks to compute the integrated information for the human brain’s connectome. Compared to a randomly re-wired network, we find that the specific topology of the brain generates greater information complexity.

Keywords

Brain networks; Neural dynamics; Complexity measures

Introduction

From a computational neuroscience perspective, the brain is oftentimes abstracted as a complex information processing network, that integrates sensory inputs from multiple modalities in order to generate action and cognition. In this paper, we ask a much simpler question: viewing the brain as a dynamical network of neural masses, how can one compute the information integrated by such networks in the course of dynamical transitions from one state to another? A possible approach, among others, is to look at information-theoretic complexity measures that seek to quantify information generated by all causal sub-processes in such a network. One candidate measure for global dynamical complexity is integrated information, usually denoted as Φ. It was first introduced in neuroscience as a complexity measure for neural networks, and by extension, as a possible correlate of consciousness itself (Tononi et al. 1994). It is defined as the quantity of information generated by a network as a whole, due to its causal dynamical interactions, and one that is over and above the information generated independently by the disjoint sum of its parts. As a complexity measure, Φ seeks to operationalize the intuition that complexity arises from simultaneous integration and differentiation of the network’s structural and dynamical properties. As such, the interplay of integration and differentiation in a network’s dynamics is hypothesized to generate information that is highly diversified yet integrated, thereby creating patterns of high complexity. The aim of this paper is to develop mathematical tools for computing integrated information (analytically when possible, otherwise numerically) for large networks. We then apply this framework to the large-scale structural connectivity network of the human brain.

Let us begin with a brief review of the rich history of this field. The earliest proposals defining integrated information were made in the pioneering work of (Tononi 2004; Tononi and Sporns 2003; Tononi et al. 1994). Since then, considerable progress has been made towards development of a normative theory as well as applications of integrated information (Arsiwalla and Verschure 2013; 2016a; Balduzzi and Tononi 2008; 2009; Barrett and Seth 2011; Krohn and Ostwald 2016; Mediano et al. 2016; Oizumi et al. 2014; Tononi 2012). Similar information-based approaches have also been successfully applied to many-body problems in other domains, such as, for the problem of estimating microstates of statistical mechanical ensembles (Arsiwalla 2009). In fact, there are now several candidate measures of integrated information such as neural complexity (Tononi et al. 1994), causal density (Seth 2005), Φ from integrated information theory: IIT 1.0, 2.0 & 3.0 (Balduzzi and Tononi 2008; Oizumi et al. 2014; Tononi 2004), stochastic interaction (Ay 2015; Wennekers and Ay 2005), empirical Φ (Barrett and Seth 2011) and synergistic Φ (Griffith 2014; Griffith and Koch 2014), plus several variations of these (see Tegmark (2016) for an overview). Table 1 summarizes these measures along with corresponding information metrics upon which they have been based.
Table 1

Candidate measures of integrated information shown alongside information metrics used in their respective formulations

| Φ Measures | Information Metrics |
| --- | --- |
| Neural Complexity | Mutual Information |
| Causal Density | Granger Causality |
| Stochastic Interaction | Kullback-Leibler Divergence |
| IIT 1.0 & 2.0 | Kullback-Leibler Divergence |
| Empirical Φ | Mutual Information |
| IIT 3.0 | Earth Mover’s Distance |
| Synergistic Φ | Synergistic Information |

Many of the above measures have been useful in different domains of validity. However, applications to realistic data and in particular to large-scale networks have proven computationally challenging. With a focus on developing computational tools, we discuss three of the above measures in more detail. The measure of (Balduzzi and Tononi 2008) has been quite useful for discrete-state, deterministic, Markovian systems with the maximum entropy distribution. On the other hand, the measure of (Barrett and Seth 2011) has been applied to continuous-state, stochastic, non-Markovian systems and in principle, admits dynamics with any empirical distribution (although in practice, it is easier to use assuming Gaussian distributions). The formulation in (Barrett and Seth 2011) is based on mutual information, whereas (Balduzzi and Tononi 2008) uses a measure based on the Kullback-Leibler divergence. Note however, that in some cases the measure of (Barrett and Seth 2011) can take negative values and that complicates its interpretation. The Kullback-Leibler based definition computes the information generated during state transitions and as we shall see remains positive in the regime of stable dynamics. This makes it easier to interpret as an integrated information measure. Both measures (Balduzzi and Tononi 2008; Barrett and Seth 2011) make use of a normalization scheme in their formulations. Normalization inadvertently introduces ambiguities in computations. The normalization is actually used for the purpose of determining the partition of the network that minimizes the integrated information, but a normalization dependent choice of partition ends up influencing the value and interpretation of Φ. An alternate measure based on the Earth Mover’s distance was proposed in (Oizumi et al. 2014). This does away with the normalization problem (though the current version is not formulated for continuous-state variables). However, the formulation of (Oizumi et al. 2014) lies outside the scope of standard information theory and is still very difficult for performing computations on large networks.

A comment on network partitions is relevant at this point. The three measures of Φ discussed above make use of what is called the minimum information partition or minimum information bi-partition (MIP/MIB). The issue with this partitioning is that it leads to a combinatorial explosion in the number of configurations to be evaluated when working with networks having a large number of nodes. As a result, application of the above measures to compute integrated information of very large networks remains challenging, particularly for the scale of networks obtained from neuroimaging data. On the other hand, in earlier work (Arsiwalla and Verschure 2013), we introduced a formulation of integrated information that overcomes both the normalization and the combinatorial problem by using a different partitioning of the network called the maximum information partition (MaxIP), which opens the prospect of large-scale applications. However, the formulation in (Arsiwalla and Verschure 2013) was only applicable to uncorrelated node dynamics, which may not be realistic enough for many biological systems.

In this paper, we seek to go beyond (Arsiwalla and Verschure 2013; 2016a; 2016b), starting with an extension of the formalism to include node correlations and also non-stationarity. In order to do that, we solve the discrete-time Lyapunov equation, the solution of which, is then used to get fully analytic expressions for Φ with network correlations. We consider networks with linear stochastic dynamics, which generate multivariate time-series signals. Furthermore, our networks are plastic, in the sense that connection weights are scalable using a global coupling parameter. We compute Φ as a function of this coupling. We also extend our framework to include non-stationary dynamics. This gives us Φ as a function of time, computed through the temporal evolution of the system. The stationary solution yields Φ at the fixed-point attractor, whereas the non-stationary solution leads to Φ elsewhere in the phase space of the system.

As proof of principle, we apply our formulation to the structural connectivity network of white matter fiber tracts in the human cerebral cortex, obtained from diffusion spectrum imaging (Hagmann et al. 2008; Honey et al. 2009). This network has 998 nodes, representing neuronal populations. The edges are weighted fiber counts between populations. Implementing stochastic Gaussian dynamics on this network, we determine stationary solutions to the dynamical system from which we compute the information integrated in bits. To contrast with a null-model, we randomly re-wire the original network and repeat the computation. The original network scores higher on integrated information for all allowed couplings in the stationary as well as non-stationary regime.

Stochastic integrated information

Mathematical formulation

We consider networks with linear stochastic dynamics. The state of each node is given by a random variable pertaining to a given probability distribution. These variables may either be discrete-valued or continuous. However, for many biological applications, Gaussian distributed, continuous-valued state variables are fairly reasonable abstractions (for example, aggregate neural population firing rate, EEG or fMRI signals). The state of the network \(\mathbf{X}_{\mathbf{t}}\) at time t is taken as a multivariate Gaussian variable with distribution \(\mathbf{P}_{\mathbf{X}_{\mathbf{t}}} (\mathbf{x}_{\mathbf{t}})\). Here \(\mathbf{x}_{\mathbf{t}}\) denotes an instantiation of \(\mathbf{X}_{\mathbf{t}}\) with components \(x_{t}^{i}\) (i going from 1 to n, n being the number of nodes). When the network makes a transition from an initial state \(\mathbf{X}_{\mathbf{0}}\) to a state \(\mathbf{X}_{\mathbf{1}}\) at time t=1, observing the final state generates information about the system’s initial state. The information generated equals the reduction in uncertainty regarding the initial state \(\mathbf{X}_{\mathbf{0}}\). This is given by the conditional entropy \(\mathbf{H}(\mathbf{X}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}})\). In order to extract that part of the information generated by the system as a whole, over and above that generated individually by its parts, one computes the relative conditional entropy given by the Kullback-Leibler divergence of the conditional distribution \(\mathbf {P}_{\mathbf {X}_{\mathbf {0}} | \mathbf {X}_{\mathbf {1}} = \mathbf {x}^{\prime }} (x) \) of the system with respect to the joint conditional distributions \(\prod _{k=1}^{r} \mathbf {P}_{\mathbf {M}^{\mathbf {k}}_{\mathbf {0}} | {\mathbf {M}^{\mathbf {k}}_{\mathbf {1}} = \mathbf {m}^{\prime }}} \) of its non-overlapping sub-systems demarcated with respect to a partition \({\mathcal {P}}_{r}\) of the system into r distinct sub-systems. Denoting this as \({\Phi _{{\mathcal {P}}_{r}}}\), we have
$$\begin{array}{@{}rcl@{}} {\Phi_{\mathcal{P}_{r}}} \left(\mathbf{X}_{\mathbf{0}} \rightarrow \mathbf{X}_{\mathbf{1}} = \mathbf{x}^{\prime}\right) = \, D_{KL} \left({\mathbf{P}_{{\mathbf{X}}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}} = \mathbf{x}^{\prime}}} \left|{\vphantom{\mathbf{P}_{{\mathbf{X}}_{\mathbf{0}}}}}\right| \prod\limits_{k=1}^{r} {\mathbf{P}_{{\mathbf{M}^{\mathbf{k}}_{\mathbf{0}}} | {\mathbf{M}^{\mathbf{k}}_{\mathbf{1}}} = \mathbf{m}^{\prime}}} \right) \end{array} $$
(1)
where, for an r-partitioned system, the state variable \(\mathbf{X}_{\mathbf{0}}\) can be decomposed as a direct sum of state variables of the sub-systems
$$\begin{array}{@{}rcl@{}} {\mathbf{X}_{\mathbf{0}} = {\mathbf{M}_{\mathbf{0}}^{\mathbf{1}}} \oplus {\mathbf{M}_{\mathbf{0}}^{\mathbf{2}}} \oplus \cdots \oplus {\mathbf{M}_{\mathbf{0}}^{\mathbf{r}}} = \bigoplus_{\mathbf{k} = \mathbf{1}}^{\mathbf{r}} {\mathbf{M}_{\mathbf{0}}^{\mathbf{k}}} } \end{array} $$
(2)
and similarly, \(\mathbf{X}_{\mathbf{1}}\) decomposes as
$$\begin{array}{@{}rcl@{}} {\mathbf{X}_{\mathbf{1}} = {\mathbf{M}_{\mathbf{1}}^{\mathbf{1}}} \oplus {\mathbf{M}_{\mathbf{1}}^{\mathbf{2}}} \oplus \cdots \oplus {\mathbf{M}_{\mathbf{1}}^{\mathbf{r}}} = \bigoplus_{\mathbf{k} = \mathbf{1}}^{\mathbf{r}} {\mathbf{M}_{\mathbf{1}}^{\mathbf{k}}} } \end{array} $$
(3)
For stochastic systems, it is useful to work with a measure that is independent of any specific instantiation of the final state \(\mathbf{x}^{\prime}\). So we average with respect to final states to obtain an expectation value from Eq. (1). After some algebra, we get
$$ \left< \Phi \right>_{\mathcal{P}_{r}} ({\mathbf{X}_{\mathbf{0}} \rightarrow \mathbf{X}_{\mathbf{1}}}) = - {\mathbf{H} (\mathbf{X}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}})} + \sum\limits_{k=1}^{r} {\mathbf{H} \left({\mathbf{M}^{\mathbf{k}}_{\mathbf{0}}} | {\mathbf{M}^{\mathbf{k}}_{\mathbf{1}}}\right) } $$
(4)

This is our definition of integrated information, which we use in the rest of this paper. Note that the measure described in (Balduzzi and Tononi 2008) is not applicable to networks with stochastic dynamics. They do use Eq. (1) as their definition but endow their nodes with discrete states. On the other hand, (Barrett and Seth 2011) uses a different definition of integrated information, where conditional entropies as in Eq. (4) are replaced by conditional mutual information. This definition only matches the definition of Eq. (1) in special cases but not in general for any distribution. From an information theory perspective, the Kullback-Leibler divergence offers a principled way of comparing probability distributions, hence we follow that approach in formulating our measure in Eq. (4).

The state variable at each time t=0 and t=1 follows a multivariate Gaussian distribution
$$ {\mathbf{X}_{\mathbf{0}} \sim \mathcal{N} \left(\bar{\mathbf{x}}_{\mathbf{0}}, \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}})\right) } \qquad {\mathbf{X}_{\mathbf{1}} \sim \mathcal{N}} \left({\bar{\mathbf{x}}_{\mathbf{1}}, \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{1}})} \right) $$
(5)
The generative model for this system is equivalent to a multi-variate auto-regressive process (Barrett et al. 2010)
$$ {\mathbf{X}_{\mathbf{1}} = \mathcal{A} \; \mathbf{X}_{\mathbf{0}} + \mathbf{E}_{\mathbf{1}} } $$
(6)
where \(\mathcal {A}\) is the weighted adjacency matrix of the network and \(\mathbf{E}_{\mathbf{1}}\) is Gaussian noise. Next, taking the mean and covariance respectively on both sides of this equation, while holding the residual independent of the regression variables, yields
$$\begin{array}{@{}rcl@{}} {\bar{\mathbf{x}}_{\mathbf{1}} = \mathcal{A} \; \bar{\mathbf{x}}_{\mathbf{0}} } \quad \qquad {\boldsymbol{\Sigma}(\mathbf{X}_{\mathbf{1}}) = \mathcal{A} \; \boldsymbol{\Sigma}(\mathbf{X}_{\mathbf{0}}) \; \mathcal{A}^{\mathbf{T}} + \boldsymbol{\Sigma}(\mathbf{E}) } \end{array} $$
(7)

In the absence of any external inputs, stationary solutions of a stochastic linear dynamical system as in Eq. (6) are fluctuations about the origin. Therefore, we can shift coordinates to set the means \({\bar {\mathbf {x}}_{\mathbf {0}}}\) and consequently \(\bar {\mathbf {x}}_{\mathbf {1}}\) to zero. The second equality in Eq. (7) is the discrete-time Lyapunov equation and its solution will give us the covariance matrix of the state variables.
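As an illustrative sketch (not the authors' code), the second equality in Eq. (7) can be iterated until the covariance stops changing, which gives a numerical solution of the discrete-time Lyapunov equation. The function name `stationary_covariance`, the 3-node chain network, the coupling value and the unit noise covariance are all assumptions chosen for this example.

```python
import numpy as np

def stationary_covariance(A, noise_cov, tol=1e-12, max_iter=100_000):
    """Iterate Sigma <- A Sigma A^T + Sigma(E) (Eq. 7) to a fixed point."""
    A = np.asarray(A, dtype=float)
    noise_cov = np.asarray(noise_cov, dtype=float)
    # A stable stationary covariance requires spectral radius of A below 1.
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("spectral radius >= 1: no stable stationary solution")
    sigma = noise_cov.copy()
    for _ in range(max_iter):
        sigma_next = A @ sigma @ A.T + noise_cov
        if np.max(np.abs(sigma_next - sigma)) < tol:
            return sigma_next
        sigma = sigma_next
    raise RuntimeError("iteration did not converge")

# Hypothetical toy network: 3-node chain with coupling g and unit noise.
g = 0.3
A = g * np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
Q = np.eye(3)
Sigma = stationary_covariance(A, Q)

# Residual of the Lyapunov equation Sigma = A Sigma A^T + Q.
residual = np.max(np.abs(Sigma - (A @ Sigma @ A.T + Q)))
print(residual)
```

Plain iteration is used here because it mirrors the recursion in Eq. (7); dedicated Lyapunov solvers would be a reasonable alternative for large matrices.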

The conditional entropy for a multivariate Gaussian variable was computed in (Barrett and Seth 2011)
$$ {\mathbf{H} (\mathbf{X}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}})} = \frac{1}{2} n \log (2 \pi e) + \frac{1}{2} \log \left[ \det {\boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}})} \right] $$
(8)
which is fully specified by the conditional covariance matrix. Inserting this in Eq. (4) yields
$$ \left< \Phi \right>_{\mathcal{P}_{r}} ({\mathbf{X}_{\mathbf{0}} \rightarrow \mathbf{X}_{\mathbf{1}}}) = \frac{1}{2} \log \left[ \frac{\prod_{\mathbf{k} = 1}^{r} \det {\boldsymbol{\Sigma} \left({\mathbf{M}^{\mathbf{k}}_{\mathbf{0}}} | {\mathbf{M}^{\mathbf{k}}_{\mathbf{1}}}\right)} }{\det {\boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}})} } \right] $$
(9)
Now, in order to compute the conditional covariance matrix we make use of the identity (proof of this identity for the Gaussian case was demonstrated in (Barrett et al. 2010))
$$ {\boldsymbol{\Sigma} (\mathbf{X} | \mathbf{Y}) = \boldsymbol{\Sigma}(\mathbf{X}) - \boldsymbol{\Sigma} (\mathbf{X}, \mathbf{Y}) \boldsymbol{\Sigma} (\mathbf{Y})^{-\mathbf{1}} \boldsymbol{\Sigma} (\mathbf{X}, \mathbf{Y})^{\mathbf{T}} } $$
(10)
The appropriate covariance we will need to insert in this expression is
$$ {\boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}}, \mathbf{X}_{\mathbf{1}}) \equiv \left< \left(\mathbf{X}_{\mathbf{0}} - \bar{\mathbf{x}}_{\mathbf{0}} \right) \left(\mathbf{X}_{\mathbf{1}} - \bar{\mathbf{x}}_{\mathbf{1}} \right)^{\mathbf{T}} \right> = \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}}) \, \mathcal{A}^{\mathbf{T}} } $$
(11)
which gives for the conditional covariance
$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Sigma} \left(\mathbf{X}_{\mathbf{0}} | \mathbf{X}_{\mathbf{1}}\right) = \boldsymbol{\Sigma}\left(\mathbf{X}_{\mathbf{0}}\right) - \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}}) \, \mathcal{A}^{\mathbf{T}} \, \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{1}})^{-\mathbf{1}} \mathcal{A} \; \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}})^{\mathbf{T}} } \end{array} $$
(12)
And similarly for the sub-systems
$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Sigma} \left({\mathbf{M}^{\mathbf{k}}_{\mathbf{0}}} | {\mathbf{M}^{\mathbf{k}}_{\mathbf{1}}}\right)} = {\boldsymbol{\Sigma}\left({\mathbf{M}_{\mathbf{0}}^{\mathbf{k}}}\right)} - {\boldsymbol{\Sigma}\left({\mathbf{M}_{\mathbf{0}}^{\mathbf{k}}}\right) \, {\mathcal{A}^{\mathbf{T}}} \big{|}_{\mathbf{k}} \, { \boldsymbol{\Sigma}\left({\mathbf{M}_{\mathbf{1}}^{\mathbf{k}}}\right)}^{-\mathbf{1}} \mathcal{A} \big{|}_{\mathbf{k}} \, {\boldsymbol{\Sigma} \left({\mathbf{M}_{\mathbf{0}}^{\mathbf{k}}}\right)}^{\mathbf{T}}} \end{array} $$
(13)

where k indexes the partition such that \(\mathbf {{M_{0}^{k}}}\) denotes the k-th sub-system at t=0 and \( \mathcal {A} \big {|}_{k}\) denotes the restriction of the adjacency matrix to the k-th sub-network.
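The chain from Eq. (9) through Eqs. (12) and (13) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `conditional_cov` and `phi_bits`, the 2-node toy network and the value g = 0.5 are hypothetical, the logarithm is taken to base 2 to report bits, and all conditional covariances are assumed positive definite.

```python
import numpy as np

def conditional_cov(sigma0, sigma1, A):
    """Eqs. (12)/(13): S0 - S0 A^T S1^{-1} A S0^T."""
    return sigma0 - sigma0 @ A.T @ np.linalg.solve(sigma1, A @ sigma0.T)

def phi_bits(sigma0, sigma1, A, parts):
    """Eq. (9) in bits; `parts` is a list of index lists partitioning the nodes."""
    _, logdet_whole = np.linalg.slogdet(conditional_cov(sigma0, sigma1, A))
    logdet_parts = 0.0
    for idx in parts:
        ix = np.ix_(idx, idx)  # restriction to the k-th sub-network
        _, logdet_k = np.linalg.slogdet(
            conditional_cov(sigma0[ix], sigma1[ix], A[ix]))
        logdet_parts += logdet_k
    return 0.5 * (logdet_parts - logdet_whole) / np.log(2.0)

# Hypothetical toy case: two mutually coupled nodes with unit noise.
g = 0.5
A = g * np.array([[0.0, 1.0], [1.0, 0.0]])
# For this toy case Sigma = I / (1 - g^2) satisfies Sigma = A Sigma A^T + I.
Sigma = np.eye(2) / (1.0 - g**2)

phi_atomic = phi_bits(Sigma, Sigma, A, parts=[[0], [1]])
phi_trivial = phi_bits(Sigma, Sigma, A, parts=[[0, 1]])
print(phi_atomic, phi_trivial)
```

Using `slogdet` rather than `det` avoids overflow or underflow of determinants when the number of nodes is large. The trivial one-part partition returns zero by construction, which serves as a quick sanity check.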

Further, for linear multi-variate systems, a unique fixed point always exists. We try to find stable stationary solutions of the dynamical system. In that regime, the multi-variate probability distribution of states approaches stationarity and the covariance matrix converges, such that
$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{1}}) = \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}})} \end{array} $$
(14)
Here t=0 and t=1 refer to time-points taken after the system converges to the fixed point. Then the discrete-time Lyapunov equations can be solved iteratively for the stable covariance matrix \(\boldsymbol{\Sigma}(\mathbf{X}_{\mathbf{t}})\). For networks with symmetric adjacency matrix and independent Gaussian noise, the solution takes a particularly simple form
$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{t}}) = \left(\mathbf{1} - \mathcal{A}^{\mathbf{2}} \right)^{-\mathbf{1}} \boldsymbol{\Sigma}(\mathbf{E}) } \end{array} $$
(15)
and for the parts, we have
$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Sigma}({\mathbf{M}_{\mathbf{0}}^{\mathbf{k}}}) = \boldsymbol{\Sigma} (\mathbf{X}_{\mathbf{0}}) \big{|}_{\mathbf{k}} } \end{array} $$
(16)

given by the restriction of the full covariance matrix on the k-th sub-network. Note that Eq. (16) is not the same as Eq. (15) on the restricted adjacency matrix as that would mean that the sub-network has been explicitly severed from the rest of the system. Indeed, Eq. (16) is precisely the covariance of the sub-network while it is still part of the network and <Φ> yields the integrated and differentiated information of the whole network that is greater than the sum of these connected parts. Inserting Eqs. (12), (13), (15) and (16) into Eq. (9) yields <Φ> as a function of network weights for symmetric and correlated networks. For the case of asymmetric weights, the entries of the covariance matrix cannot be explicitly expressed as a matrix equation. However, they may still be solved by Jordan decomposition of both sides of the Lyapunov equation.
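The distinction between Eq. (15) and Eq. (16) can be checked numerically. The sketch below assumes a symmetric adjacency matrix and isotropic noise with variance `noise_var`, so that the noise covariance commutes with the adjacency matrix. The 4-node chain, the coupling value and all function and variable names are illustrative assumptions.

```python
import numpy as np

def stationary_cov_symmetric(A, noise_var=1.0):
    """Eq. (15) for symmetric A and noise covariance noise_var * identity."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        raise ValueError("closed form assumes a symmetric adjacency matrix")
    n = A.shape[0]
    return noise_var * np.linalg.inv(np.eye(n) - A @ A)

# Hypothetical 4-node chain with global coupling g.
g = 0.3
A = g * np.array([[0.0, 1.0, 0.0, 0.0],
                  [1.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0]])
Sigma = stationary_cov_symmetric(A)

# The closed form should satisfy the Lyapunov equation Sigma = A Sigma A^T + I.
lyap_residual = np.max(np.abs(Sigma - (A @ Sigma @ A.T + np.eye(4))))

# Eq. (16): covariance of a part while it is still embedded in the network.
idx = [0, 1]
ix = np.ix_(idx, idx)
embedded = Sigma[ix]

# Eq. (15) applied to the restricted adjacency matrix: the part severed
# from the rest of the system, which is a different quantity.
severed = stationary_cov_symmetric(A[ix])

print(lyap_residual)
print(embedded)
print(severed)
```

In this toy case the variance of node 1 is larger when embedded than when severed, because the embedded node also receives fluctuations from node 2.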

The maximum information partition

Following (Arsiwalla and Verschure 2013; Edlund et al. 2011), the maximum information partition (MaxIP) is defined as the partition of the system into its irreducible parts. This is the finest partition and is unique as there is only one way to combinatorially reduce a system into all of its sub-units. This partition can directly be found by construction and does not require a normalization scheme for sampling through the space of multi-partitions in order to search for the one that either maximizes or minimizes the integrated information. Consequently, the resulting value of <Φ> computed using the MaxIP is free from normalization dependencies.

Moreover, the MaxIP also helps reduce computational cost. This can be seen as follows. Prescriptions using the MIP/MIB are typically evaluated for a large class of network bi-partitions, whereas the MaxIP is uniquely defined. The number of bi-partitions of a set of n elements is given by the sum of binomial coefficients \(\sum _{p = 1}^{[n/2]} \,^{n}C_{p}\), where \(^{n}C_{p} = n!/\left(p!\,(n-p)!\right)\) with \(n! = n \times (n-1) \times \cdots \times 1\) and [n/2] denotes the largest integer less than or equal to n/2. Among all possible bi-partitions, MIP/MIB prescriptions usually restrict to those that divide the system into approximately equal parts. This still leaves us with \(^{n}C_{[n/2]}\) configurations for which <Φ> has to be computed. Table 2 summarizes how this number scales with network size from a single node to a million nodes.
Table 2

Scaling of network configurations upon computing Φ using the MIP/MIB versus using the MaxIP for networks with n nodes

| No. of nodes n | No. of equal part bi-partitions = \(^{n}C_{[n/2]}\) | No. of MaxIPs = \(^{n}C_{n}\) |
| --- | --- | --- |
| 1 | 1 | 1 |
| 10 | 252 | 1 |
| 100 | \(1.01 \times 10^{29}\) | 1 |
| 1000 | \(2.70 \times 10^{299}\) | 1 |
| 1000000 | \(7.90 \times 10^{301026}\) | 1 |
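The counts in Table 2 can be reproduced with a few lines of standard-library Python. This is only a sketch: the function names are hypothetical, and the log-gamma function is used for the equal-part count so that very large n does not require building huge integers.

```python
import math

def total_bipartitions(n):
    """Sum of nCp for p = 1..[n/2], the expression given in the text."""
    return sum(math.comb(n, p) for p in range(1, n // 2 + 1))

def log10_equal_bipartitions(n):
    """log10 of nC[n/2], computed via log-gamma to avoid huge integers."""
    k = n // 2
    log_count = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1))
    return log_count / math.log(10)

def scientific(n):
    """Return (mantissa, exponent) such that nC[n/2] ~ mantissa * 10**exponent."""
    e = log10_equal_bipartitions(n)
    exponent = math.floor(e)
    return 10 ** (e - exponent), exponent

for n in (1, 10, 100, 1000, 1_000_000):
    mantissa, exponent = scientific(n)
    print(f"n = {n}: about {mantissa:.2f} x 10^{exponent}")
```

For small n, `math.comb` gives the exact value, for example `math.comb(10, 5)` for the n = 10 row.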

Another interesting feature of the MaxIP is that <Φ> computed using this partition in fact accounts for the maximum amount of information that the network can integrate compared to any other bi-, tri- or multi-partition of the system. This is due to the fact that this partition cannot be decomposed further. Every other partition will be coarser than the MaxIP and will therefore have at least some of its parts as composites of the irreducible units in the MaxIP. As these composites integrate more information than their own irreducible units, subtracting the information of a composite (when treating the composite as a part) from the information of the whole system will always produce a smaller <Φ> than that obtained by subtracting the information of each irreducible unit of the network from that of the whole network. Therefore <Φ> computed using the MaxIP is the maximum possible integrated information of the system compared to <Φ> computed using any other partition of the network. In that sense, unlike the MIP or MIB, the MaxIP in fact captures the complete information integrated by the network and is therefore a more natural choice for quantifying whole versus parts.

Analytic solutions for <Φ>

Now that we have a rigorous analytic formulation of integrated information, let us first demonstrate examples of computations performed using artificial networks. In Fig. 1 we consider two artificial networks. For these cases, we want to compute the exact analytic solution for <Φ>. Each of these networks has an 8-dimensional adjacency matrix with bi-directional weights (though our analysis does not depend on that and works as well with directed graphs). We want to compute <Φ> as a function of network weights, which we keep as free parameters. However, in order to constrain the space of parameters, we shall set all weights to a single parameter, the global coupling strength g. This gives us <Φ> as an analytic function of g.
Fig. 1

Graphs of two artificial networks, (A) and (B)

<Φ> for attractor states

We first compute <Φ> in the stationary regime, that is, when the system has converged to its fixed-point attractor state. The results for the two networks, labeled A and B, are shown in Eqs. (17) and (18), respectively. These are computed for a single time-step, corresponding to the stable stationary solution of the system.
$$\begin{array}{@{}rcl@{}} \left< \Phi \right>_{A} &=& \frac{1}{2} \log \frac{\left(1-43 g^{2} \right)^{8} }{ \left(1-50 g^{2}+49 g^{4} \right)^{8}} \end{array} $$
(17)
$$\begin{array}{@{}rcl@{}} \left< \Phi \right>_{B} &=& \frac{1}{2} \log \frac{B_{1} \cdot B_{2} \cdot B_{3} \cdot B_{4} \cdot B_{5}}{\left(-1+g^{2} \right)^{4} \left(1-8 g^{2}+4 g^{4} \right)^{6} \left(1-17 g^{2}+72 g^{4}-64 g^{6}+16 g^{8} \right)^{8}} \end{array} $$
(18)
where
$$\begin{array}{@{}rcl@{}} B_{1} &=& \left(1-15 g^{2}+56 g^{4}-56 g^{6}+16 g^{8} \right) \\ B_{2} &=& \left(1-15 g^{2}+54 g^{4}-54 g^{6}+16 g^{8} \right) \\ B_{3} &=& \left(1-22 g^{2}+159 g^{4}-426 g^{6}+336 g^{8}-80 g^{10} \right)^{2} \\ B_{4} &=& \left(1-21 g^{2}+147 g^{4}-401 g^{6}+374 g^{8}-136 g^{10}+16 g^{12} \right)^{2} \\ B_{5} &=& \left(1-23 g^{2}+183 g^{4}-612 g^{6}+835 g^{8}-526 g^{10}+152 g^{12}-16 g^{14} \right)^{2} \end{array} $$
(19)

Note that the mathematical framework described above is in no way limited by the size of the network and thus, in principle, can be applied to networks of any size, to yield exact results. The only practical difficulty would be in the form of available computing hardware resources. Hence, for very large data networks, such as those from brain imaging, numerical computations of <Φ> would be more practical to perform.

<Φ> for non-stationary dynamics

The mathematical formulation developed above can also be used to compute <Φ> at non-stationary points in the solution space of networks with linear stochastic dynamics. We show this explicitly for Network B in Fig. 1. We compute <Φ> for the complete temporal evolution of the system starting from an initial condition at t=0 until the system stabilizes at the fixed point attractor. In the non-stationary case, Eqs. (14), (15) and (16) no longer hold. However, everything up to and including Eq. (13) remains valid. Hence, the covariance matrix \(\boldsymbol{\Sigma}(\mathbf{X}_{\mathbf{t}})\) is simply computed recursively following Eq. (7). Subsequently, \(\boldsymbol{\Sigma}(\mathbf{X}_{\mathbf{t}})\) and <Φ> are both computed for each time-point t.
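The recursive procedure described above can be sketched as follows. This is an illustrative example rather than the authors' code: it evaluates <Φ> only for the partition into single nodes, takes an identity matrix as the initial covariance, and uses a hypothetical 2-node network with g = 0.5. It also assumes every conditional variance stays positive. For each step, Σ(X_t) and Σ(X_{t+1}) from Eq. (7) play the roles of Σ(X_0) and Σ(X_1) in Eqs. (9), (12) and (13).

```python
import numpy as np

def conditional_cov(sigma0, sigma1, A):
    """Eq. (12): S0 - S0 A^T S1^{-1} A S0^T."""
    return sigma0 - sigma0 @ A.T @ np.linalg.solve(sigma1, A @ sigma0.T)

def phi_time_course(A, noise_cov, sigma_init, steps):
    """<Phi> in bits at each time step, for the single-node partition."""
    sigma_t = np.asarray(sigma_init, dtype=float)
    self_weights = np.diag(A)  # restriction of A to each single node
    phis = []
    for _ in range(steps):
        # Eq. (7): propagate the covariance one step forward.
        sigma_next = A @ sigma_t @ A.T + noise_cov
        _, logdet_whole = np.linalg.slogdet(
            conditional_cov(sigma_t, sigma_next, A))
        # Eq. (13) for one-node parts reduces to scalar variances.
        v0, v1 = np.diag(sigma_t), np.diag(sigma_next)
        part_var = v0 - (self_weights * v0) ** 2 / v1
        logdet_parts = np.sum(np.log(part_var))
        phis.append(0.5 * (logdet_parts - logdet_whole) / np.log(2.0))
        sigma_t = sigma_next
    return np.array(phis)

# Hypothetical toy network: two mutually coupled nodes, unit noise.
g = 0.5
A = g * np.array([[0.0, 1.0], [1.0, 0.0]])
phi_t = phi_time_course(A, np.eye(2), np.eye(2), steps=60)
print(phi_t[0], phi_t[-1])
```

In this toy case the profile rises monotonically and saturates at the stationary value, which is the qualitative behavior one would expect as the system approaches its attractor.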

Figure 2 shows the multivariate time-series signals generated by Network B for two different coupling strengths g. The critical value of g for this network is 0.3023, at which the dynamics becomes unstable. For g below this critical value, the system converges to the fixed-point at the origin. In Fig. 3 we plot temporal profiles of <Φ> for both the above values of g, which show increasing integrated information for more strongly coupled networks.
Fig. 2

Simulated time-series data for Network B following the generative model in Eq. (6). The plot on the left shows simulated data corresponding to network coupling strength g=0.3000 and variance of noise σ=1. The plot on the right refers to data from the same network with coupling g=0.3022 and the same noise amplitude. Each plot shows 8 time-series profiles, corresponding to the 8 nodes of the network (note that several of these profiles intersect or overlap with each other, hence in the above plots they appear to be clustered together). The time-series for each node is shown in a different color and the color scheme is the same for the plot on the left and the one on the right. Stability of the system is guaranteed until the critical coupling at g=0.3023. Closer to the critical point, the system takes longer to converge to the fixed-point attractor at x = 0
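Time-series of this kind can be generated directly from the generative model in Eq. (6). The sketch below is a hedged illustration: it does not use the adjacency matrix of Network B (which is not given in the text) but a hypothetical 3-node chain, and the function name `simulate`, the seed and the number of steps are assumptions.

```python
import numpy as np

def simulate(A, steps, noise_std=1.0, x0=None, seed=0):
    """Simulate X_{t+1} = A X_t + E_{t+1} with independent Gaussian noise."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    trajectory = np.empty((steps + 1, n))
    trajectory[0] = x
    for t in range(steps):
        x = A @ x + noise_std * rng.standard_normal(n)
        trajectory[t + 1] = x
    return trajectory

# Hypothetical 3-node chain with coupling g (a stand-in, not Network B).
g = 0.3
A = g * np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])

traj = simulate(A, steps=200)           # one row per time step, one column per node
decay = simulate(A, steps=50, noise_std=0.0, x0=np.ones(3))
print(traj.shape, np.linalg.norm(decay[-1]))
```

Setting `noise_std=0.0` isolates the deterministic part of the dynamics, which decays to the fixed point at the origin whenever the coupling is below its critical value.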

Fig. 3

Temporal profiles of <Φ> for Network B corresponding to the two coupling strengths used in Fig. 2. <Φ> saturates as the system approaches the stable attractor with greater integrated information for dynamics closer to the critical point

Application to brain connectomics

The framework described above provides us with all the mathematical tools to compute how much information is integrated in bits in a single time-step, by a large network with linear stochastic (Gaussian) dynamics. We apply the above formulation to the whole brain structural connectivity network of the human cerebral cortex, using data published in the seminal work of (Hagmann et al. 2008; Honey et al. 2009). This data is acquired from high-resolution T1-weighted diffusion spectrum imaging (DSI). The data preprocessing pipeline, as described in (Hagmann et al. 2008), involves white and gray matter segmentation from the T1 images, followed by parcellation into 66 anatomical regions and subsequently 998 individual regions of interest (ROIs) based on Talairach coordinates. After that, whole brain tractography is performed to obtain estimates of axonal trajectories across the entire white matter. From this, connection weights between pairs of ROIs are determined, resulting in a weighted network of structural connectivity across the entire brain. We have displayed the data in matrix form as a 998-dimensional matrix on the left-hand side of Fig. 4. The 998 voxels (ROIs) represent nodes of the network. Each node is physically a population of neurons. The edges are weighted fiber counts between populations. Additionally, we include a global coupling variable g, multiplying the entire matrix, that can be used to tune the overall strength of the weights.
Fig. 4

Left: Connectivity matrix of human cerebral white matter. Right: Randomized version of the same matrix, preserving network weights. The data consists of white matter fiber tracts from 998 cortical voxels. The connectivity matrix on the left is a weighted matrix with the color-bar (in the middle) indicating connection strengths. The randomized matrix on the right is obtained by randomly shuffling positions of weights from the connectivity matrix

To simulate brain dynamics, one may choose from among a variety of possible models, discussed in (Arsiwalla et al. 2013, 2015a,b; Galán 2008). To run these simulations, one may use customizable tools such as those described in (Betella et al. 2014a,b; 2013; Omedas et al. 2014). The simplest model among the ones mentioned above is the linear stochastic Wilson-Cowan model. In fact, it can be seen from (Galán 2008) that Eq. (6) is precisely a special case of the discrete-time limit of the linear stochastic Wilson-Cowan model. That is what we use here. The brain’s state of spontaneous activity or resting-state is usually identified as the attractor state of these models. This corresponds to finding stable stationary solutions of the system. This is precisely the regime in which we compute <Φ> in bits as a function of the coupling g. The results are shown in the red profile in Fig. 5. Further, in order to contrast this result with a null model, we also rewired the edges of the connectome network randomly, while preserving the magnitude of the weights. This generates the randomized data matrix shown on the right-hand side of Fig. 4. We also compute <Φ> for this matrix. The resulting profile is the blue curve in Fig. 5. For extremely small couplings, the two networks are indistinguishable on <Φ> scores; however, as g grows, the architecture of the brain’s network turns out to perform better at integrating information than its randomized counterpart.
Fig. 5

<Φ> as a function of global coupling strength g. <Φ> for the data (shown as red points) and for the randomized network (shown as blue points). Stationary solutions exist up to g = 1.49, the critical point of the data network
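
The existence of a critical coupling can be illustrated with a short sketch. Assuming dynamics of the form X_{t+1} = gA X_t + noise (an assumption of this example), a stationary solution exists only while the spectral radius of gA stays below one, which gives a critical value of g equal to the reciprocal of the spectral radius of A. Whether the reported value g = 1.49 coincides exactly with this quantity for the connectome matrix is not verified here; the function name and the 2-node matrix are hypothetical.

```python
import numpy as np

def critical_coupling(A):
    """Coupling at which g*A stops being contractive: 1 / spectral radius of A."""
    return 1.0 / np.max(np.abs(np.linalg.eigvals(A)))

# Hypothetical 2-node network with eigenvalues +0.5 and -0.5.
A = np.array([[0.0, 0.5],
              [0.5, 0.0]])
g_c = critical_coupling(A)
```

For this toy matrix the spectral radius is 0.5, so stationary solutions exist only for g below 2.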

While Fig. 5 depicts the fixed-point behavior of <Φ> as a function of g, Fig. 6 shows the full time-course of <Φ> for both the connectome network and its randomized counterpart at a specific value of the coupling, g=1.488. The non-stationary behavior is computed using linearized dynamics, as discussed above. The asymptotic values of <Φ> in Fig. 6 converge to those in Fig. 5 at g=1.488. Once again, we find that the connectome network integrates considerably more information than its randomized counterpart, and that this difference becomes more pronounced upon approaching the attractor state. Note that for a more thorough comparison, one might also want to check the above against an entire distribution of random networks. However, the main point of this paper is to demonstrate a systematic computation of how much information a realistic large network integrates. What this corresponds to in terms of brain function or disease is an interesting question in itself. A possible approach to addressing it would be to perform computations such as the one demonstrated above for a large repertoire of neuroimaging studies, ranging from task-based paradigms to disease states, and to use the results to calibrate brain functional states on a scale of information complexity. Another question on which there is still no consensus concerns consciousness (Arsiwalla et al. 2016a). While it is generally agreed that information integration is a necessary component of phenomenological consciousness, by itself it may not be sufficient (Arsiwalla et al. 2016b; Verschure 2016).
Fig. 6

A comparison of temporal profiles of <Φ> for the brain connectome network versus its randomized counterpart, both computed at a fixed coupling g=1.488. The asymptotes of these profiles match the stationary values of <Φ> in Fig. 5 for the given coupling
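
The convergence of a time-course to its stationary value can be sketched for the covariance on which Gaussian measures of this kind are built. This is a minimal illustration assuming dynamics X_{t+1} = M X_t + E_t with M = gA, not the paper's linearization procedure; the function name, the 2-node matrix, the initial covariance and the number of steps are hypothetical.

```python
import numpy as np

def cov_trajectory(M, noise_cov, S0, steps):
    """Iterate S_{t+1} = M S_t M^T + noise_cov and return every covariance."""
    traj = [S0]
    for _ in range(steps):
        traj.append(M @ traj[-1] @ M.T + noise_cov)
    return traj

# Hypothetical 2-node network; its critical coupling is 2, so g = 1.9 is close to it.
A = np.array([[0.0, 0.5],
              [0.5, 0.0]])
g = 1.9
M = g * A
traj = cov_trajectory(M, np.eye(2), np.eye(2), steps=400)

# For symmetric M and identity noise, the fixed point is (I - M^2)^{-1}.
S_inf = np.linalg.inv(np.eye(2) - M @ M)
```

Any quantity computed from the covariance at each step, such as an integrated-information estimate, then approaches its stationary value as the iterates approach `S_inf`, and does so more slowly the closer g is to the critical coupling.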

Discussion

In this paper, we have demonstrated a computational framework, built on a rich body of earlier work on information-theoretic complexity measures, and applied it to compute the integrated information of large networks endowed with linear stochastic dynamics. Integrated information is interesting as a global measure of a system’s dynamical complexity. Local complexity measures such as Granger causality, transfer entropy or synergistic mutual information have been very successful at quantifying local information processes in complex systems (Wibral et al. 2014). Global measures such as integrated information complement these local measures and give insight into the system’s collective behavior.

Earlier attempts to compute integrated information have been limited to relatively small networks. This was mainly due to normalization ambiguities and the explosive combinatorics associated with the bi-partitions used therein. In contrast, we find that the finest partitioning of the system avoids these problems and opens the window of applicability to large-scale networks. In particular, we apply our formulation to the human brain connectome network. This network is constructed from white matter tractography data from the human cerebral cortex and consists of 998 nodes with about 28,000 symmetric and weighted connections between them (Hagmann et al. 2008; Honey et al. 2009). Using a discrete-time linear stochastic neuronal population model to generate the dynamics of neural activity on this network, we compute the integrated information of this dynamical system during state transitions, for both stationary and non-stationary dynamics. For the linearized system, the stationary solution corresponds to the network’s resting-state attractor. The computed integrated information depends on both the structural anatomy and the network’s dynamical operating point, that is, the value of the global coupling g.

We see potentially useful applications of our information-based measures for other types of physiological data as well, for example, tracing studies or detailed microscopic connectivity data. For neuroimaging studies, information-based methods offer a useful way to quantify the complexity of brain functions. The clinical utility of our measure would lie in identifying information-based differences between healthy subjects and patients with neurodegenerative diseases. Just as we identified a transition after which an anatomical network differs strongly in information integration and differentiation from a randomly rewired network, a similar comparative analysis of patients and healthy controls might quantify the extent of a disorder and even provide an analytic way to suggest diagnostic surgical rewiring to restore network processing.

Declarations

Acknowledgements

This work has been supported by the European Research Council’s CDAC project: “The Role of Consciousness in Adaptive Behavior: A Combined Empirical, Computational and Robot based Approach” (ERC-2013- ADG 341196).

Authors’ contributions

All authors contributed to this work. Both authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
Synthetic Perceptive Emotive and Cognitive Systems (SPECS) Lab, Center of Autonomous Systems and Neurorobotics, Universitat Pompeu Fabra
(2)
Institució Catalana de Recerca i Estudis Avançats (ICREA)

References

  1. Arsiwalla, XD (2009) Entropy functions with 5d Chern-Simons terms. J High Energy Phys 2009(09): 059.
  2. Arsiwalla, XD, Betella A, Martínez E, Omedas P, Zucca R, Verschure P (2013) The Dynamic Connectome: a Tool for Large Scale 3D Reconstruction of Brain Activity in Real Time. In: Rekdalsbakken W, Bye R, Zhang H (eds) 27th European Conference on Modeling and Simulation. ECMS, Alesund (Norway). doi:http://dx.doi.org/107148/2013-0865-0869.
  3. Arsiwalla, XD, Dalmazzo D, Zucca R, Betella A, Brandi S, Martinez E, Omedas P, Verschure P (2015a) Connectomics to semantomics: Addressing the brain’s big data challenge. Procedia Comput Sci 53: 48–55.
  4. Arsiwalla, XD, Herreros I, Moulin-Frier C, Sanchez M, Verschure PF (2016a) Is consciousness a control process? In: Nebot A, Binefa X, Lopez de Mantaras R (eds) Artificial Intelligence Research and Development, 233–238. IOS Press, Amsterdam. doi:http://dx.doi.org/10.3233/978-1-61499-696-5-233.
  5. Arsiwalla, XD, Herreros I, Verschure P (2016b) On three categories of conscious machines. In: Lepora NF, Mura A, Mangan M, Verschure PF, Desmulliez M, Prescott TJ (eds) Biomimetic and Biohybrid Systems: 5th International Conference, Living Machines 2016, Edinburgh, UK, July 19–22, 2016. Proceedings, 389–392. Springer International Publishing, Cham. doi:http://dx.doi.org/10.1007/978-3-319-42417-0_35.
  6. Arsiwalla, XD, Verschure PF (2013) Integrated information for large complex networks. In: The 2013 International Joint Conference on Neural Networks (IJCNN), 1–7. doi:http://dx.doi.org/10.1109/IJCNN.2013.6706794.
  7. Arsiwalla, XD, Verschure P (2016a) Computing information integration in brain networks. In: Wierzbicki A, Brandes U, Schweitzer F, Pedreschi D (eds) Advances in Network Science: 12th International Conference and School, NetSci-X 2016, Wroclaw, Poland, January 11-13, 2016, Proceedings, 136–146. Springer International Publishing, Cham. doi:http://dx.doi.org/10.1007/978-3-319-28361-6_11.
  8. Arsiwalla, XD, Verschure PF (2016b) High integrated information in complex networks near criticality. In: Villa AEP, Masulli P, Pons Rivero AJ (eds) Artificial Neural Networks and Machine Learning – ICANN 2016: 25th International Conference on Artificial Neural Networks, Barcelona, Spain, September 6–9, 2016, Proceedings, Part I, 184–191. Springer International Publishing, Cham. doi:http://dx.doi.org/10.1007/978-3-319-44778-0_22.
  9. Arsiwalla, XD, Zucca R, Betella A, Martinez E, Dalmazzo D, Omedas P, Deco G, Verschure P (2015b) Network dynamics with brainx3: A large-scale simulation of the human brain network with real-time interaction. Front Neuroinformatics 9(2). doi:http://dx.doi.org/10.3389/fninf.2015.00002.
  10. Ay, N (2015) Information geometry on complexity and stochastic interaction. Entropy 17(4): 2432–2458.
  11. Balduzzi, D, Tononi G (2008) Integrated information in discrete dynamical systems: motivation and theoretical framework. PLoS Comput Biol 4(6): e1000091.
  12. Balduzzi, D, Tononi G (2009) Qualia: the geometry of integrated information. PLoS Comput Biol 5(8): e1000462.
  13. Barrett, AB, Barnett L, Seth AK (2010) Multivariate granger causality and generalized variance. Phys Rev E 81(4): 041907.
  14. Barrett, AB, Seth AK (2011) Practical measures of integrated information for time-series data. PLoS Comput Biol 7(1): e1001052.
  15. Betella, A, Bueno EM, Kongsantad W, Zucca R, Arsiwalla XD, Omedas P, Verschure PF (2014a) Understanding large network datasets through embodied interaction in virtual reality. In: Proceedings of the 2014 Virtual Reality International Conference, 23:1–23:7. ACM, New York. doi:http://dx.doi.org/10.1145/2617841.2620711.
  16. Betella, A, Cetnarski R, Zucca R, Arsiwalla XD, Martinez E, Omedas P, Mura A, Verschure PFMJ (2014b) BrainX3: embodied exploration of neural data. In: Proceedings of the 2014 Virtual Reality International Conference, 37:1–37:4. ACM, Laval. doi:http://dx.doi.org/10.1145/2617841.2620726.
  17. Betella, A, Martínez E, Zucca R, Arsiwalla XD, Omedas P, Wierenga S, Mura A, Wagner J, Lingenfelser F, André E, et al (2013) Advanced interfaces to stem the data deluge in mixed reality: placing human (un) consciousness in the loop. In: ACM SIGGRAPH 2013 Posters, 68:1–68:1. ACM, New York. doi:http://dx.doi.org/10.1145/2503385.2503460.
  18. Edlund, JA, Chaumont N, Hintze A, Koch C, Tononi G, Adami C (2011) Integrated information increases with fitness in the evolution of animats. PLoS Comput Biol 7(10): e1002236.
  19. Galán, RF (2008) On how network architecture determines the dominant patterns of spontaneous neural activity. PLoS One 3(5): e2148.
  20. Griffith, V (2014) A principled infotheoretic phi-like measure. arXiv preprint arXiv:1401.0978.
  21. Griffith, V, Koch C (2014) Quantifying synergistic mutual information. In: Prokopenko M (ed) Guided Self-Organization: Inception, 159–190. Springer Berlin Heidelberg, Berlin. doi:http://dx.doi.org/10.1007/978-3-642-53734-9_6.
  22. Hagmann, P, Cammoun L, Gigandet X, Meuli R, Honey CJ, Wedeen VJ, Sporns O (2008) Mapping the Structural Core of Human Cerebral Cortex. PLoS Biol 6(7): 15.
  23. Honey, CJ, Sporns O, Cammoun L, Gigandet X, Thiran JP, Meuli R, Hagmann P (2009) Predicting human resting-state functional connectivity from structural connectivity. Proc Natl Acad Sci 106(6): 2035–2040.
  24. Krohn, S, Ostwald D (2016) Computing integrated information. arXiv preprint arXiv:1610.03627.
  25. Mediano, PA, Farah JC, Shanahan M (2016) Integrated information and metastability in systems of coupled oscillators. arXiv preprint arXiv:1606.08313.
  26. Oizumi, M, Albantakis L, Tononi G (2014) From the phenomenology to the mechanisms of consciousness: integrated information theory 3.0. PLoS Comput Biol 10(5): e1003588.
  27. Omedas, P, Betella A, Zucca R, Arsiwalla XD, Pacheco D, Wagner J, Lingenfelser F, Andre E, Mazzei D, Lanatá A, Tognetti A, de Rossi D, Grau A, Goldhoorn A, Guerra E, Alquezar R, Sanfeliu A, Verschure PFMJ (2014) Xim-engine: a software framework to support the development of interactive applications that uses conscious and unconscious reactions in immersive mixed reality. In: Proceedings of the 2014 Virtual Reality International Conference, 26. ACM, New York. doi:http://dx.doi.org/10.1145/2617841.2620714.
  28. Seth, AK (2005) Causal connectivity of evolved neural networks during behavior. Netw Comput Neural Syst 16(1): 35–54.
  29. Tegmark, M (2016) Improved measures of integrated information. arXiv preprint arXiv:1601.02626.
  30. Tononi, G (2004) An information integration theory of consciousness. BMC Neurosci 5(1): 42.
  31. Tononi, G (2012) Integrated information theory of consciousness: an updated account. Arch Ital Biol 150(2-3): 56–90.
  32. Tononi, G, Sporns O (2003) Measuring information integration. BMC Neurosci 4(1): 31.
  33. Tononi, G, Sporns O, Edelman GM (1994) A measure for brain complexity: relating functional segregation and integration in the nervous system. Proc Natl Acad Sci 91(11): 5033–5037.
  34. Verschure, PF (2016) Synthetic consciousness: the distributed adaptive control perspective. Phil Trans R Soc B 371(1701): 20150448.
  35. Wennekers, T, Ay N (2005) Stochastic interaction in associative nets. Neurocomputing 65: 387–392.
  36. Wibral, M, Vicente R, Lizier JT (2014) Transfer entropy in neuroscience (Wibral M, Vicente R, Lizier JT, eds.). Springer Berlin Heidelberg, Berlin. doi:http://dx.doi.org/10.1007/978-3-642-54474-3_1.

Copyright

© The Author(s) 2016