Discovering a change point and piecewise linear structure in a time series of organoid networks via the iso-mirror

Recent advancements have been made in the development of cell-based in-vitro neuronal networks, or organoids. In order to better understand the network structure of these organoids, a super-selective algorithm has been proposed for inferring the effective connectivity networks from multi-electrode array data. In this paper, we apply a novel statistical method called spectral mirror estimation to the time series of inferred effective connectivity organoid networks. This method produces a one-dimensional iso-mirror representation of the dynamics of the time series of the networks which exhibits a piecewise linear structure. A classical change point algorithm is then applied to this representation, which successfully detects a change point coinciding with the neuroscientifically significant time inhibitory neurons start appearing and the percentage of astrocytes increases dramatically. This finding demonstrates the potential utility of applying the iso-mirror dynamic structure discovery method to inferred effective connectivity time series of organoid networks.


Introduction
Detecting structural changes in time series of networks is central to many modern network science applications.However, due to the complexity of temporal network data and the myriad possible aspects for potential structural change, this problem can be daunting.For discovering underlying dynamics in time series of networks, [1] proposes theory and methods for representing temporal network structure with a curve, or 'mirror', in low dimensional Euclidean space, enabling the use of classical change point detection algorithms.In this paper, we estimate the mirror for a time series of brain organoid connectivity networks and subsequently identify change points.Because the mirror estimation method requires a 1-1 vertex correspondence for the networks across time, we first demonstrate that the putative 1-1 correspondence obtained directly from data collection is sufficiently accurate by comparing it to the vertex correspondence obtained via graph matching [2].Thence, mirror estimation and manifold learning recovers a 1-dimensional piecewise linear 'iso-mirror' representation, with an evident slope change.By using the change point detection algorithm from [3] and break point estimation for piecewise linear models from [4], we identify a change points neuroscientific significance coinciding with development stages.
We organize this paper as follows.In Section Organoids we introduce our brain organoids data and the extraction of effective connectivity networks based on extracellular electrophysiology recordings.In Section Graph Matching we define the graph matching problem and present the fast approximate assignment algorithm.In Section Discovering underlying dynamics in times series of networks we introduce the mirror estimation method and relevant model assumptions.In the Results section we apply these methods to the time series of organoid networks and present the graph matching results and the change point detection results.We conclude the paper with a discussion.

Organoids
The brain organoids we consider are self-organizing structures composed of roughly 2.5 million neural cells.They are generated from human induced pluripotent stem cells (hiPSCs) [5].After growing for 6 weeks, they are plated in 8 wells of a multielectrode array (MEA) plate (Axion Biosystem, Atlanta,GA, USA).MEA contains 64 low-impedance (0.04 MU) platinum microelectrodes with 30 µm of diameter spaced by 200 µm.Each well contains two or three organoids.Then, to characterize the functional development of the organoids, extracellular spontaneous electrical activity is recorded weekly using Maestro MEA system and AxIS Software Spontaneous Neural Configuration (Axion Biosystems) with a 0.1-Hz to 5-kHz band-pass filter.Then spikes are detected with an adaptive threshold crossing set to 5.5 times the standard deviation of the estimated noise for each electrode.Each time series consists of five minutes of recorded neural activity across 10 months and the data are recorded irregularly -not exactly once a week.Cortical organoids show low and sparse activity during the first 2 months with an average firing frequency of 0.5-Hz, then they start exhibiting highly synchronized and stereotypical network activity which transitions into 2-Hz and 3-Hz rhythmic activity by 4-6 months.At later stages (6 to 9 months), the activity includes high-rate spiking with peak of activity reaching a 20-Hz pace and highly complex bursting behaviors with cross-frequency coupling [6].A combination of principal component analysis and k-means clustering is used to spike sort the multi-unit activity from the 64 electrodes of each well.The average number of neurons detected in each well increases with the maturation of the organoids onto the electrodes, sometime reaching saturation after 6 months.Similar behavior is observed in all MEA wells.For example, the number of spikesorted neurons in one well is detected as 122, 160, 189, 174, 190 at 2, 4, 6, 8 and 10 months.The other well has 81, 160, 174, 171, 170 neurons detected at 2, 4, 6, 8, and 10 months.
To infer the effective connections between neurons -i.e., the adjacency matrices with neurons as vertices across time from the spike activity data -the algorithm proposed in [6] is applied.This algorithm uses a super-selection rule to individuate and discard correlation peaks corresponding to apparent and indirect interactions, and reconstructs the effective connectivity of the network considering the remaining correlation delays.In the end, 45 effective connectivity networks on 127 vertices across 244 days are obtained.This is our time series of organoid networks.

Graph Matching
Given a time series of networks G 1 , • • • , G T on the same set of vertices V , a 1-1 vertex correspondence across all the networks facilitates joint spectral embedding, which is a key step in the mirror estimation method we will present in the next section, "Discovering underlying dynamics in time series of networks".Such a vertex correspondence may be available a priori in labeled networks, or it can be inferred from unlabeled networks.This inference problem -the so-called graph matching problem -is to find an alignment of vertices between two graphs such that the corresponding edge differences are minimized.We denote A, B as two n × n adjacency matrices for two graphs with n vertices each.Then the graph matching problem is to find the permutation matrix P that maximizes the objective function Because solving this optimization problem is combinatorically difficult, approximation algorithms have been proposed.We use the Fast Approximate Quadratic (FAQ) assignment algorithm [2] to obtain an approximate solution.FAQ iteratively finds a local solution to the relaxed problem -expanding the constraint set to the convex hull of P -and then projects the solution back to P, and has been shown empirically to be competitive with or superior to other state-of-art methods.In Section Results we use FAQ to demonstrate that the given putative 1-1 correspondence for each pair of networks is close to the solution to the corresponding graph matching problem.

Discovering underlying dynamics in time series of networks
To discover the underlying dynamics in time series of networks, we use the model and method proposed in [1].First we introduce the generative joint model for time series of networks.We consider T networks, each containing n vertices, with adjacency matrices A t , t ∈ {1, 2, ..., T }.In the model, each vertex is associated with a time varying d-dimensional latent vector.These vectors X t are each one realization from a stochastic process, called the latent position process (LPP) -each X t is a d-dimensional random variable.For n vertices, we generate n i.i.d.samples from the LPP, which collection then forms the latent position matrices {X t }, where X t ∈ R n×d for t ∈ {1, 2, ..., T }.The connection probability between vertex i and vertex j at time t is the inner product of the associated latent vectors at time t.That is E(A t ) = X t X T t .Note that each network corresponds to a latent position random variable.Thus we can capture the distance between graphs using the corresponding random variables.We define the distance where O d×d is the set of orthogonal transformation matrices with dimension d.When X t and X t are centered, the d M V distance can be interpreted as the maximum directional variation for the random vector X t − W X t , where W is an orthogonal transformation used to align X t and X t .We evaluate this distance for every pair of random variables in the LPP and obtain a T ×T distance matrix D. Then we apply classical multidimensional scaling (MDS) [7] to D to get a low-dimensional Euclidean representation of underlying network structure, called the mirror, {ψ(t)}.In practice, the LPP is unknown and only network realizations {A t } are observed.For the n × n symmetrized adjacency matrix A t , we use adjacency spectral embedding (ASE) [8] to obtain Xt = U Σ 1/2 , where the diagonal matrix Σ contains the top d eigenvalues of A t and U contains the associated eigenvectors.Then we use to estimate the pairwise distance between networks, yielding D. Applying MDS to D yields the mirror estimate { ψ(t)}.When the mirror exhibits a manifold structure, we can further simplify the change point detection problem by applying the manifold learning method isometric mapping (ISOMAP) [9] to { ψ(t)}.This yields the isomirror, which captures the geodesic distance along the mirror and preserves it in lower dimensions with Euclidean distance.Subsequent inference is then performed using the iso-mirror representation.For convenience, the iso-mirror representation will also be denoted as { ψ(t)}.

Results
Time series of organoid networks data All results in this section are based on data collected from well 8.For analogous results from well 5, please refer to the Appendix.For well 8, the time series of organoid networks consists of 45 time stamps {1, 2, ..., 45}; each time stamp corresponds an effective connectivity graph G t with adjacency matrix A t .Each graph is directed, weighted, and hollow.We symmetrize the directed graphs, and use ranks in place of the raw edge weights.All graphs have the same vertex set V = {1, 2, ...n} with n = |V | = 127, although some of the graphs contain isolated vertices.See Figure 1.

Putative 1-1 correspondence
For this time series of organoid networks, inferred neuron location gives a putative 1-1 vertex correspondence across the graphs.We assess this correspondence via graph matching using the objective function value (OFV) f (A, B; P ).The OFV for the putative 1-1 correspondence is given by f (A, B; I) where I is the identity matrix.We denote the FAQ output for matching adjacency matrices A and B initialized at C as P A,B;C .Typically we choose the barycenter b = 11 T n as the initial point.For all times i ∈ {1, 2, ...44}, we consider A i , A i+1 and FAQ yields P Ai,Ai+1;b (denoted P i,i+1;b for short).In Figure 2 we compare f (I) = f (A i , A i+1 ; I) and f (P i,i+1;b ) = f (A i , A i+1 ; P i,i+1;b ).Although f (P i,i+1;b ) is always larger than f (I), the two OFVs are close to each other for all time stamps, indicating that the putative 1-1 correspondence is close to FAQ's solution.To further asses the putative 1-1 correspondence, we consider a specific pair of adjacency matrices: A 28 , A 29 .We uniformly generate 100,000 random permutation matrices R and evaluate f (R) = f (A 28 , A 29 ; R).We plot the histogram in Figure 3.We also indicate f (A 28 , A 29 ; I), f (A 28 , A 29 ; P 28,29;I ), f (A 28 , A 29 ; P 28,29;b ) and f (A 28 , A 29 ; P 28,29;R ), where R are 100 randomly drawn permutation matrices.We see that the putative 1-1 correspondence performs better than all 100,000 instantiations of f (R) and is close to P 28,29;b , P 28,29;I and P 28,29;R .Thus we conclude that the putative 1-1 correspondence is sufficiently accurate, and we will proceed apace for mirror estimation and change point detection.

Change point detection
For the 45 graphs, time stamps are from 1 to 244, in days.We choose time stamps in [150,230] to avoid growth and death regimes.
For these graphs, we find the largest common connected component, which contains 112 vertices.The average number of edges for the largest common connected component is approximately 6130.We use the putative 1-1 vertex correspondence across time.We apply our mirror estimation method to this time series of networks, and ISOMAP manifold learning yields the 1-dimensional representation of the dynamics { ψt }.We choose 10 time stamps shown in Figure 4.As we see, the representation is approximately piecewise linear with an evident change of slope at t = 4, day 188.
Assuming the true underlying 1-dimensional representation ψ(t), t ∈ [0, T ], is piecewise linear and continuous, it is natural to define the change point t * as the point when the slope changes.If we assume there is only one change point, then we can write .
Both the change point detection algorithm from [3] and the break point estimation for piecewise linear models from [4] yield an estimated change point t * = 4, day 188, which coincides with neuroscientifically significant developmental changesinhibitory neurons start appearing and the percentage of astrocytes increases dramatically -as described in [10].Note that the emergence of astrocytes does not happen at once but builds over time, so there is no one precise date for the change point and detection of a time coinciding with this change can be but suggestive.

Conclusion
Reconstruction of effective connectivity networks of electrophysiologically active brain organoids reflect their structural (increasing number of neurons (nodes) and connections (edges)) and electrical development over time, as previously demonstrated in [10].By applying the spectral mirror estimation method to the time series of organoid networks, we obtain a 1-dimensional iso-mirror representation of dynamic inferred effective connectivity organoid networks.Two change point detection algorithms successfully detect a change at day 188.At approximately 188 days (˜6 months), cortical organoids start showing inhibitory neurons and the percentage of astrocytes increases from 5% to 30-40% [10] resulting in added complexity in the activity and network distribution of brain organoids.
There are several change point detection algorithms available for analyzing time series of graphs, including the one proposed in [11].However, it is important to note that the spectral mirror estimation method used in our study is not restricted to change point detection alone.Instead, it provides a low-dimensional (in our case, one-dimensional Euclidean) representation of network dynamics, enabling us to visualize network evolution.As illustrated in Figure 4, we observe piecewise linear structure and apply segment regression to identify a significant increase in slope after day 188.This suggests that organoid graphs drift continuously over time, but their rate of drift accelerates significantly after day 188.Our method is preferred in this regard as it provides us with more than just one change point.Additionally, since the iso-mirror represents the underlying LPP, the detected change point in the iso-mirror reflects a fundamental change in the underlying generative LPP of the time series of networks, which may not necessarily correspond to any specific network measure.Furthermore, the spectral mirror method can be applied to other dynamic graphs that meet our model assumptions, namely that the time series of networks are generated from a time series of latent position random networks whose vertices have independent, identically distributed latent positions given by an LPP.
Future work includes addressing two major theoretical issues of note, to make the change point inference formally principled.First, the theory in [1] requires a known 1-1 vertex correspondence across time.It remains to study the effect of errors in this correspondence, such as those inherent in our putative 1-1 correspondence deemed sufficiently accurate for practical purposes.In addition, it remains to study the entry-wise behavior of the error term (t) = ψ(t) − ψ(t).For example: in [3] the proof of consistency of the change point estimator requires the error process to be stationary; in [4] the (t) are assumed to be i.i.d.normal to construct a confidence interval for t * .For now, [1] has shown that ψ(t) converges to ψ(t) in Frobenius norm, that is T t=1 ( (t)) 2 → 0 with high probability, which is insufficient to conclude normality or stationarity.What's more, this convergence result is for the mirror rather than the iso-mirror.As for whether the same conclusion can be extended to the iso-mirror, further investigation is needed.

Appendix
Time series of organoid networks data for well 5 For well 5, there are 40 graphs with time stamps [1,229] and all of them have the same vertex set with |V | = 128, although some of the graphs contain isolated vertices.See figure 5.Each graph is directed, weighted, and hollow.We symmetrize the directed graphs, and use ranks in place of the raw edge weights.Putative 1-1 correspondence for well 5 We assess the putative 1-1 correspondence exactly the same way as in the paper and Figure 6 indicates that the putative 1-1 correspondence is close to FAQ's solution.
We also consider a specific pair of adjacency matrices: A 28 , A 29 to assess the 1-1 putative correspondence.See figure 7. The result is similar to well 8 and we conclude the 1-1 putative correspondence is accurate enough. .

Change point detection for well 5
For the 40 graphs, time stamps are from 1 to 244, in days.We choose the same time stamps in [150,230] as in the paper.We find the largest common connected component for these graphs.We use the putative 1-1 vertex correspondence across time.We apply our mirror estimation method to this time series of networks, and ISOMAP manifold learning yields the 1-dimensional representation of the dynamics { ψt }.We choose the same 10 time stamps as in the paper shown in

Figure 1
Figure1The number of non-isolated vertices and the number of edges for the graphs at each of the 45 time stamps.The number of edges are counted after symmetrizing the directed graph.

Figure 2
Figure 2 Comparison of OFV using I and P i,i+1;b for 44 pairs of graphs, demonstrating that FAQ increases the OFV only slightly.(a): f (I) and f (P i,i+1;b ).(b):

Figure 3
Figure 3 Matching A 28 and A 29 .Comparison of OFV for using I,R, P 28,29;b , P 28,29;I and P 28,29;R .(a): Histogram demonstrating that OFV for R is not nearly as good as for the others.(b): Enlargement of the far right portion of the top figure, demonstrating that FAQ output at different initial points I, b and R improves the OFV only slightly compared to the putative 1-1, I.

Figure 4
Figure 4 Discovering a change point in time series of inferred effective connectivity organoid networks via the iso-mirror.The x-axis is time stamp, in days.The y-axis is the ISOMAP representation of the estimated mirror { ψt}.This indicates a change in the network dynamics at day 188.

Figure 5
Figure5The number of non-isolated vertices and the number of edges for the graphs at each of the 40 time stamps for well 5.The number of edges are counted after symmetrizing the directed graph.

Figure 6
Figure 6 Comparison of OFV using I and P i,i+1;b for 39 pairs of graphs, demonstrating that FAQ increases the OFV only slightly.(a): f (I) and f (P i,i+1;b ).(b):

Figure 8 .Figure 7 Figure 8
Figure 7 Matching A 28 and A 29 .Comparison of OFV for using I,R, P 28,29;b , P 28,29;I and P 28,29;R .Left: Histogram demonstrating that OFV for R is not nearly as good as for the others.Right: Enlargement of the far right portion of the left figure, demonstrating that FAQ output at different initial points I, b and R improves the OFV only slightly compared to the putative 1-1, I.