# Dynamic graphs, community detection, and Riemannian geometry

Craig Bakker, Mahantesh Halappanavar, and Arun Visweswara Sathanur

**Received: **31 October 2017

**Accepted: **26 February 2018

**Published: **29 March 2018

## Abstract

A community is a subset of a wider network where the members of that subset are more strongly connected to each other than they are to the rest of the network. In this paper, we consider the problem of identifying and tracking communities in graphs that change over time – dynamic community detection – and present a framework based on Riemannian geometry to aid in this task. Our framework currently supports several important operations such as interpolating between and averaging over graph snapshots. We compare these Riemannian methods with entry-wise linear interpolation and find that the Riemannian methods are generally better suited to dynamic community detection. Next steps with the Riemannian framework include producing a Riemannian least-squares regression method for working with noisy data and developing support methods, such as spectral sparsification, to improve the scalability of our current methods.

## Introduction

### Graphs and dynamic community detection

Community detection is an important activity in graph analytics with applications in numerous scientific and technological domains (Girvan and Newman 2002). Given a graph *G*=(*V*,*E*) with weight function *w*:*E*→ℜ^{+}, the goal of community detection (or graph clustering) is to partition the vertex set *V* into an arbitrary number of disjoint subsets of *V* called communities (or clusters) such that the vertices within a community are tightly connected with each other but sparsely connected with the rest of the graph. Clustering on *G* can be represented as *C*(*G*), which is a unique mapping of each vertex to a community. We restrict our work here to undirected, unweighted graphs and to the disjoint partitioning of vertices into communities. For a detailed treatment of this topic, the reader is referred to the work by Fortunato (2010).

The relationships between entities in domains such as sociology, finance, cybersecurity and biology are most naturally modeled with the use of graphs. The inherently dynamic nature of such data (Fenn et al. 2012) leads to dynamic graph representations. A dynamic graph changes over time through the addition and deletion of vertices and edges. A snapshot of this graph, *G*_{n}, consists of the vertices and edges that are active at a given time step *n*. Modifications from time *n* to *n*+1 are represented by *ΔG*_{n}. Clustering can be performed at each time step, *C*(*G*_{n}), and as the graph evolves, so do its communities. Temporal communities can undergo several different transitions: *growth* via addition of new vertices, *contraction* via deletion of vertices, *merging* of two or more communities, *splitting* of a community into two or more communities, *birth* and *death* of a community, and *resurgence* or reappearance of a community after a period of time. Efficiently detecting these transitions is a challenging problem.

The problem of dynamic community detection has received significant interest in the academic literature (Cazabet and Amblard 2014). Current approaches for dynamic community detection broadly fall under two headings: incremental community detection and global community detection. The approaches in the first category focus on the systematic propagation of communities through time, whereas the approaches in the second category attempt to simultaneously optimize for multiple metrics on several snapshots of data. Stability of computation and accuracy of results are the fundamental limitations of the incremental approaches, while memory (space) and computational requirements are the main limitations of the global approaches (Cazabet and Amblard 2014). Incremental approaches are fundamentally combinatorial in nature (Tantipathananandh and Berger-Wolf 2011; Nguyen et al. 2014) and involve methods to track communities through time. The stochastic nature of these algorithms makes these methods unstable, leading to inaccurate results. Mucha et al. (2010) build on the seminal work of Lambiotte et al. (2014) for community detection in dynamic multiplex networks by specializing null models in terms of stability under Laplacian dynamics.

### Motivation for a Riemannian framework

There is a well-developed suite of methods for community detection in static graphs, but it is not always clear how to extend those methods to dynamic graphs in a way that captures the time-varying nature of those graphs’ communities. The challenge is to develop methods that vary continuously in time, like the graphs themselves, between snapshots. Moreover, if existing methods are extended through time, it will be beneficial to do so in a way that provides new insight or analytical tools as well. With that in mind, we propose a Riemannian geometry approach that views dynamic graphs (and thus dynamic communities) through the lens of Laplacian dynamics on a matrix manifold. Riemannian geometry provides ways of calculating quantities such as distances between Laplacians and trajectory speeds on the matrix manifold. As such, it provides a clear and consistent way of representing graph dynamics. This framework is also modular with respect to existing static community detection methods.

### Contributions

In this paper, we provide the background theory needed to describe dynamic graphs in terms of Laplacian dynamics on matrix manifolds. The primary contribution of this paper is to bring existing theory to bear on a new application area – dynamic community detection. We use Riemannian geometry to interpolate between snapshots of dynamic graphs (using geodesics) and to calculate averages of those snapshots; we explicitly show the formulae for performing these calculations. The interpolated and average graphs are then amenable to existing static community detection methods. This allows us to use a consistent approach to track community behaviour both between snapshots, via interpolation, and across snapshots, via averaging.

Simply transferring previously derived formulae would not allow us to consider disconnected graphs, however, so our contributions also include a way of transforming disconnected graphs so that they are amenable to the matrix manifold tools. Using both synthetic and experimental graph data, we experimentally evaluate two different kinds of geodesics. We identify their strengths, as compared with entry-wise linear interpolation, and also discuss their weaknesses. Finally, we derive interpolation and extrapolation error bounds for both geodesics (shown in the Appendix) and identify promising avenues of future research in this area.

Our framework enables more accurate prediction of community transitions by building interpolated graphs between snapshots, global community detection through data aggregation, and prediction of future behaviour through extrapolation from given snapshots. We describe the basics of our framework in the “Riemannian geometry and dynamic graphs” section, show how it can be applied to dynamic clustering in “A Riemannian framework for dynamic community detection” section, and compare the Riemannian methods with an entry-wise linear approach on synthetic and real network data in the “Computational experiments” section.

The novelty of our approach arises primarily from the application of Riemannian geometry to dynamic graphs. When combined with existing spectral methods, this also provides a new interpretation of community splitting and merging as bifurcations in a gradient flow dynamical system (see the “Dynamic spectral clustering” section). To the best of our knowledge, the Riemannian framework presented in this paper is the first of its kind; it is our intent that the research community build from and extend this work to enable features of dynamic community detection not currently considered here.

## Riemannian geometry and dynamic graphs

### Riemannian geometry and matrix manifolds

Differential geometry deals with mathematics on manifolds; manifolds are spaces that are locally Euclidean (i.e., flat), but generally non-Euclidean globally (Boothby 1986). A Riemannian manifold is a type of manifold that has a metric associated with each point on the manifold. The traditional methods for calculating angles and distances in flat spaces have to be modified on manifolds to account for manifold curvature, and the metric is an integral part of those modifications on Riemannian manifolds.

A key part of Riemannian geometry, for the purposes of this paper, is the geodesic. Geodesics are the equivalent of straight lines in curved spaces. A geodesic is (locally) the shortest path between two points. Great circles on a sphere are examples of geodesics on a curved manifold. Consider a flight from Vancouver, Canada to London, England: the two cities are at similar latitudes, so on a Mercator projection map, the shortest flight would seem to be a straight West-to-East trajectory. In reality, however, flights between the two cities traverse the Pole because that is a shorter route – it is the great circle route. The discrepancy is due to the curvature of the Earth, which is distorted on a flat map. From another perspective, a geodesic is the path that a particle on a manifold would take if it were not subject to external forcing; a geodesic with constant speed has zero acceleration.

Riemannian geometry can be applied to matrix manifolds. The Grassmann and Stiefel manifolds are perhaps the most frequently encountered matrix manifolds in differential geometry because they have closed-form solutions for quantities such as geodesics (Absil et al. 2007). Pennec et al. (2006) developed a metric for the manifold of symmetric positive-definite matrices with corresponding expressions for distances, geodesics, and tangent vector inner products *in closed form*. These formulae are valuable because even when there is a well-defined metric on a manifold, distances and geodesics between points do not usually have closed-form expressions. Such quantities have to be solved for numerically. Working on this matrix manifold, when appropriate, can be useful: matrix symmetry provides a reduction in effective dimension, and properties such as symmetry and positive-definiteness are automatically preserved.

Bonnabel and Sepulchre (2009) extended this framework to include symmetric positive-semidefinite matrices. The extension essentially worked by decomposing a positive-semidefinite matrix into a nullspace component (a Grassmann manifold) and a positive-definite component, which could then use the existing metric.

### Graph Laplacians and Riemannian geometry

Researchers have previously used non-Euclidean geometries to investigate graphs (Krioukov et al. 2009, 2010). That work has then been applied to large-scale networks such as the internet (Boguná et al. 2010). The approach described in this paper differs in a subtle but meaningful way. In those papers, the mappings used treat graph nodes as points in a hyperbolic space. Our present work, however, treats the entire graph as a single point in a non-Euclidean space.

The work of Bonnabel and Sepulchre (2009) combined with that of Pennec et al. (2006) enables us to consider graph Laplacians as points on a manifold of positive-semidefinite matrices. Each graph is a point, and thus a time-indexed sequence of graphs forms a trajectory on the manifold. This, in turn, means that we can calculate quantities such as trajectory velocities, distances between graphs (represented by manifold distances between their respective points), and relevant geodesics.

Given that we are interested in dynamic community detection, the Laplacian is a natural object to work with. The Laplacian uniquely defines a graph (up to self-loops), and there is already a known connection between the Laplacian spectrum and community structure (Newman 2010). Previous work in dynamic community detection (e.g., Mucha et al. (2010)) has also worked with the Laplacian. Graph Laplacians have a certain structure that makes them amenable to the Riemannian geometry techniques presented here as well: Laplacians are symmetric (for undirected graphs) and positive-semidefinite. Adjacency matrices, for example, are generally indefinite and thus would not be suitable for use with the matrix manifolds described here.

We chose to work with the combinatorial Laplacian, *L*=*D*−*A*, because it has a constant nullspace for connected graphs (Newman 2010). This constant nullspace makes the geometric calculations much simpler than they would be otherwise. It is possible to use other Laplacians, such as the normalized Laplacian. If these Laplacians do not have constant nullspaces, though, the interpolation involves extra calculations (detailed by Bonnabel and Sepulchre (2009)). Assuming no self-loops, the combinatorial Laplacian also has the virtue of being easy to convert into an adjacency matrix. That being said, as long as a Laplacian is symmetric positive-semidefinite and has a constant nullspace dimension (for connected graphs), it is possible to calculate geodesic interpolations for that Laplacian.
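These properties are easy to verify numerically. The following minimal sketch (illustrative only, not part of the paper's released code) constructs the combinatorial Laplacian of a small path graph and checks that its nullspace is spanned by the all-ones vector:

```python
import numpy as np

def laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

# Path graph on 4 nodes: 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = laplacian(A)

# For a connected graph, L is symmetric positive-semidefinite with a
# one-dimensional nullspace spanned by (1, 1, ..., 1).
ones = np.ones(4)
eigenvalues = np.sort(np.linalg.eigvalsh(L))
```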

There are two other relevant considerations we wish to address here. Firstly, the Laplacians of unweighted graphs constitute a discrete (and therefore sparse) subset of the matrix manifold. As such, any *continuous* trajectory will contain weighted graphs. Secondly, directed graphs do not have symmetric Laplacians, and thus they cannot be considered within this framework without symmetrizing them somehow (e.g., by ignoring the directionality of edges). For the purpose of community detection, though, edge direction may not be important.

## A Riemannian framework for dynamic community detection

There are two primary components to our framework. The first involves modelling and analyzing the dynamic behaviour of the graph prior to any community detection. For this, we show how to calculate an average graph from a collection of snapshots (for use in a time-averaged community detection) and how to interpolate between time-indexed graph snapshots (for seeing how the graph evolves over time). In the Appendix, we derive and analyze bounds on the interpolation error in terms of distance on the manifold.

The second component consists of applying community detection methods to the dynamic graph. In this paper, we will focus on spectral methods, because they have convenient properties under continuous Laplacian dynamics, and the Louvain method (Blondel et al. 2008), because of its computational speed and ability to handle disconnected graphs. However, the Riemannian geometry methods do not require using any one particular community detection method.

### Graph interpolation and averaging

We begin with interpolation between two snapshots. It is possible to do this using an entry-wise linear approach, *L*(*t*)=(1−*t*)*L*_{A}+*t**L*_{B}, but there are good reasons not to use this approach.

Firstly, the Laplacians for a given dynamic graph all exist on a matrix manifold. For the trajectory *L*(*t*) on that manifold, though, the trajectory speed is not constant, the trajectory direction is not constant, and it is not the shortest path from *L*_{A} to *L*_{B}. It is precisely analogous to the Mercator projection map example given earlier – moving at a constant velocity (i.e., constant speed and direction) on the map would not correspond to moving at a constant velocity on the earth because of the earth’s curvature. Experimentally, we have observed that the linear interpolation begins and ends its trajectory moving very quickly while the bulk of its trajectory moves relatively slowly. The difference between maximum and minimum velocities can be orders of magnitude, depending on the size of the graph and the distance between the two graphs being interpolated.

Secondly, in connected graphs, the product of the Laplacian’s non-zero eigenvalues (i.e., the determinant of the positive-definite component) is concave along the linearly interpolated trajectory. If the two points are far enough apart, this product will go through a maximum between the two points. This maximum can, again, be orders of magnitude greater than the product at either endpoint; like the trajectory velocity, this variation will depend on the size of the graphs in question and their distance apart. The geodesic interpolation, however, varies this eigenvalue product monotonically between its endpoint values (linearly in the logarithm of the product). Pennec et al. (2006) comment on this in more detail. For a graph, this product relates directly, by Kirchhoff’s matrix tree theorem, to the number of spanning trees in the graph (Harris et al. 2008). In other words, the linear interpolation increases the overall connectivity of the graph between snapshots.
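Kirchhoff's matrix-tree theorem can be checked directly on small examples; the following is a minimal illustrative sketch:

```python
import numpy as np

def spanning_tree_count(adj):
    """Kirchhoff's matrix-tree theorem: for a connected graph, the number
    of spanning trees equals the product of the nonzero Laplacian
    eigenvalues divided by the number of vertices."""
    L = np.diag(adj.sum(axis=1)) - adj
    eig = np.linalg.eigvalsh(L)
    return np.prod(eig[eig > 1e-9]) / adj.shape[0]

# The triangle K3 has 3 spanning trees (drop any one edge);
# the 4-cycle C4 has 4.
K3 = np.ones((3, 3)) - np.eye(3)
C4 = np.array([[0, 1, 0, 1],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [1, 0, 1, 0]], dtype=float)
```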

Finally, the linear interpolation cannot always be used for extrapolation. All of the interpolated Laplacians are positive-semidefinite, but it is easy to provide examples where the extrapolation quickly becomes indefinite.
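A minimal sketch illustrates this: linearly extrapolating between two small connected graphs already produces an indefinite matrix at *t*=2.

```python
import numpy as np

def laplacian(adj):
    """Combinatorial Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

# Two connected 3-node graphs: edges {(0,1), (1,2)} and {(0,2), (1,2)}.
A_1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A_2 = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
L_1, L_2 = laplacian(A_1), laplacian(A_2)

# Entry-wise linear extrapolation to t = 2 gives 2*L_2 - L_1, which
# assigns a negative weight to edge (0, 1) and is no longer
# positive-semidefinite.
t = 2.0
L_extrap = (1 - t) * L_1 + t * L_2
min_eig = np.linalg.eigvalsh(L_extrap).min()
```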

Instead, we propose using geodesic interpolation. A geodesic interpolation trajectory has a constant velocity, produces an eigenvalue product that varies monotonically (log-linearly) between endpoints that are connected graphs, and can be extrapolated indefinitely without leaving the manifold of positive-semidefinite matrices (with constant nullspace dimension). Following Bonnabel and Sepulchre (2009), we show how to calculate this geodesic between two snapshots of a given dynamic graph.

Consider a Laplacian *L* at a point on the manifold. It can be represented with its eigendecomposition:

$$ L = \begin{bmatrix} \alpha & \xi \end{bmatrix} \begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \alpha & \xi \end{bmatrix}^{T} = \alpha \Lambda \alpha^{T} $$

where the columns of *α* span the range of *L*. Moreover, the nullspace, *ξ*, is always parallel to (1,1,…,1), and thus span(*α*) is constant even though *α* may not be, in general.

Consider two Laplacians *L*_{A} and *L*_{B}. We can calculate the SVD of \(\alpha _{B}^{T} \alpha _{A}\):

$$ \alpha_{B}^{T} \alpha_{A} = O_{B}\, \sigma_{AB}\, O_{A}^{T} $$

where *σ*_{AB} has the principal angles between the subspaces spanned by *α*_{A} and *α*_{B} as its diagonal entries. Since those subspaces are the same, *σ*_{AB}=*I* for any two Laplacians. We then calculate *U*_{A}=*α*_{A}*O*_{A} and *U*_{B}=*α*_{B}*O*_{B}. Since *σ*_{AB} is constant, *U*_{A}=*U*_{B}, and *U* is constant for all points on the geodesic; *α* and *O* are not constant, though. Furthermore, we can use the same *U* matrix for any Laplacian of a given dynamic graph without affecting our calculations, because the span of *U* is constant. We calculate *R*=*U*^{T}*L**U* for *L*_{A} and *L*_{B}. The geodesic from *L*_{A} at *t*=0 to *L*_{B} at *t*=1 is then

$$ L(t) = U\, R_{A}^{1/2} \exp\left(t \log\left(R_{A}^{-1/2} R_{B}\, R_{A}^{-1/2}\right)\right) R_{A}^{1/2}\, U^{T} $$

If there are multiple time-sequenced snapshots, this method can be used to do a piecewise geodesic interpolation with *t* being shifted and scaled appropriately. Note that the constant Laplacian nullspace means that we can work solely with the *R* components of *L* and ignore the Grassmann component. We can also extrapolate with this geodesic simply by continuing the trajectory for *t*>1.
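This geodesic computation can be sketched in a few lines of NumPy/SciPy. The following is a minimal illustration (not production code) that assumes connected graphs, so that *U* can be taken as any fixed orthonormal complement of the all-ones vector:

```python
import numpy as np
from scipy.linalg import null_space

def sym_func(M, f):
    """Apply a scalar function f to a symmetric matrix via its
    eigendecomposition (keeps everything real for SPD inputs)."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(f(w)) @ V.T

def ai_geodesic(L_A, L_B, t):
    """Affine-invariant geodesic between connected-graph Laplacians:
    t=0 gives L_A, t=1 gives L_B; t outside [0, 1] extrapolates."""
    n = L_A.shape[0]
    ones = np.ones((n, 1)) / np.sqrt(n)
    U = null_space(ones.T)             # orthonormal basis for the range of L
    R_A = U.T @ L_A @ U                # positive-definite components
    R_B = U.T @ L_B @ U
    R_A_half = sym_func(R_A, np.sqrt)
    R_A_inv_half = np.linalg.inv(R_A_half)
    log_mid = sym_func(R_A_inv_half @ R_B @ R_A_inv_half, np.log)
    R_t = R_A_half @ sym_func(t * log_mid, np.exp) @ R_A_half
    return U @ R_t @ U.T

# Example: interpolate between a path graph and a cycle on four nodes.
def laplacian(adj):
    return np.diag(adj.sum(axis=1)) - adj

A_path = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                   [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
A_cycle = np.array([[0, 1, 0, 1], [1, 0, 1, 0],
                    [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
L_path, L_cycle = laplacian(A_path), laplacian(A_cycle)
L_mid = ai_geodesic(L_path, L_cycle, 0.5)  # Riemannian midpoint graph
```

Because the geodesic stays on the manifold, the midpoint is a valid (weighted) Laplacian: symmetric, with zero row sums.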

We can also compute averages over multiple snapshots. Consider the *R* matrices derived from the graph Laplacians as before; each graph *i* has a matrix *R*_{i} associated with it, and we want to determine the ‘average’ matrix *S* for *N* snapshots. We then list the sum-of-squared-distance function, the distance function itself, and the gradient of the squared distance (Pennec et al. 2006), respectively:

$$ f(S) = \frac{1}{2N} \sum_{i=1}^{N} d^{2}\left(S, R_{i}\right) $$

$$ d\left(S, R_{i}\right) = \left\| \log\left(S^{-1/2} R_{i} S^{-1/2}\right) \right\|_{F} $$

$$ \nabla_{S}\, d^{2}\left(S, R_{i}\right) = -2\, S^{1/2} \log\left(S^{-1/2} R_{i} S^{-1/2}\right) S^{1/2} $$

The mean *S* is the minimizer of *f*, and it can be found with the iterative gradient-descent procedure

$$ S_{k+1} = S_{k}^{1/2} \exp\left( \frac{1}{N} \sum_{i=1}^{N} \log\left(S_{k}^{-1/2} R_{i} S_{k}^{-1/2}\right) \right) S_{k}^{1/2} $$

According to Pennec et al. (2006), this usually converges quickly.
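A minimal sketch of this fixed-point iteration, following Pennec et al. (2006):

```python
import numpy as np

def sym_func(M, f):
    """Apply a scalar function f to a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(f(w)) @ V.T

def ai_mean(Rs, iters=100):
    """Karcher mean of positive-definite matrices under the affine-invariant
    metric, via the standard fixed-point update
    S <- S^{1/2} exp( mean_i log(S^{-1/2} R_i S^{-1/2}) ) S^{1/2}."""
    S = sum(Rs) / len(Rs)              # arithmetic mean as the initial guess
    for _ in range(iters):
        S_half = sym_func(S, np.sqrt)
        S_inv_half = np.linalg.inv(S_half)
        T = sum(sym_func(S_inv_half @ R @ S_inv_half, np.log)
                for R in Rs) / len(Rs)
        S = S_half @ sym_func(T, np.exp) @ S_half
    return S

# For commuting matrices, the Karcher mean reduces to the geometric mean:
# the mean of 2I and 8I is 4I.
R_1 = 2.0 * np.eye(2)
R_2 = 8.0 * np.eye(2)
S_mean = ai_mean([R_1, R_2])
```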

### Alternative Riemannian geometries

The log-Euclidean metric of Arsigny et al. (2007) provides an alternative Riemannian structure on the same matrices: distances, geodesics, and means are computed by taking matrix logarithms, performing the corresponding Euclidean operations, and mapping back with the matrix exponential. To utilize these formulae for interpolating between graphs, we would simply replace the affine-invariant geodesic with its log-Euclidean counterpart, use the log-Euclidean distance in place of the affine-invariant one, and replace the iterated mean computation with a single evaluation of the log-Euclidean mean. There are other expressions that are simpler to evaluate for the affine-invariant metric, but those quantities may not be needed, and the different invariance properties of each metric may be valuable in different circumstances.

On a practical level, the two metrics generally produce similar interpolations (Arsigny et al. 2007): the spectrum of the affine-invariant interpolations tends to be slightly more isotropic than that produced by the log-Euclidean interpolations, but both interpolate the logarithm of the determinant linearly between interpolation points (see the “Graph interpolation and averaging” section). For the rest of this paper, we will distinguish the geodesics and means calculated with the two methods as either affine-invariant (AI) or log-Euclidean (LE).
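As a minimal illustrative sketch, the log-Euclidean geodesic and mean can be computed without any iteration (formulas from Arsigny et al. (2007)):

```python
import numpy as np

def sym_func(M, f):
    """Apply a scalar function f to a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(f(w)) @ V.T

def le_geodesic(R_A, R_B, t):
    """Log-Euclidean geodesic: exp((1 - t) log R_A + t log R_B)."""
    return sym_func((1 - t) * sym_func(R_A, np.log)
                    + t * sym_func(R_B, np.log), np.exp)

def le_mean(Rs):
    """Log-Euclidean mean: a single evaluation, no fixed-point iteration."""
    return sym_func(sum(sym_func(R, np.log) for R in Rs) / len(Rs), np.exp)

# Example: for commuting SPD matrices, the LE midpoint is the geometric mean.
R_1 = 2.0 * np.eye(2)
R_2 = 8.0 * np.eye(2)
R_mid = le_geodesic(R_1, R_2, 0.5)
```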

### Disconnected graphs

The methods described in this paper currently assume that the graph in question is connected and remains so at all points of interest. As they stand, they could potentially handle a graph with a constant number of disconnected components (which would correspond to the Laplacian nullspace having a constant dimension), but this does not significantly improve the method’s generality. In order to be widely applicable, the interpolation methods need to be able to handle changing connectivity.

We can accommodate this by using a bias term with, potentially, a thresholding procedure. For a given adjacency matrix *A*, we add to each off-diagonal entry a bias term *ε*/*n*, where *ε*≪1 and *n* is the number of vertices in the graph, to produce a biased adjacency matrix \(\tilde {A}\) (which is now connected). We then construct a biased Laplacian matrix from \(\tilde {A}\), perform the interpolation on the biased Laplacian and subtract *ε*/*n* from each off-diagonal entry of the adjacency matrices produced by the biased interpolation. If need be, we can then apply a threshold to the resulting adjacency matrices or round those matrices to an appropriate number of decimal places. This approach essentially replaces the Laplacian’s *λ*=0 eigenvalues with *λ*=*ε*.
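A minimal sketch of the biasing procedure (with `eps` playing the role of *ε*):

```python
import numpy as np

def add_bias(A, eps=1e-6):
    """Add eps/n to every off-diagonal adjacency entry so that the graph
    becomes connected; the Laplacian's extra zero eigenvalues move to eps."""
    n = A.shape[0]
    return A + (eps / n) * (np.ones((n, n)) - np.eye(n))

def remove_bias(A_biased, eps=1e-6):
    """Undo the bias after interpolating (thresholding/rounding optional)."""
    n = A_biased.shape[0]
    return A_biased - (eps / n) * (np.ones((n, n)) - np.eye(n))

# Two disconnected components: nodes {0, 1} and {2, 3}.
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_tilde = add_bias(A)
L_tilde = np.diag(A_tilde.sum(axis=1)) - A_tilde
eigs = np.sort(np.linalg.eigvalsh(L_tilde))
```

The biased Laplacian retains the single zero eigenvalue from the all-ones vector, while the second zero eigenvalue (from the disconnection) moves to approximately *ε*.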

Empirically, we found that this approach did not significantly change the interpolated trajectories for connected graphs while also producing reasonable results for disconnected graphs. If we consider the properties of the Riemannian metrics discussed in this paper, we can see why adding this small bias would not significantly disturb a geodesic trajectory. With these metrics, matrices with zero or infinite eigenvalues essentially exist at infinity. For matrices with finite eigenvalues greater than zero, the distances between matrices are relative and directly tied to the matrices’ spectra. For example, the distance from *λ*=10^{−6} to *λ*=10^{−5} is comparable to the distance from *λ*=1 to *λ*=10. This means that a geodesic, which is a minimum-distance path between points, will not significantly alter the part of the spectrum associated with *λ*=*ε* values unless it is absolutely necessary to do so in order to reach the destination. Moreover, adding a fully connected graph with edge weights of *ε* would not meaningfully change the community structure because of the separation of scales (presuming a very small value of *ε*).

In our computational experiments, we found that *ε*=10^{−6} provided a good balance between avoiding ill-conditioning and keeping *ε* small, but even increasing *ε* to 10^{−3} did not change the interpolation significantly. As we increased *ε*, though, we found that the geodesic interpolations approached the trajectory of the linear interpolation; at, say, *ε*=10^{6}, they were almost identical. This, too, makes sense: as the eigenvalues become uniformly larger, the manifold becomes flatter, and the differences between the data points become smaller. The flatter the manifold, the closer the geodesic is to the linear interpolation. However, the geodesic interpolation is still guaranteed to remain positive definite, and the linear interpolation is not. This suggests that if the linear interpolation were more desirable in a particular application but the application also called for the use of extrapolation, then using a geodesic with a large bias term could provide the desired capabilities.

### Dynamic spectral clustering

It is possible to use spectral clustering with the first non-trivial eigenvector for community detection, but this method can be improved upon by using multiple eigenvectors (Boccaletti et al. 2006). This approach is convenient for continuous Laplacian dynamics because, as long as the eigenvalues are distinct, we can expect the eigenvectors and eigenvalues to vary smoothly with smooth changes in *L*. If the eigenvalues in question are not distinct, then the corresponding eigenvectors are not uniquely defined, and if eigenvalues whose eigenvectors are being used for spectral clustering cross during the course of a trajectory, the spectral clustering may experience a discontinuous jump. Disconnected graphs can exhibit exactly this kind of behaviour (e.g., with multiple zero eigenvalues). Moreover, if the number of disconnected components is not constant, then it will not suffice simply to consider the first *m* non-zero eigenvalues, for the set of such eigenvalues will not be constant.

Assume that the graphs are connected, that there is an ordering of the eigenvalues of *L* such that *λ*_{i}≤*λ*_{i+1} with *λ*_{1}=0, and that eigenvector *ξ*_{(i)} is associated with *λ*_{i}. We can then plot each of the graph nodes in ℜ^{n}, where node *k* has coordinates given by \(\left (\xi ^{k}_{(2)},\xi ^{k}_{(3)}, \ldots, \xi ^{k}_{\left (n+1\right)}\right)\), and use clustering techniques to identify communities.
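A minimal sketch of this embedding, using the first non-trivial eigenvector to separate two obvious communities:

```python
import numpy as np

def spectral_embedding(L, dims=1):
    """Coordinates for each node from the first `dims` non-trivial
    eigenvectors of the Laplacian (connected graph assumed)."""
    w, V = np.linalg.eigh(L)
    order = np.argsort(w)
    return V[:, order[1:1 + dims]]

# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# The sign of the Fiedler coordinate separates the two triangles.
coords = spectral_embedding(L, dims=1)[:, 0]
```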

If we estimate a density function *f* over the nodes plotted in this spectral space, changes in the graph’s communities can then be seen as changes in that density function. The density of a cluster is proportional to the magnitude of the density function at the peak (i.e., the cluster centroid). Community growth and contraction can be seen by points traversing community boundaries (i.e., separatrices). Birth and death correspond to the emergence or disappearance of a peak in the density function. Merging and splitting correspond to the merging and splitting, respectively, of the density function peaks. This splitting and merging correspond very closely to pitchfork bifurcations in dynamical systems; more precisely, the pitchfork bifurcation happens to the gradient flow \(\dot {\mathbf {x}} = \nabla f\). Birth and death also correspond to pitchfork bifurcations, but this is not as immediately obvious. It is a corollary of the Poincaré-Hopf theorem: creating a new maximum results in the creation of additional saddle points and/or minima (Domokos et al. 2012). To identify death, merging, or splitting, we can track the Hessian of *f*. If it becomes singular at a point, that is an indication of a potential bifurcation there. Birth may be identified in the same way, but searching the space for such a phenomenon may be more difficult than simply tracking known maxima and monitoring the Hessian at those points.

Once the spectrum has been plotted, techniques such as *k*-means clustering can identify communities. This should produce a sufficient approximation of the separatrices between maxima. However, if two eigenvectors are used, it may even be easier to identify communities visually.

## Computational experiments

### Implementation and testing procedure

To demonstrate our methods, we initially created a series of graph snapshots using a synthetic graph process. The dataset was created by generating two Erdős-Rényi (ER) random graphs with 100 nodes each, representing distinct communities, with edge probabilities of *p*_{E}=0.15 for both. We then began connecting the nodes belonging to the two communities through an inter-community edge probability of *p*_{int}≪*p*_{E}; we increased *p*_{int} all the way to *p*_{E} to simulate the distinct communities merging. Once the merger was complete, we gradually decreased *p*_{int} to simulate the splitting of a large community into smaller ones.
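A minimal sketch of such a snapshot-generation process (function and parameter names are illustrative, not from the paper's code):

```python
import numpy as np

def two_community_er(n, p_in, p_int, rng):
    """Adjacency matrix for two n-node Erdős-Rényi communities with
    intra-community edge probability p_in and inter-community
    edge probability p_int."""
    size = 2 * n
    probs = np.full((size, size), p_int)
    probs[:n, :n] = p_in           # community 1 block
    probs[n:, n:] = p_in           # community 2 block
    upper = np.triu(rng.random((size, size)) < probs, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(42)
p_in = 0.15
# Merging sequence: ramp p_int up to p_in, then back down again.
snapshots = [two_community_er(100, p_in, p_int, rng)
             for p_int in [0.01, 0.05, 0.10, 0.15, 0.10, 0.05, 0.01]]
```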

To test our methods on real-world data, we used proteomics data produced by Mitchell et al. (2013). Networks were produced by identifying subnetworks of upregulated proteins (*p*<0.05 and fold change > 1.5 compared to uninfected mocks) from the overall human protein-protein interaction network (Keshava Prasad et al. 2008). The network data indicates time-varying linkages between different proteins in human lung epithelial cells that have been infected by the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV). The proteomics network formed a relatively sparse, highly disconnected graph of 576 nodes, and we used the data snapshots at *t*=24,30,36,48,54,60, and 72, where *t* is the number of post-infection hours. Because this graph is disconnected (and severely so), we use the bias approach described in the “Disconnected graphs” section.

We implemented our methods in Python, making particular use of the matrix exponential and logarithm functions in the SciPy package. To evaluate the interpolation and averaging results for the synthetic network, we recorded connectivity measurements, spectral snapshots from interpolated and averaged Laplacians, and the total number of communities in the interpolated and averaged Laplacians. To measure connectivity, we used the logarithm (for scaling purposes) of the product of the non-zero Laplacian eigenvalues as mentioned in the “Graph interpolation and averaging” section. For the spectral snapshots, we used the eigenvectors corresponding to the first two non-trivial eigenvalues to produce plots as described in the “Dynamic spectral clustering” section. These snapshots provided an evaluation that was more qualitative than quantitative. We then used the Louvain method to perform community detection. The graph snapshots are provided in Additional file 1, and the code implementing the methods is provided in Additional file 2.

The spectral snapshots and connectivity measurements were not as useful for the proteomics network because the proteomics network was highly disconnected, but the Louvain method was still applicable for community detection. To investigate the interpolation and averaging of community structure for this network, we tracked the total number of communities, the total number of communities with at least five members, community similarity, and graph energy. Because the network was highly disconnected, the Louvain method produced many small or single-member communities. Tracking the number of communities above a certain size helped to reduce the amount of noise due to that effect. By community similarity, we mean not just the number of communities but the composition of those communities as well. It can be difficult to measure the degree of similarity between two graphs’ community structures when there are many communities and the community labelling is not consistent, but we can look at the pairwise similarity with the Rand index (Rand 1971).

The Rand index works by using a baseline or ground truth case, considering every distinct pair of nodes, and determining whether or not they are in the same community. It then looks at these same pairs in another graph of interest. If, for a given pair of nodes, the nodes are either in the same community as each other in both graphs or not in the same community as each other in both graphs, that pair gets a score of 1; otherwise it gets a score of 0, indicating a dissimilarity between the community structures of the two graphs. Summing the results over all pairs and dividing by the number of pairs yields a score between 0 and 1, where 1 indicates that the two graphs’ community structures are identical. The smaller the value, the less similar the structures are.
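A minimal sketch of this computation:

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two community assignments agree
    (both together, or both apart). 1.0 means identical structure."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs)
    return agreements / len(pairs)
```

Note that the index is invariant to community relabelling: `[0, 0, 1, 1]` and `[1, 1, 0, 0]` describe the same partition and score 1.0 against each other.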

Graph energy, *E*, is defined as the sum of the absolute values of the eigenvalues of the adjacency matrix. Given that it satisfies \(2\sqrt {m} \le E \le 2m\) for an unweighted graph with *m* edges (Brualdi 2006), we can also use it to bound the number of edges:

$$ \frac{E}{2} \le m \le \frac{E^{2}}{4} $$

and thus it gives us information about both graph spectra and graph connectivity.
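A minimal sketch of the energy computation and the resulting edge-count bounds:

```python
import numpy as np

def graph_energy(A):
    """Graph energy: sum of absolute values of adjacency eigenvalues."""
    return np.abs(np.linalg.eigvalsh(A)).sum()

def edge_bounds(energy):
    """Bounds on the edge count m implied by 2*sqrt(m) <= E <= 2m."""
    return energy / 2, energy ** 2 / 4

# 4-cycle: adjacency eigenvalues are 2, 0, 0, -2, so E = 4 with m = 4 edges.
C4 = np.array([[0, 1, 0, 1],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [1, 0, 1, 0]], dtype=float)
E = graph_energy(C4)
lo, hi = edge_bounds(E)
m = int(C4.sum() / 2)
```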

For both sets of data, we used thresholding on the edge weights to get unweighted graph equivalents. This procedure, and especially the threshold value used, was more impactful on the proteomics data than on the synthetic data.

### Synthetic graph results

Additional file 3: A video of the spectral plots created with the AI geodesic interpolation on the synthetic graph data progressing through the data snapshots in order from the initial to the final frame. (MP4 1966 kb)

At *t*=0, *t*=3, and *t*=6, we can see the distinction more clearly, as in Fig. 5.

Thresholding gives us a piecewise constant graph. The graph dynamics consist of an edge addition phase followed by an edge subtraction phase, so the thresholding parameter simply determines when that entry flips from 0 to 1 (or vice versa). If we were to use a finer time resolution, we might see a slight difference between the linear and geodesic interpolations with respect to when this transition happens, but the basic behaviour would remain the same.

In performing community detection, we found that the geodesic interpolations produced adjacency matrices with negative entries. Almost all of these entries were on the order of 0.001 to 0.01, and none were larger than 0.1. Negative edges need not be a barrier for community detection (e.g., see Traag and Bruggeman (2009)), but they can cause problems for the Louvain method, so in doing community detection, we simply set these entries to 0. This was only necessary for community detection on graphs that did not use thresholding. When using a threshold, any value equal to or below the threshold, including a negative value, was set to 0.
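A minimal sketch of this clean-up step (our own illustration; the function name and array layout are not from the paper). Without a threshold, only the small negative entries are zeroed; with a threshold, every entry at or below it, negative entries included, is set to 0 and the survivors are binarized:

```python
import numpy as np

def prepare_for_louvain(adj, threshold=None):
    """Clean an interpolated adjacency matrix before community detection.

    With no threshold, only negative entries are zeroed, leaving the
    remaining weights intact; with a threshold tau, every entry less
    than or equal to tau (including any negative entries) is set to 0
    and surviving entries are set to 1, giving an unweighted graph.
    """
    adj = np.asarray(adj, dtype=float)
    if threshold is None:
        return np.where(adj > 0.0, adj, 0.0)
    return np.where(adj > threshold, 1.0, 0.0)

# A small interpolated matrix with a spurious negative entry:
A = np.array([[0.0, 0.6, -0.01],
              [0.6, 0.0, 0.3],
              [-0.01, 0.3, 0.0]])
print(prepare_for_louvain(A))       # negative entries clipped to 0
print(prepare_for_louvain(A, 0.5))  # only the 0.6 edge survives, as 1
```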

These behaviours appeared between *t*=1 and *t*=2. When we use a threshold, they cease, as we now have a graph that is piecewise constant in time for all three interpolations.

**Table 1** Mean graph characteristics, synthetic graphs

| Mean graph | Modularity | Connectivity |
|---|---|---|
| Linear | 0.299 | 633.5 |
| Linear (thresholded) | 0.300 | 631.8 |
| AI geodesic | 0.335 | 631.3 |
| AI geodesic (thresholded) | 0.305 | 630.5 |
| LE geodesic | 0.336 | 631.3 |
| LE geodesic (thresholded) | 0.304 | 630.7 |

### Proteomics network results

In interpolating the proteomics network data, we again obtained negative adjacency matrix entries (around 5% of the total entries). The AI geodesics produced far fewer such entries than the LE geodesics (by an order of magnitude), and the AI entries were usually smaller. Of the negative entries, the largest was -0.16, but less than 1% of the negative entries had magnitudes greater than 0.01. As with the synthetic graphs, we simply set these negative entries to 0 when using the Louvain method.

The case without thresholds was more interesting. There, the linear interpolation still often produced too many communities, but the geodesic results did not uniformly produce too few communities. The LE geodesic may have been slightly better than the AI geodesic, but both produced results that looked much more reasonable than they had when we plotted the total number of communities. In fact, those results looked even more regular and smooth than the thresholded results.

To see why, consider an entry-wise linear interpolation from adjacency matrix *A*_{0} to adjacency matrix *A*_{1}. Let us denote the edges in *A*_{1} that are not in *A*_{0} with the adjacency matrix *A*_{add} and the edges in *A*_{0} that are not in *A*_{1} with the adjacency matrix *A*_{sub}. Our linear interpolation from *A*_{0} at *t*=0 to *A*_{1} at *t*=1 would then be

\(A(t) = A_{0} + t A_{\text {add}} - t A_{\text {sub}}.\)

If we apply a threshold *τ* such that matrix entries greater than *τ* are sent to 1 and entries less than or equal to *τ* are sent to 0, we get two possible interpolation patterns, each with three interpolated values. If *τ*<0.5, then

\(A(t) = \begin {cases} A_{0} & t \le \tau \\ A_{0} + A_{\text {add}} & \tau < t < 1 - \tau \\ A_{1} & t \ge 1 - \tau, \end {cases}\)

and if *τ*≥0.5, then

\(A(t) = \begin {cases} A_{0} & t < 1 - \tau \\ A_{0} - A_{\text {sub}} & 1 - \tau \le t \le \tau \\ A_{1} & t > \tau. \end {cases}\)

*A*_{0}−*A*_{sub} will be less connected than either of the interpolation end points, and if *τ*=0.5, then *A*(*t*)=*A*_{0}−*A*_{sub} only at *t*=0.5. That is why we see that spike in the number of communities.
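This three-phase behaviour is easy to verify numerically. The following sketch (our own toy example, not the authors' code) counts the distinct graphs produced by a thresholded linear interpolation with one added and one subtracted edge:

```python
import numpy as np

def thresholded_linear(A0, A1, t, tau):
    """Entry-wise linear interpolation followed by thresholding."""
    A = (1.0 - t) * A0 + t * A1
    return (A > tau).astype(float)

# One edge (0-1) is subtracted and one edge (1-2) is added over time.
A0 = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 0.]])
A1 = np.array([[0., 0., 0.], [0., 0., 1.], [0., 1., 0.]])

# Count the distinct graphs produced as t sweeps from 0 to 1. For
# tau < 0.5 the middle phase contains both edges (A0 + A_add); for
# tau >= 0.5 it contains neither (A0 - A_sub). Either way, exactly
# three distinct graphs appear.
phase_counts = {}
for tau in (0.3, 0.7):
    phases = {thresholded_linear(A0, A1, t, tau).tobytes()
              for t in np.linspace(0.0, 1.0, 101)}
    phase_counts[tau] = len(phases)
print(phase_counts)  # {0.3: 3, 0.7: 3}
```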

When we look at the thresholded results, we see that the linear interpolation consistently produces graphs with high energy values, the AI geodesic produces graphs with low energy values, and the LE geodesic is somewhere in the middle. For the LE geodesic, the best threshold value is around 0.02, where the interpolation produces a relatively steady change in graph energy from data point to data point (unlike the linear and AI geodesic interpolations, which basically plateau between points). This is consistent with what we saw in Fig. 8 and what we know about sparsity and the different interpolations.

**Table 2** Number of communities in average graph

| Interpolation type | τ = 0.02 | τ = 0.1 | τ = 0.5 | No threshold |
|---|---|---|---|---|
| Linear | 77 | 77 | 439 | 78 |
| AI geodesic | 538 | 539 | 569 | 82 |
| LE geodesic | 495 | 530 | 565 | 75 |

**Table 3** Number of communities with ≥ 5 members in average graph

| Interpolation type | τ = 0.02 | τ = 0.1 | τ = 0.5 | No threshold |
|---|---|---|---|---|
| Linear | 19 | 19 | 7 | 19 |
| AI geodesic | 3 | 3 | 0 | 16 |
| LE geodesic | 7 | 4 | 1 | 14 |

Table 3 records results congruent with those in Table 2. With the linear mean graph, we see more communities with at least five members than any of the individual graph snapshots have – again, the linear interpolation produces results with increased connectivity. The Riemannian mean graphs without thresholds produce more reasonable numbers of communities, but applying a threshold to the geodesic means severely reduces those numbers. The most reasonable result with a threshold seems to be the LE mean with a threshold of 0.02 or the linear mean with a threshold of 0.5.

**Table 4** Average similarity in community assignment

| Interpolation type | τ = 0.02 | τ = 0.1 | τ = 0.5 | No threshold |
|---|---|---|---|---|
| Linear | 0.948 | 0.948 | 0.989 | 0.945 |
| AI geodesic | 0.990 | 0.990 | 0.990 | 0.932 |
| LE geodesic | 0.989 | 0.990 | 0.990 | 0.905 |

**Table 5** Graph energy in average graph

| Interpolation type | τ = 0.02 | τ = 0.1 | τ = 0.5 | No threshold |
|---|---|---|---|---|
| Linear | 647.2 | 647.2 | 194.2 | 229.6 |
| AI geodesic | 55.9 | 49.1 | 11.5 | 19.2 |
| LE geodesic | 93.7 | 55.9 | 14.9 | 25.2 |

In concluding our observations about these averages, we note that the edge weights on the linear average graph will all be multiples of 1/7 (because there are seven data points provided), and thus there will be no difference in results for any two thresholds that both lie between \(\frac {n}{7}\) and \(\frac {n+1}{7}\). This explains why the results for threshold values of 0.02 and 0.1 are the same for the linear average, for example. The geodesic interpolations provide no such structure, and our results here suggest that low thresholds are generally required to get good results out of the geodesic interpolations.
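The arithmetic here can be checked directly. In this sketch (our own, with hypothetical random snapshots), the linear average of seven unweighted graphs has entries that are all multiples of 1/7, so the thresholds 0.02 and 0.1, which both lie between 0/7 and 1/7, yield identical graphs:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_snapshot(n=20, p=0.2):
    """A random symmetric unweighted adjacency matrix (toy data)."""
    upper = np.triu((rng.random((n, n)) < p), k=1).astype(float)
    return upper + upper.T

snaps = [random_snapshot() for _ in range(7)]
mean = sum(snaps) / 7  # every entry is a multiple of 1/7

# Any two thresholds lying between n/7 and (n+1)/7 give the same
# unweighted graph; here both 0.02 and 0.1 lie in (0/7, 1/7).
g1 = (mean > 0.02).astype(int)
g2 = (mean > 0.1).astype(int)
print(np.array_equal(g1, g2))  # True
```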

## Discussion and future work

### Interpolation error

In the Appendix, we have provided error bounds for each geodesic interpolation in terms of distance on their respective manifolds. The actual error incurred will depend on the problem in question. Manifold distance, or even entry-wise error, may not be the most important kind of error to consider for our purposes here, however; rather, we may care most about errors in the community structure.

Based on our community-related metrics (connectivity and similarity), the LE geodesic, with a threshold for the proteomics data, performed the best. The AI geodesic was too sparse and disconnected, while the linear interpolation was too connected (as expected). The optimal choice of threshold value depended on the metric being considered: 0.1 was by far the best when considering community similarity, but 0.02 was better for the other metrics under consideration. In general, the optimal threshold value will likely depend on the problem in question and the quantities of interest, but we found that the LE geodesic responded to changes in the threshold value more readily than the AI geodesic did.

The bias value we used was small (10^{−6}), but as mentioned in the “Disconnected graphs” section, increasing the bias value caused the geodesic interpolation to approach the linear one. Figure 12 shows an example of this, where increasing the bias term causes the LE geodesic to behave more and more like the linear interpolation (compare with Fig. 11). Future work may involve experimenting with different bias terms to find a happy medium between the linear and pure geodesic interpolations.

One concern about the geodesic interpolations is the transient edges that they produce – edges that do not exist in either end point but emerge and disappear during the interpolation process. The weights on these edges were small, but they could be positive or negative, and they arose in both the synthetic and proteomics data, so they are not simply an artefact of using the bias addition approach to deal with disconnected graphs. Moreover, using a low threshold means that some of these edges may not disappear when that threshold is applied, and therefore they may affect the community structure of the graph. Using a larger bias value to more closely approximate a linear interpolation may ameliorate the problem, but it would be valuable to look in more detail at why these transients occur and how to interpret them from a graph theoretic perspective. For example, does it make sense to say that the ‘shortest’ or ‘least energetic’ path from one graph snapshot to another might involve some transient edges? From the perspective of the manifold geometry, it clearly does, as the shortest path between two points *is* a geodesic, but it is not clear if the same holds true purely from a network perspective.

In short, the geodesic interpolations are not perfect, and there are still unanswered questions, but it is nonetheless clear that linear interpolation is not well suited to graph interpolation if the ultimate goal is community detection. When using a threshold, linear interpolation will always produce a piecewise constant result consisting of three phases. Without thresholding, the linear interpolation inflates overall graph connectivity, and the greater the difference between the two graphs, the greater the inflation. As an extreme example, consider interpolating between a graph with adjacency matrix *A* and a graph with adjacency matrix 1−*A*. The result ‘halfway’ between them would be a fully connected graph with edge weights of 0.5. These issues are particularly prominent when calculating averages over multiple graphs.
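A two-line check of that extreme case (our own illustration; note that the diagonal also ends up at 0.5, which amounts to uniform self-loops, but the point is the uniform inflation of connectivity):

```python
import numpy as np

# The extreme case from the text: interpolating between A and 1 - A.
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
B = 1.0 - A
halfway = 0.5 * A + 0.5 * B
print(halfway)  # every entry equals 0.5
```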

Perhaps most saliently for our purposes here, the linear interpolation did not produce steady changes in the community structure between data points – the proteomics data showed that the linear interpolation almost always had markedly fewer communities than the data points it connected. The AI geodesic produced transient edges that were smaller in magnitude and fewer in number than those of the LE geodesic, but it was also more expensive and produced graphs that were too sparse (e.g., too few communities with at least five members); the LE geodesic behaved similarly but produced better results when combined with a threshold. Similar trends held true, generally speaking, for the mean graphs as well.

### Computational cost and supporting methods

Currently, the computational cost of geodesic interpolation is high because it requires calculating matrix functions like the exponential and logarithm. The LE geodesic is noticeably faster than the AI geodesic in calculating interpolated points, though, due to the fractional matrix powers used in the latter but not the former. Furthermore, the average graph is significantly easier to calculate for the LE geodesic because it has a closed-form expression, whereas the AI geodesic mean requires an iterated numerical solution. These computational costs are not prohibitive for graphs with hundreds of nodes, but for much larger graphs – say, on the order of 10^{6} nodes – the computational cost could render our methods infeasible. One possible approach would be to project the graph Laplacians to a lower-dimensional space, perform the interpolation there, and then project back to the original space with some kind of low-rank or sparsity criterion; Riemannian optimization on matrix manifolds could be useful for determining an optimal low-rank projection (Vandereycken 2013).
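As an illustration of why the LE mean is cheap, here is a minimal sketch of the closed-form log-Euclidean mean for symmetric positive-definite matrices. This is our own code with hypothetical helper names, not the authors' implementation:

```python
import numpy as np

def logm_spd(M):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.log(w)) @ V.T

def expm_sym(M):
    """Matrix exponential of a symmetric matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

def le_mean(mats):
    """Closed-form log-Euclidean mean: the exponential of the
    arithmetic mean of the matrix logarithms."""
    logs = [logm_spd(M) for M in mats]
    return expm_sym(sum(logs) / len(logs))

# For commuting matrices, the LE mean reduces to the geometric mean:
A = np.diag([1.0, 4.0])
B = np.diag([4.0, 1.0])
print(le_mean([A, B]))  # ≈ [[2, 0], [0, 2]]
```

For graph Laplacians, which are only positive-semidefinite, the bias term described earlier (or a restriction to the positive-definite subspace) is needed before the logarithms can be taken.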

Another option would be to use graph spectral sparsification (Batson et al. 2013) to produce sparse graphs that approximate the spectrum of the original graph. We would then perform the interpolation on those sparse graphs. Given the close relationship between geodesics and spectral properties, this approach may be better suited to the geodesic interpolations than to the linear interpolation. Either way, it should be possible to come up with an error bound, in terms of the distance between the approximate and true solutions, that relates to the approximation used.

As an alternative to thresholding, it may also be possible to identify the Laplacians of unweighted graphs that are ‘closest’ to the geodesic trajectory and use them to define a kind of discrete trajectory of unweighted graphs that most closely approximates the geodesic between two unweighted graphs. This could potentially be more accurate than simply thresholding the adjacency matrix entries.

### Additional interpolation and clustering methods

Our present interpolation methods match the supplied data points exactly, but the transitions from one interpolation to another are not smooth. It may be valuable to develop more sophisticated interpolation methods that will enforce smoothness, such as polynomial and spline interpolation, using the form of the geodesic interpolations. We may not want to match the supplied graph snapshots exactly, though. Instead, we may need to come up with an approximating curve for noisy data. It is possible to define a geodesic that minimizes the sum of squared distances between it and a set of time-indexed data (much like a linear least-squares regression). We could then solve for the regression coefficients in a manner similar to the calculation of the geodesic mean. Both higher-order interpolations and least-squares interpolations are possible for the AI and LE geodesics, but they may be easier to derive and computationally cheaper for the LE versions than the AI versions. Regardless of which is used, though, the geometries in which the interpolations are embedded would ensure that the Laplacians remain positive-semidefinite and thus representative of real graphs.

There is also the option of using other Laplacians (e.g., a normalized Laplacian). Some of these Laplacians have spectral properties, such as bounded eigenvalues, that may induce better interpolation behaviour. If these Laplacians also have non-constant nullspaces, though, that would add complexity to the interpolation procedure. This would not be a significant hurdle for piecewise geodesic interpolation, but it may be problematic for graph averaging and some of the interpolation expansions described in the paragraph above. We have not yet looked at this problem in detail, however.

Finally, as mentioned previously, the Riemannian framework does not require any one particular community detection method, though it may have some natural connections to spectral clustering. Future work with the framework could include comparing different static clustering methods (either analytically or computationally) to see if there are any that would be particularly well- or ill-suited to this kind of interpolation and averaging.

## Conclusions

We described and implemented Riemannian methods for interpolating between and averaging dynamic graph snapshots. Following that, we demonstrated the use of these methods on a synthetically generated dynamic graph and an experimentally produced proteomics network and compared them with entry-wise linear interpolation. The linear interpolation increased graph connectivity between interpolation points, and we showed that when a threshold is used to produce unweighted graphs from the interpolation, the entry-wise linear approach will always produce a three-phase piecewise constant result.

The geodesic interpolations created using the Riemannian methods produced graphs with linearly varying connectivity when applied to connected graph snapshots and produced decreased connectivity between interpolation points when applied to disconnected graph snapshots. We found that using a low threshold on the edge weights improved our results on the disconnected graphs. However, these interpolations produced transient edges (with small positive and negative weights). One area of future work will be to investigate why this behaviour occurs and interpret it in graph theoretic terms. Choosing larger bias values when applying these methods to disconnected graphs may improve the quality of the interpolation, from the perspective of graph connectivity, and it may also reduce the presence of transient edges as well.

Other significant next steps for this work include developing techniques for applying our work to significantly larger graphs and expanding upon our current interpolation methods to produce the Riemannian analogues of polynomial interpolation, spline interpolation, and least-squares regression.

## Appendix

### Error estimate calculations

A linear interpolation *y*(*x*) of a function *f*(*x*) from *x*_{0} to *x*_{1} has an error bound of

\(\max _{x} \left |f(x) - y(x)\right | \le \frac {1}{8} \left (x_{1} - x_{0}\right)^{2} \max _{x} \left |f^{\prime \prime }(x)\right |.\)

The same idea applies to trajectories: consider a trajectory **x**(*t*) and its approximation **y**(*t*), *t*∈[0,1], from which we can calculate an error **z**(*t*):

\(\mathbf {z}(t) = \mathbf {x}(t) - \mathbf {y}(t).\)

We cannot say that a linear interpolation will always have the smallest amount of error, but a linear interpolation would have \(\ddot {\mathbf {y}} = 0\), so we would expect it to have a smaller error bound than an arbitrary nonlinear interpolation (i.e., one not using higher-order derivative information).

#### AI Geodesic

Consider a trajectory *X*(*t*) through the positive-definite subspace of the Laplacian and an AI geodesic interpolation *Y*(*t*) between *X*(0)=*R*_{0} and *X*(1)=*R*_{1}:

\(Y(t) = R_{0}^{\frac {1}{2}} \left (R_{0}^{-\frac {1}{2}} R_{1} R_{0}^{-\frac {1}{2}}\right)^{t} R_{0}^{\frac {1}{2}}.\)

*Ω* and *Ψ* commute with each other and with powers of each other (including negative powers) because they have the same eigenvectors. Traces of matrix products are also constant under cyclic permutations of those products. We will use these properties to derive an expression for \(\text {tr}\left (\Omega \dot {\Omega }\right)\) using this matrix commutativity and Greene’s results on traces of matrix products (Greene 2014):

Because *Ψ*^{−1} and *Ω* commute, and *C* (and its powers) commutes with exp(*Ct*), \(\dot {Y}_{g} Y^{-1}_{g}\) is constant in time. For the entry-wise linear interpolation, however, there is no closed-form expression for \(\dot {Y}_{l}\) or \(Y^{-1}_{l}\); *Y*_{l} is the positive-definite component of the interpolated Laplacian. In general, though, \(\dot {Y}_{l} Y^{-1}_{l}\) will *not* be constant in time. We can then consider the second derivative of the original distance function:

For the geodesic, \(\frac {d}{dt} \left (\dot {Y}_{g} Y^{-1}_{g}\right) = 0\), but this will not be the case for the entry-wise linear interpolation. Also note the recurrent \(X^{-\frac {1}{2}} \Omega X^{-\frac {1}{2}}\) term: \(X^{\frac {1}{2}} \Omega X^{\frac {1}{2}}\) is the vector from *X* to *Y* (Pennec et al. 2006), so \(X^{-\frac {1}{2}} \Omega X^{-\frac {1}{2}}\) is essentially a measure of trajectory discrepancy rescaled by *X*.

Extrapolating from *R*_{1} to *t*>1 using an AI geodesic built by interpolating from *R*_{0} to *R*_{1} would produce an error bound of the same form, with the derivatives as previously calculated.

#### LE geodesic

Several of the terms in \(\frac {d}{dt} \left (\text {tr} \left (\Omega \dot {\Omega }\right)\right)\) will therefore be zero for the LE geodesic. As such, we would expect the error from the LE geodesic to be less than the error from the entry-wise linear interpolation, for the same reasons that we would expect the AI geodesic error to be smaller than that of the entry-wise linear interpolation. We can then plug these results into Eq. 40 to get error bounds for the LE geodesic.

## Declarations

### Acknowledgements

The authors would like to thank Jason McDermott for providing the proteomics data used in this study.

### Funding

This work was funded by the Microbiomes in Transition (MinT) Initiative at the Pacific Northwest National Laboratory.

### Availability of data and materials

The datasets supporting the conclusions of this article are included in the article’s additional files.

### Authors’ contributions

CB proposed the Riemannian framework, performed the mathematical derivations, implemented the methods, generated results, and wrote the main body of the paper. MH obtained the proteomics data, provided the community similarity metric, co-wrote the literature review, offered comments and corrections on the manuscript, and suggested reviewers. AS proposed the bias method, wrote code to assist in the community detection, co-wrote the literature review, generated the synthetic data, offered comments and corrections on the manuscript, and suggested reviewers. All authors have read and approved the manuscript.

### Competing interests

The authors declare that they have no competing interests.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- Absil, PA, Mahony R, Sepulchre R (2007) Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton, New Jersey.
- Arsigny, V, Fillard P, Pennec X, Ayache N (2007) Geometric means in a novel vector space structure on symmetric positive-definite matrices. SIAM J Matrix Anal Appl 29(1):328–347.
- Batson, J, Spielman DA, Srivastava N, Teng SH (2013) Spectral sparsification of graphs: Theory and algorithms. Commun ACM 56:87–94.
- Blondel, V, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp. P10008. http://iopscience.iop.org/article/10.1088/1742-5468/2008/10/P10008/pdf.
- Boccaletti, S, Latora V, Moreno Y, Chavez M, Hwang DU (2006) Complex networks: structure and dynamics. Phys Rep 424:175–308.
- Boguná, M, Papadopoulos F, Krioukov D (2010) Sustaining the internet with hyperbolic mapping. Nat Commun 1:62.
- Bonnabel, S, Sepulchre R (2009) Riemannian metric and geometric mean for positive semidefinite matrices of fixed rank. SIAM J Matrix Anal Appl 31:1055–1070.
- Boothby, WM (1986) An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, Orlando.
- Brualdi, RA (2006) Energy of a graph. In: Notes to AIM Workshop on Spectra of Families of Matrices Described by Graphs, Digraphs, and Sign Patterns.
- Cazabet, R, Amblard F (2014) Dynamic Community Detection. In: Alhajj R, Rokne J (eds) Encyclopedia of Social Network Analysis and Mining, 404–414. Springer, New York.
- Domokos, G, Sipos AR, Szabó T (2012) The mechanics of rocking stones: equilibria of separated scales. Math Geosci 44:71–89.
- Fenn, DJ, Porter MA, Mucha PJ, McDonald M, Williams S, Johnson NF, Jones NS (2012) Dynamical clustering of exchange rates. Quant Finan 12(10):1493–1520.
- Fortunato, S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174.
- Girvan, M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826.
- Greene, J (2014) Traces of matrix products. Electron J Linear Algebra 27:716–734.
- Harris, JM, Hirst JL, Mossinghoff M (2008) Combinatorics and Graph Theory. Springer, New York.
- Keshava Prasad, T, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, et al. (2008) Human protein reference database—2009 update. Nucleic Acids Res 37(suppl_1):767–772.
- Krioukov, D, Papadopoulos F, Vahdat A, Boguñá M (2009) Curvature and temperature of complex networks. Phys Rev E 80:035101.
- Krioukov, D, Papadopoulos F, Kitsak M, Vahdat A, Boguñá M (2010) Hyperbolic geometry of complex networks. Phys Rev E 82:036106.
- Lambiotte, R, Delvenne JC, Barahona M (2014) Random walks, Markov processes and the multiscale modular organization of complex networks. IEEE Trans Netw Sci Eng 1(2):76–90.
- Mitchell, HD, Eisfeld AJ, Sims AC, McDermott JE, Matzke MM, Webb-Robertson BJM, Tilton SC, Tchitchek N, Josset L, Li C, et al. (2013) A network integration approach to predict conserved regulators related to pathogenicity of influenza and SARS-CoV respiratory viruses. PLoS ONE 8(7):e69374.
- Mucha, PJ, Richardson T, Macon K, Porter MA, Onnela JP (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980):876–878.
- Newman, MEJ (2010) Networks: An Introduction. Oxford University Press, Oxford, United Kingdom.
- Nguyen, NP, Dinh TN, Shen Y, Thai MT (2014) Dynamic social community detection and its applications. PLoS ONE 9(4):1–18.
- Pennec, X, Fillard P, Ayache N (2006) A Riemannian framework for tensor computing. Int J Comput Vis 66:41–66.
- Rand, WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850.
- Tantipathananandh, C, Berger-Wolf TY (2011) Finding communities in dynamic social networks. In: 2011 IEEE 11th International Conference on Data Mining, 1236–1241. IEEE, Vancouver.
- Traag, VA, Bruggeman J (2009) Community detection in networks with positive and negative links. Phys Rev E 80(3):036115.
- Vandereycken, B (2013) Low-rank matrix completion by Riemannian optimization. SIAM J Optim 23:1214–1236.