Path homologies of motifs and temporal network representations

Path homology is a powerful method for attaching algebraic invariants to digraphs. While there have been growing theoretical developments on the algebro-topological framework surrounding path homology, bona fide applications to the study of complex networks have remained stagnant. We address this gap by presenting an algorithm for path homology that combines efficient pruning and indexing techniques and using it to topologically analyze a variety of real-world complex temporal networks. A crucial step in our analysis is the complete characterization of path homologies of certain families of small digraphs that appear as subgraphs in these complex networks. These families include all digraphs, directed acyclic graphs, and undirected graphs up to certain numbers of vertices, as well as some specially constructed cases. Using information from this analysis, we identify small digraphs contributing to path homology in dimension two for three temporal networks in an aggregated representation and relate these digraphs to network behavior. We then investigate alternative temporal network representations and identify complementary subgraphs as well as behavior that is preserved across representations. We conclude that path homology provides insight into temporal network structure, and in turn, emergent structures in temporal networks provide us with new subgraphs having interesting path homology.

structures (Carlsson 2009). Moreover, such methods have been applied on network datasets to discover global structures that could not be easily obtained through standard, statistical mechanics-driven approaches to the study of complex networks (Petri et al. 2013). Crucial to these discoveries was the early development of computational libraries (Adams et al. 2014) that enabled network scientists to rapidly test new mathematical tools and discover new questions that may be answered using such methods.
The particular flavor of homology that is currently best known across disciplines is simplicial homology (Giusti et al. 2016). Here the basic objects are simplices (points, edges, triangles, tetrahedra, and so on) that encode relationships beyond the dyadic relationships captured by the edges of a graph. These simplices assemble together to form a structure called a simplicial complex, which in turn provides a principled method for representing complex shapes (Edelsbrunner and Harer 2010). Simplicial homology then searches for the presence of holes or cavities in the network that may signify regions where small subgroups of nodes participate in correlated activity, but without overall consensus in the region of interest (Sizemore et al. 2019).
A caveat, however, is that simplicial homology is not immediately compatible with directed graphs (digraphs). In the linear-algebraic representation central to simplicial homology, directed edges of the form a → b and b → a are assigned to the same vector subspace, leading to a potential loss of information (Chowdhury and Mémoli 2018). For example, an email from an employee to a superior is semantically different from an email from a superior to the employee. A remedy is provided by path homology, a version of homology defined on directed graphs that was developed in Grigor'yan et al. (2014a) and associated works (Grigor'yan et al. 2012, 2014b, 2015, 2017, 2018a). Path homology resolves the (a → b)-vs-(b → a) situation by assigning different vector subspaces to each direction. Moreover, path homology generalizes simplicial homology in the following sense. Given a simplicial complex S representing a shape, there is a natural ordering from lower to higher order simplices, given by the subset relation σ ⊆ τ for simplices σ, τ (e.g. a point belonging to an edge, or an edge belonging to a triangle). A digraph G_S can then be constructed by taking the simplices as nodes, with directed edges σ → τ representing inclusion relations. The path homology of G_S is then naturally isomorphic to the simplicial homology of the original complex S (Grigor'yan et al. 2014a). Additionally, path homology serves as a general framework for computing global structures in arbitrary digraphs via linear algebra. The tradeoff for this added generality is that the intuition for path homology can be more involved than finding non-local loops and cavities in a network. However, a useful (if imperfect) interpretation is that path homology measures the consistency and robustness of directional flow in a digraph.
Specifically, in a digraph with the architecture of a multilayer perceptron (MLP), i.e. layers of nodes with unidirectional edges across consecutive layers (also referred to as a deep feedforward neural network), path homology is positively related to both the width of the layers (robustness) and the agreement of edge directions (consistency).
Towards grounding the preceding discussion in a concrete application, consider the problem of examining control flow in computer programs. Control flow of code refers to the order in which information is passed among variables to carry out operations, and it can naturally be represented by a graphical structure that can in turn be quantified by various metrics to predict defects and points of failure. One popular metric is cyclomatic complexity (McCabe 1976; Ebert et al. 2016), which measures the number of linearly independent paths in the control flow graph. More explicitly, cyclomatic complexity is calculated via the simplicial homology of a graph. It has been shown that path homology is a stronger analogue of cyclomatic complexity, one that captures the natural directionality of control flow graphs that cyclomatic complexity ignores (Huntsman 2020).
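As a rough illustration of the metric mentioned above, the following sketch computes McCabe's cyclomatic complexity M = E − N + 2P for a small control flow graph, where P is the number of weakly connected components. The function name and the example graph are hypothetical and serve only to make the formula concrete. For a connected graph without parallel edges, E − N + 1 is the first Betti number of the underlying undirected graph, which is the sense in which the metric is homological.

```python
def cyclomatic_complexity(num_nodes, edges):
    """McCabe's M = E - N + 2P for a control flow graph.

    P, the number of weakly connected components, is found with union-find.
    """
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    components = len({find(x) for x in range(num_nodes)})
    return len(edges) - num_nodes + 2 * components


# Hypothetical control flow graph: an if/else followed by a loop.
# 0: entry, 1: then-branch, 2: else-branch, 3: loop header, 4: loop body, 5: exit
cfg_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 3), (3, 5)]
print(cyclomatic_complexity(6, cfg_edges))  # 7 - 6 + 2 = 3
```

Note that the value is unchanged if any arc is reversed, which is the directional information that path homology retains.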
For further applications, multiscale versions of path homology have been developed in Dey et al. (2020) and Lin et al. (2019). In sum, however, the empirical application of path homology to large-scale data structured as digraphs remains largely unexplored. The bottleneck for such exploration is the lack of methodological developments at different stages of the pipeline, including algorithmic development, implementation, application to real-world networks, and post hoc analysis of the insights contributed by path homology.
To help bridge this perceptual gap between algebraic and combinatorial structure in service of analyzing real-world networks, we elaborate on an earlier conference paper. After reviewing the basics of path homology in § 2, we present an algorithm and implementation for computing path homology in arbitrary dimension in § 3. In § 4, we then use this algorithm to compute the path homologies of (1) all digraphs on ≤ 4 vertices, (2) all directed acyclic graphs on ≤ 6 vertices, (3) all undirected graphs on ≤ 6 vertices, (4) Erdős-Rényi random graphs, and (5) small digraphs that exhibit torsion. These examples help develop intuition about path homology and yield digraph families whose path homology has surprising behavior. We then use this algorithm in § 5-7 to compute path homologies of digraphs that represent the same three real-world temporal networks in complementary ways. Here we identify salient subgraphs with exemplars that appear in § 4 and relate them to broader network behavior within a given representation. In § 8 we perform additional analysis on these real-world temporal networks using popular existing network measures such as density and clustering coefficient to highlight differences from path homology. Finally, we discuss gross network behaviors that are preserved across representations and make concluding remarks in § 9.

Related literature
Parallel to path homology and its intermediate constructions, there have been recent developments in extending notions related to standard (i.e. simplicial) homology to account for directionality in the graph setting. Such advances started with the use of directed flag complexes in Reimann et al. (2017) and continued with theoretical, algorithmic, and empirical developments in Turner (2019), Lütgehetmann et al. (2020), and Gebhart and Funk (2020). Further developments of related ideas have appeared in Méndez and Sánchez-García (2020) and Bergomi et al. (2020).

Path homology
Here we sketch the basic elements of path homology as treated in Grigor'yan et al. (2012). Although our development is nominally self-contained, a reader who wants background in topology (e.g. the foundational theory of simplicial homology, which shares many similarities with path homology) is referred to Ghrist (2014) and Hatcher (2001).

Let X be a finite set, and let F be a field. A standard algebraic construction is the free F-vector space on X, i.e. a vector space with standard basis $\{e_x : x \in X\}$. We denote this space by $F^X \cong F^{|X|}$, and additionally set $F^{\emptyset} := \{0\}$. Next let D = (V, A) be a loopless digraph, and consider the sets $V^{p+1}$ of (p+1)-tuples of vertices. The non-regular boundary operator $\partial_{[p]} : F^{V^{p+1}} \to F^{V^p}$ is the linear map whose action on the standard basis is

$\partial_{[p]} e_{(v_0, \dots, v_p)} := \sum_{j=0}^{p} (-1)^j e_{(v_0, \dots, \hat{v}_j, \dots, v_p)}$,

where the hat marks an omitted entry. A few lines of algebra focused on index bookkeeping show that $\partial_{[p-1]} \circ \partial_{[p]} \equiv 0$, so that $(F^{V^{p+1}}, \partial_{[p]})$ is a chain complex (Ghrist 2014), as depicted in Fig. 1. The "boundary of a boundary is zero" condition admits a topological interpretation: as an example, consider that a 2-dimensional disk has boundary given by a circle, but the circle itself has no boundary.
Any chain complex $(C_p, \partial_p)$ gives rise to an algebraic invariant called homology. The invariance property of homology is that it behaves nicely with respect to maps on the chain complex that are induced by an underlying transformation of a common structure (here, a digraph). Writing $Z_p := \ker \partial_p$ and $B_p := \operatorname{im} \partial_{p+1}$, i.e. the kernel and image of the boundary maps, the dimension p homology of the chain complex $(C_p, \partial_p)$ is the quotient

(2) $H_p := Z_p / B_p$.

$H_p$ is a finitely generated abelian group, and therefore has the form $\mathbb{Z}^{\beta_p} \oplus T_p$, where the torsion $T_p$ is a finite abelian group. The topological interpretation of this construction is that the $\mathbb{Z}^{\beta_p}$ and $T_p$ terms correspond to "voids" and "twistedness" in a topological space, respectively. When computed over a field F, the $C_p$ are vector spaces and the torsion is zero. In this setting the Betti numbers $\beta_p := \dim H_p = \dim Z_p - \dim B_p$ completely characterize homology up to isomorphism.

Fig. 1 Schematic of a chain complex. Here $C_p$, $B_{p-1}$, and $Z_p$ are respectively the domain, image, and kernel of $\partial_p$, which conspire to make the homology $H_p := Z_p / B_p$ well defined
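The identity β_p = dim Z_p − dim B_p reduces, via the rank-nullity theorem, to β_p = dim C_p − rank ∂_p − rank ∂_{p+1}. The following minimal sketch applies this over the reals to a hollow triangle, a simplicial example chosen purely for familiarity. The function name `betti_numbers` and the dictionary layout of the boundary matrices are illustrative assumptions, not part of the implementation described in this paper.

```python
import numpy as np


def betti_numbers(boundaries, dims):
    """Betti numbers of a chain complex of real vector spaces.

    boundaries[p] is the matrix of d_p : C_p -> C_{p-1} (missing keys are zero maps),
    dims[p] is dim C_p. Uses beta_p = dim C_p - rank d_p - rank d_{p+1}.
    """
    ranks = {p: (np.linalg.matrix_rank(m) if m.size else 0)
             for p, m in boundaries.items()}
    return [dim - ranks.get(p, 0) - ranks.get(p + 1, 0)
            for p, dim in enumerate(dims)]


# Hollow triangle: vertices 0, 1, 2 and edges (0,1), (0,2), (1,2).
# Column for edge (u, v) encodes its boundary e_v - e_u.
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]])
print(betti_numbers({1: d1}, dims=[3, 3]))  # one component, one loop
```

These are unreduced Betti numbers, so a connected example reports β_0 = 1.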
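Returning to the loopless digraph D, consider the set $A_p(D)$ of allowed p-paths:

$A_p(D) := \{(v_0, \dots, v_p) \in V^{p+1} : (v_{j-1}, v_j) \in A \text{ for } 1 \le j \le p\}$,

together with the space of ∂-invariant p-paths $\Omega_p := \{\omega \in F^{A_p} : \partial_{[p]} \omega \in F^{A_{p-1}}\}$. The (non-regular) path complex of D is accordingly defined to be the chain complex $(\Omega_p, \partial_p)$, where $\partial_p := \partial_{[p]}|_{\Omega_p}$. The (non-regular) path homology of D is just the homology of the path complex $(\Omega_p, \partial_p)$.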
The trivial path homology of $D_2$ can be attributed to w and z appearing as "bottlenecks" that prevent robustness of the flow. This intuition is formalized by results in Chowdhury et al. (2019).

For convenience, henceforth we generally replace the path complex $(\Omega_p, \partial_p)$ with its reduction

$\cdots \to \Omega_p \xrightarrow{\partial_p} \Omega_{p-1} \to \cdots \to \Omega_1 \xrightarrow{\partial_1} \Omega_0 \xrightarrow{\partial_0} F \to 0$.

Assuming the original complex is nondegenerate and using an obvious notational device, this has the minor effect $\tilde{\beta}_p = \beta_p - \delta_{p0}$, where the Kronecker delta $\delta_{jk} := 1$ iff j = k and $\delta_{jk} := 0$ otherwise.

Path homologies of the mutual dyad subgraphs
As another example with practical relevance that will be exhibited in § 5, we use path homology to characterize a family of network motifs that we call the n-uplinked mutual dyads (or dually, the n-downlinked mutual dyads) in reference to the original terminology from Milo et al. (2002). Given an integer n ≥ 1, the n-uplinked mutual dyad $W_n$ is a digraph with vertex set {a, b, 1, 2, ..., n} and edge set $\{(a,b), (b,a)\} \cup \{(a,i), (b,i) : 1 \le i \le n\}$ (cf. Fig. 8). The n-downlinked mutual dyad is obtained by reversing all the arcs (cf. Fig. 7). From the lens of path homology, the number of uplinks (resp. downlinks) contributes robustness to an upward (resp. downward) flow. This intuition is formalized as follows: the reduced path homology of the n-uplinked (resp. n-downlinked) mutual dyad has $\beta_2 = n - 1$ and $\beta_p = 0$ for $p \ne 2$.

To see this, first note that $\partial_2(e_{(a,b,i)} + e_{(b,a,i)}) = e_{(a,b)} + e_{(b,a)}$, and so $e_{(a,b)} + e_{(b,a)}$ cannot contribute to $\beta_1$. Similarly, terms of the form $e_{(a,b)} + e_{(b,i)} - e_{(a,i)} \in Z_1$, 1 ≤ i ≤ n, cannot contribute to $\beta_1$ as they belong to $B_1$, being the images of $e_{(a,b,i)}$ for 1 ≤ i ≤ n.
Next suppose p ≥ 3: then $\Omega_p = \{0\}$, since all the 3-paths have boundaries with non-allowed paths, and taking linear combinations does not eliminate these non-allowed paths.

Finally suppose p = 2. Since $\partial_2(e_{(a,b,i)} + e_{(b,a,i)}) = e_{(a,b)} + e_{(b,a)}$ does not depend on i, all 2-paths of the form $e_{(a,b,i)} + e_{(b,a,i)} - e_{(a,b,j)} - e_{(b,a,j)}$ belong to $Z_2$, but not to $B_2$ as $\Omega_3$ is trivial. Some linear algebra shows that the collection $\{e_{(a,b,1)} + e_{(b,a,1)} - e_{(a,b,j)} - e_{(b,a,j)} : 2 \le j \le n\}$ is a basis for $Z_2$. It follows that $\beta_2 = n - 1$.

To conclude the proof, we note that the above arguments hold for the downlinked mutual dyad by replacing terms of the form $e_{(a,b,i)}$ with $e_{(i,a,b)}$.
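The value β_2 = n − 1 can be checked numerically. The sketch below is not the implementation referenced in this paper; it is a small dense-matrix illustration, under the assumption that the space of ∂-invariant p-paths is the kernel of the non-allowed rows of the boundary matrix and that ranks over the reals are found by singular value decomposition. All function names are hypothetical, and the Betti numbers returned are unreduced, so a connected digraph reports β_0 = 1.

```python
import numpy as np


def allowed_paths(vertices, arcs, pmax):
    """Return [A_0, ..., A_pmax]: allowed p-paths as vertex tuples."""
    succ = {v: [] for v in vertices}
    for u, v in arcs:
        succ[u].append(v)
    paths = [[(v,) for v in vertices]]
    for _ in range(pmax):
        paths.append([q + (w,) for q in paths[-1] for w in succ[q[-1]]])
    return paths


def null_space(m, tol=1e-9):
    """Columns form a basis of ker(m); handles empty shapes."""
    rows, cols = m.shape
    if cols == 0:
        return np.zeros((0, 0))
    if rows == 0:
        return np.eye(cols)
    _, s, vt = np.linalg.svd(m)
    return vt[int((s > tol).sum()):].T


def rank(m, tol=1e-9):
    if m.size == 0:
        return 0
    return int((np.linalg.svd(m, compute_uv=False) > tol).sum())


def path_betti(vertices, arcs, pmax):
    """Unreduced non-regular path Betti numbers beta_0..beta_pmax over the reals."""
    A = allowed_paths(vertices, arcs, pmax + 1)
    dim_omega, rank_d = [len(A[0])], [0]  # Omega_0 = F^V and d_0 = 0
    for p in range(1, pmax + 2):
        prev = {q: i for i, q in enumerate(A[p - 1])}
        extra, bad_entries = {}, []
        good = np.zeros((len(prev), len(A[p])))
        for col, path in enumerate(A[p]):
            for j in range(p + 1):
                face, sign = path[:j] + path[j + 1:], (-1) ** j
                if face in prev:  # allowed face
                    good[prev[face], col] += sign
                else:             # non-allowed face
                    bad_entries.append((extra.setdefault(face, len(extra)), col, sign))
        bad = np.zeros((len(extra), len(A[p])))
        for r, c, s in bad_entries:
            bad[r, c] += s
        omega = null_space(bad)  # basis of invariant p-paths
        dim_omega.append(omega.shape[1])
        rank_d.append(rank(good @ omega))
    return [dim_omega[p] - rank_d[p] - rank_d[p + 1] for p in range(pmax + 1)]


def uplinked_mutual_dyad(n):
    vertices = ["a", "b"] + list(range(1, n + 1))
    arcs = [("a", "b"), ("b", "a")]
    arcs += [(x, i) for i in range(1, n + 1) for x in ("a", "b")]
    return vertices, arcs


for n in (1, 2, 3, 4):
    print(n, path_betti(*uplinked_mutual_dyad(n), 3))
```

Dense matrices limit this sketch to very small digraphs, which is enough for a motif-level sanity check.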

Algorithm
To compute non-regular path homology, one needs to produce the requisite paths and perform the necessary linear algebra. However, performing this computation efficiently is a challenge that is addressed by our implementation, available at Yutin (2020). Our method introduces some nuances and is apparently among the first for dimension > 1 (see footnote 4), so we outline our approach here.
First, for efficiency we remove nonbranching limbs (i.e., chains of vertices of total degree 2 that terminate in leaves of degree 1), since these do not affect homology by Theorem 5.1 of Grigor'yan et al. (2012). We also exploit Proposition 3.25 of Grigor'yan et al. (2012) by decomposing the graph into weak components before computing homology componentwise. For each component D, we extend an order on vertices V(D) to order paths lexicographically. We construct $A_p(D)$ for $0 \le p \le p_{\max}$ inductively. Starting from $A_0(D) = V(D)$, we construct $A_p(D)$ by appending every vertex that has an arc from the terminal vertex of a path in $A_{p-1}(D)$. The paths are produced in lexicographical order for each p.
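The componentwise decomposition and inductive path construction described above can be sketched as follows. This is a minimal illustration with hypothetical function names, not the referenced implementation, and it omits limb pruning. Extending sorted paths by sorted successors preserves lexicographic order at every level.

```python
def weak_components(vertices, arcs):
    """Partition vertices into weakly connected components (direction ignored)."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in arcs:
        parent[find(u)] = find(v)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(g) for g in groups.values())


def allowed_paths(vertices, arcs, pmax):
    """Return [A_0, ..., A_pmax] for the sub-digraph on `vertices`,
    each level listed in lexicographic order."""
    members = set(vertices)
    succ = {v: set() for v in vertices}
    for u, v in arcs:
        if u in members and v in members and u != v:  # drop loops
            succ[u].add(v)
    succ = {v: sorted(ws) for v, ws in succ.items()}
    levels = [[(v,) for v in sorted(vertices)]]
    for _ in range(pmax):
        # append every successor of the terminal vertex of each shorter path
        levels.append([q + (w,) for q in levels[-1] for w in succ[q[-1]]])
    return levels


arcs = [(0, 1), (1, 2), (0, 2), (3, 4)]
for comp in weak_components(range(5), arcs):
    print(comp, allowed_paths(comp, arcs, 2))
```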
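Next, using a radix-|V(D)| expansion, we compute the indices that specify the inclusion $A_p \hookrightarrow V^{p+1}$ under lexicographical ordering. We then construct (in the standard basis) the matrix representation $\partial_{[p,A]}$ of the boundary operator restricted to $F^{A_p}$, along with the matrix $\nabla_{[p,A]}$ of its rows indexed by non-allowed paths, so that $\Omega_p = \ker \nabla_{[p,A]}$. To produce $\Omega_p$ as efficiently as possible, we remove rows of $\nabla_{[p,A]}$ that are identically zero before computing this kernel. Once we have produced a matrix representation $\Omega_{[p,A]}$ for the kernel above, we get the chain boundary operator $\partial_p$ (i.e., the projection of $\partial_{[p,A]}$ onto $F^{A_{p-1}}$, projected onto and restricted to the invariant space).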
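The radix expansion mentioned above amounts to reading a path as the digits of a base-|V| number. The sketch below assumes vertices are relabeled 0, ..., |V| − 1 and uses a hypothetical helper name; it cross-checks the index against an explicit lexicographic enumeration of all tuples.

```python
from itertools import product


def radix_index(path, n):
    """Position of `path` among all tuples in range(n)^len(path), lexicographically."""
    idx = 0
    for v in path:
        idx = idx * n + v  # Horner evaluation of the base-n expansion
    return idx


n = 3
all_tuples = list(product(range(n), repeat=3))  # V^3 in lexicographic order
allowed = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]     # e.g. 2-paths of a directed 3-cycle
indices = [radix_index(q, n) for q in allowed]
print(indices)
```

The point of the index is that it avoids ever materializing all |V|^(p+1) tuples: only the allowed paths and their faces need to be located.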
Finally, we compute the homology of this chain complex, using the rank-nullity theorem and a singular value decomposition to compute the matrix ranks and in turn obtain the Betti numbers. We compute representatives for homology groups (without assurances that the representatives are the best possible) as cokernels of $[\ker \partial_p]^T \partial_{p+1}$. To find torsion over Z, we compute the Smith normal form of boundary matrices as in usual practice (cf. Fig. 5; see also footnote 5).

In the worst case, our algorithm requires $O(|V|^{2p+1})$ memory (to store the boundary matrices) and $O(|V|^{(p+1)\omega})$ runtime (to compute matrix ranks), where the matrix multiplication complexity exponent ω is 3 in practice. These estimates are usually overly pessimistic (most digraphs analyzed in practice are very different from complete digraphs), though the memory requirements are still exponential in p unless the digraph being analyzed is acyclic, in which case the exponential relationship only holds up to the length of the longest path. Nevertheless, it is still essential to limit computations to fairly small p and use natural filtrations (e.g., time, weight, etc.) to isolate portions of ambient digraphs. While it would be ideal in this vein to compute persistent path homology (Lin et al. 2019; Dey et al. 2020), no practical algorithms for this are known in dimension > 1.

4 Though Shajii (2013) and Slawinski (2013) significantly predate our implementation, we were unaware of these until after we had produced our implementation, and we could not find any published work drawing on them.

5 To optimize further, we could preprocess the digraph in accord with Theorem 5.7 of Grigor'yan et al. (2012), though this would raise its own issues in low dimension. Instead of performing a singular value decomposition on rather large boundary matrices to compute their ranks, we could recursively construct invariant spaces: each sub-path of an invariant path is itself an invariant path (since we only consider loopless digraphs). A simple approach in this vein might be to check every pair of paths in dimension p against every vertex to see where we can append 'triangles' and 'squares' in the sense of Grigor'yan et al. (2012). While promising, this approach generates too many paths, and reducing to a basis is computationally nontrivial. For low dimensions, we could also directly compute Betti numbers from the digraph itself (cf. Proposition 3.24 of Grigor'yan et al. (2012)).

Small digraphs.
We show digraphs on four vertices with $\beta_p > 0$ for p > 1 in Fig. 3. Surprisingly, four vertices are enough for nontrivial homology to occur even in dimension three. Meanwhile, in Fig. 4, the left panel shows directed acyclic graphs (DAGs) on six vertices with $\beta_2 > 0$. Examining these DAGs led us to formulate and prove a conjecture about the path homology of DAGs that model the connectivity of deep feedforward neural networks (Chowdhury et al. 2019) and characterize temporal networks in the representation of Pósfai and Hövel (2014), as detailed in § 6. Finally, the right panel of Fig. 4 shows undirected graphs (which we treat as digraphs with arcs in both directions) on six vertices with $\beta_2 > 0$. This highlights that path homology is also relevant for the analysis of undirected graphs.

Torsion.
Though nominally defined over fields, path homology makes sense over rings, e.g. Z , with scarcely any modifications required. This is a more powerful invariant, as it gives rise to torsion. Yutin (2019) was able to identify the digraphs in Fig. 5 by sampling Erdős-Rényi digraphs and carefully decomposing an instance with nonzero torsion.
These digraphs are the smallest members of a family that we conjecture always exhibits torsion: larger members can be formed by taking a longer central unidirected closed path and linking each of its vertices to one of two external "polar" vertices (in an alternating fashion). Specifically, we conjecture that digraphs in this family with central paths of length 2n have torsion subgroups Z/nZ in $H_1$. (We have computationally verified this conjecture for n ≤ 8.) These digraphs seem to be analogues of so-called lens spaces that are formed by gluing two tori together with a twist, and which are archetypal examples of spaces with torsion. While we do not pursue this analogy further in the current work, we note that such lens spaces have recently been used for nonlinear topological dimension reduction (Polanco and Perea 2019).

Fig. 3 (L) $\beta_2 > 0$ for these six (of 218 total) digraphs on four vertices. In each case it turns out that $\beta_p = \delta_{p,2}$. As a "suspension" of the 2-cycle, the digraph in the upper left can be thought of as an analogue of a homology 2-sphere obtained by gluing two cones along a common equator. (R) $\beta_3 > 0$ for these five digraphs on four vertices. In each case it turns out that $\beta_p = \delta_{p,3}$. Continuing the geometrical-topological analogy from before, the upper middle digraph in the right panel is akin to that in the upper left of the left panel with its "poles" glued together by a circular path in another dimension, thus giving rise to homology in dimension 3
In Fig. 6 we show empirical distributions for Betti numbers of Erdős-Rényi random graphs (Frieze and Karoński 2016) on four nodes. Further knowledge of these distributions for different numbers of nodes would provide a useful method for testing if a stochastic digraph generating process could be described via an Erdős-Rényi model.
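A sampler for the random digraph model used here is short enough to state explicitly. The sketch below assumes the model in which each of the n(n − 1) possible arcs between distinct vertices is included independently with probability q; the function name and the seeding convention are illustrative. Tabulating Betti numbers over many samples would additionally require a path homology routine, which is not repeated here.

```python
import random
from itertools import permutations


def sample_er_digraph(n, q, rng):
    """Sample a loopless digraph on range(n): each ordered pair of distinct
    vertices becomes an arc independently with probability q."""
    return [(u, v) for u, v in permutations(range(n), 2) if rng.random() < q]


rng = random.Random(0)  # fixed seed so runs are repeatable
for q in (0.25, 0.5, 0.75):
    sizes = [len(sample_er_digraph(4, q, rng)) for _ in range(1000)]
    print(q, sum(sizes) / len(sizes))  # mean arc count should be near 12 * q
```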

Applications to window-aggregated temporal networks
We analyze three temporal networks (Holme 2015). In this section, we represent the three DCNs above by aggregating the activity over a time window into a single digraph which then varies as the window moves. In later sections, we will examine the same three networks, but via different network representations. In both this and later sections, we exhibit high-order interactions identified using path homology that are respectively indicators of dilution, recurring motifs, and concentration within network behavior that is preserved across the various representations.

The common "bow tie" motif here appears to be the cause for emergence of 2-homology in transportation networks as capacities are filtered (this will be elaborated on in future work); meanwhile, polygons with ≥ 5 sides have too many sides for paths in opposing directions to "destructively interfere." That is, although these graphs are undirected, the directed paths of length 4 through them exhibit more coherence than in other graphs of the same size

MathOverflow
We used the answer-to-question portion of the sx-mathoverflow DCN available at Leskovec and Krevl (2014). In our analysis, we considered contacts within a time window of 24 hours, moving every eight hours. We aggregated contacts in a given window into a static digraph and computed the first three Betti numbers (see footnote 6).

Fig. 6 Empirical distributions of $\beta_p(D_{4,q})$, where $D_{n,q}$ is the Erdős-Rényi random digraph on n vertices with probability q of a given arc occurring

6 NB. Our path homology code removes any loops from digraphs.
Only two (immediately adjacent and overlapping) windows, over 13-14 Oct 2009, had $\beta_2 > 0$. Inspecting the homology representatives revealed an underlying motif, viz., the 2-downlinked mutual dyad (cf. Sec. 2.1). The particular questions and answers involved are shown in Fig. 7, which also highlights the dyad. This (effectively) single occurrence of 2-homology happened very early in the history of MathOverflow: in fact, just two weeks after the website launched. As MathOverflow evolved, interactions on it also diluted. For example, most of the first 200 users asked and answered many fewer questions over time, while the overall size of and activity on MathOverflow grew much larger (not shown here). One consequence of this dilution of activity is that opportunities for tightly coupled patterns of questions and answers to occur diminished, with 2-homology serving as an indicator of this phenomenon.
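The time-window aggregation used for this network can be sketched as follows, assuming contacts are triples (source, target, timestamp) and that each window is the half-open interval [start, start + width). The function name and the toy contacts (with timestamps in hours) are hypothetical. Loops are dropped, matching the behavior noted for the path homology code.

```python
def time_windows(contacts, width, step):
    """Yield (start, arcs) for sliding windows [start, start + width).

    contacts: iterable of (source, target, timestamp); arcs: set of (source, target).
    """
    ordered = sorted(contacts, key=lambda c: c[2])
    if not ordered:
        return
    start, t_end = ordered[0][2], ordered[-1][2]
    while start <= t_end:
        arcs = {(s, t) for s, t, tau in ordered
                if start <= tau < start + width and s != t}  # drop loops
        yield start, arcs
        start += step


# Toy contacts with timestamps in hours; 24-hour windows moving every 8 hours.
contacts = [(1, 2, 0), (2, 1, 10), (2, 3, 30), (3, 3, 31)]
for start, arcs in time_windows(contacts, width=24, step=8):
    print(start, sorted(arcs))
```

This version rescans all contacts for each window, which is simple but quadratic in the worst case; a two-pointer sweep over the sorted contacts would be the natural refinement for large networks.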

An email network
The presence of linked dyads is actually ubiquitous and generalized in email networks, because of well-known behavior common to the medium (e.g., multiple people sending to a common mailing list). We isolated this behavior in our analysis of the email-Eu-core-temporal network available at Leskovec and Krevl (2014). This network has 986 vertices and 332334 directed temporal contacts; it spans 804 days of activity.
We considered contacts within a window of the most recent 100 contacts (emails), moving every 50 contacts. As before, we aggregated contacts in a given window into a static digraph and computed the first three Betti numbers. Many windows exhibited high values of β 2 that we traced to occurrences of the n-uplinked mutual dyad (cf. Sec. 2.1) motif shown in Fig. 8. The underlying dynamics is common: two people ("Alice" and "Bob") both send email to the same wide distribution and to each other.
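Windows defined by a number of contacts rather than a duration can be sketched in the same spirit. The function below assumes contacts are triples (source, target, timestamp) and that a window is a run of `size` consecutive contacts in time order, advanced by `step` contacts; the name and the synthetic contact list are illustrative only.

```python
def contact_count_windows(contacts, size=100, step=50):
    """Yield the arc set of each window of `size` consecutive contacts,
    advancing by `step` contacts; loops are dropped."""
    ordered = sorted(contacts, key=lambda c: c[2])
    last_start = max(len(ordered) - size, 0)
    for i in range(0, last_start + 1, step):
        yield {(s, t) for s, t, _ in ordered[i:i + size] if s != t}


# Synthetic example: 250 contacts cycling among 5 vertices.
contacts = [(k % 5, (k + 1) % 5, k) for k in range(250)]
windows = list(contact_count_windows(contacts))
print(len(windows), sorted(windows[0]))
```

Unlike fixed-duration windows, this choice keeps the number of contacts per digraph constant, so bursts of activity translate into shorter elapsed time per window rather than denser digraphs.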

A Facebook group
As a final example using the window-aggregated DCN representation, we considered the first 1000 days of activity on a Facebook group (Viswanath et al. 2009;Kunegis 2013) starting from 14 September 2004 and ending on 11 June 2007. We aggregated this DCN (13295 vertices; 187750 contacts) into daily digraphs with no sliding windows because it has a daily lull with virtually no activity. Fig. 9 shows the number of posts per day and the first three Betti numbers. Besides an obvious correlation between network activity and β 0 , it is also evident that progressively more and higher-dimensional homology classes appear over time. This emergent higher-order network structure indicates concentration of activity. Fig. 10 shows the first daily digraph with β 2 > 0 . Because this sort of concentration of activity is tied to just a few specific loci (i.e., appropriate homology representatives in the corresponding time windows), statistical or other straightforwardly quantitative network measures are unlikely to capture it de novo. Though it is probably the case that such measures can be reverse-engineered, this misses the point of a qualitative measure: i.e., robustness in the face of network perturbations or even (up to a point) differing representations.

Layered representations
A "layered" representation of DCNs that naturally leads to rich path homology characterizations is that of Pósfai and Hövel (2014). The essential idea is to represent a contact of the form (s, t, τ) as an arc from (s, j) to (t, j + 1), where time is discretized into bins and j indexes the bin containing τ. This representation casts DCNs in a structural light virtually identical to that of weight-filtered multilayer perceptrons (MLPs) as discussed in Chowdhury et al. (2019), where it was shown that homology generically occurs in dimension up to L − 1, where L indicates the number of time bins or layers being considered at a given time. However, while trained weight-filtered MLPs exhibit high-dimensional path homology generically because neural activations propagate across layers by design, the equivalent phenomenology in DCNs signals potential propagation of information (or whatever the DCN models) that is highly correlated across time windows in a way that there is no a priori reason to expect.
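The layered construction can be sketched in a few lines. Since the precise binning is not restated here, the sketch assumes uniform bins of duration `dt` starting at `t0`, so that a contact at time τ falls in bin j = ⌊(τ − t0)/dt⌋; both parameter names and the function name are hypothetical.

```python
def layered_arcs(contacts, t0, dt):
    """Map each contact (s, t, tau) to an arc ((s, j), (t, j + 1)),
    where j is the index of the (assumed uniform) time bin containing tau."""
    arcs = set()
    for s, t, tau in contacts:
        j = int((tau - t0) // dt)
        arcs.add(((s, j), (t, j + 1)))
    return arcs


contacts = [(1, 2, 0.5), (2, 3, 1.2), (3, 3, 1.9)]
for arc in sorted(layered_arcs(contacts, t0=0.0, dt=1.0)):
    print(arc)
```

Every arc points from one layer to the next, so the result is acyclic, and a self-contact such as (3, 3, 1.9) becomes an ordinary arc between two copies of the same vertex rather than a loop.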
Tellingly, the path homology of DCNs in this layered representation again indicates gross dynamics of activity dilution (for the MathOverflow DCN); time-invariance (for the email DCN); or concentration (for the Facebook DCN). High-dimensional homology occurs (if at all) only in rare episodes where network participants engage in highly correlated activity that has structural significance, e.g. co-clustering of question/answer behavior for MathOverflow; (presumably) organizational cliques for email; and joint self- and cross-posting for Facebook, respectively.
In detail, for each DCN in the previous section, we employ a time discretization into bins of duration δt , and a sliding window of duration d · δt , where d indicates the top dimension for which we compute β d and the window slides by δt . This alignment of dimension and temporal parameters ensures that we compute precisely the homologies which might be nontrivial. Here we use the preceding discussion on path homologies of MLPs, which suggests that for d layers, as in the sliding window of duration d · δt , we expect only trivial path homology in dimensions beyond d.
For the MathOverflow network (see Figs. 11 and 12), there is no evidence of 2-homology for reasonable choices of time discretization and window duration, so we set δt = 24 hr and d = 1 . Here β 1 is nonzero only transiently, shortly after the network began, in line with the general thrust of activity "dilution" mentioned previously. The absence of 2-homology can be attributed to an asymmetry between questioners and answerers: in any given window, these two sets do not overlap enough for the layers of the digraph representation to produce MLP-like subgraphs.
For the email network, we set δt = 1 hr and d = 2. Here the network behavior is roughly time invariant (apart from a prolonged lull towards the end of the data) and several windows give rise to 2-homology (Figs. 13 and 14). This is apparently due to shared to/cc lists in emails that are related in subject matter and localized in time.
For the Facebook network, we set δt = 24 hr and d = 2 . Here β 2 is nonzero only once, when two users self-post and interact with each other. Meanwhile, the increase of β 0 and β 1 (cf. Fig. 15) over time is consistent with the general thrust of activity "concentration" mentioned previously.

Temporal digraph representations
Yet another representation of DCNs is that of Cybenko and Huntsman (2019). Here, the entire network is losslessly encoded into a temporal digraph in which arcs are all either "spatial" (i.e., between vertices in the aggregated digraph) or "temporal" (i.e., connecting a vertex at one time to itself at a later time). A temporal digraph and the associated notion of a temporally coherent path are indicated in Fig. 16.
Our investigations suggest that temporal digraphs are uninteresting from the point of view of higher homology.

Conjecture 1
The temporal digraph of a DCN has β p = 0 for p > 1.
The idea of the conjecture (which is based more on intuition and extensive experimentation than exhaustive computation per se) is that there are no "diagonal" connections from a vertex at one time to a different vertex at a different time, which in turn constrains the algebra tightly enough that there are no linear combinations of p-paths of the kind required for nontrivial phenomenology with p ≥ 2. For example, none of the digraphs in Fig. 3 can be subgraphs of the temporal digraph of a DCN, so any production of homology for p ≥ 2 must involve a subgraph on > 4 vertices. On the other hand, for p = 2, the participating paths must be of length 3, and thus cannot involve more than four vertices each. In other words, higher homology would have to arise in an intrinsically distributed way. Although Conjecture 1 was formulated based on small, synthetically generated temporal digraphs, it was borne out in analyses of the same data sets (using the same time windows) as in earlier sections. While at some points the resulting temporal digraphs were large enough to preclude computing $\beta_2$ due to the size of the boundary matrices involved, we always observed $\beta_2 = 0$ for temporal digraphs of DCNs. These results are illustrated in Figs. 17 and 18.

We complement Conjecture 1 with a computationally supported conjecture about 1-homology in temporal digraphs with two (spatial) vertices.

Fig. 12 In the layered representation of the MathOverflow DCN, there are three windows with $\beta_1 > 20$. Here we depict the weak components contributing nontrivial homology in dimension one for each of these windows. Node labels indicate user IDs; subscripts indicate bins within the window. Essentially, the dynamics here are that groups of users answer multiple questions within a window in a way that overlaps. As the network matured and diluted, this tightly coupled behavior disappeared almost entirely
Conjecture 2 Let C be a DCN with two vertices and contacts at times in [T] for T ∈ {3, 4, ...}. The maximum possible value of β₁ for the corresponding temporal digraph is T − 1, achieved for the eight DCNs with contacts in alternating directions and possibly both directions at times 1 and T. If there are no contact pairs of the form (1, 2, t) and (2, 1, t) for t ∈ [T], then β₁ (for the temporal digraph representation) counts the number of times that contacts change direction.
The idea behind this conjecture is that "flips break squares" (where "squares" are in the sense of Grigor'yan et al. (2012)) and contribute to β₁, whereas instantaneous back-and-forth contacts other than in the first and last instances prevent such contributions by adding enough terms to cause algebraic cancellations.
[Figure caption] As Fig. 13 indicates, there are 18 windows in the layered representation of the email DCN with β₂ > 0. Here, we show for each instance the corresponding subgraphs on vertices that participate in 2-homology representatives (it turns out in these particular instances that these subgraphs all have only a single weak component). These subgraphs are all very similar to the fully-connected deep feedforward networks (the difference here being that some of the layers are not fully connected, e.g. the graph in row 2, column 2) whose homology was analytically characterized in Chowdhury et al. (2019), and more generally are of the sort that would be encountered in weight-induced filtrations of multilayer perceptrons.
[Figure caption, beginning truncated] … (1, 4, τ₁), (5, 4, τ₂), (2, 5, τ₃), (4, 3, τ₄)} with τ₁ < τ₂ < τ₃ < τ₄. We indicate a temporally coherent path from C-vertices 1 to 3 (with bold versus gray arrows). By comparison, there is no temporally coherent path from 2 to 3. Temporal (resp. spatial) arcs are horizontal (resp. vertical); temporal fibers are vertices along horizontal paths.
[Figure caption, beginning truncated] … In the former case, we used a window of eight hours (this is different than in other representations, and was done for the sake of scaling), whereas in the latter case we used a window of 24 hours, as in other representations. Note that in the latter case β₀ is also the same as for the MLP representation. For some later times with both networks, β₂ could not be computed due to memory constraints, but was always zero when computed, including near the beginning of the MathOverflow DCN. Recomputing the Betti numbers for p < 2 required just a few minutes.
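To make the last claim of Conjecture 2 concrete, the following minimal Python sketch counts direction changes in a two-vertex contact sequence. It only computes the quantity that the conjecture predicts β₁ to equal; it does not compute path homology. The function name `direction_changes` and the encoding of contacts as (source, target, time) triples are assumptions made for illustration.

```python
def direction_changes(contacts):
    """Count how often consecutive contacts between two vertices flip direction.

    `contacts` is an iterable of (source, target, time) triples on vertices
    1 and 2. Simultaneous contacts (the case excluded in the last sentence of
    Conjecture 2) are rejected rather than handled.
    """
    ordered = sorted(contacts, key=lambda c: c[2])
    times = [t for _, _, t in ordered]
    if len(set(times)) != len(times):
        raise ValueError("simultaneous contacts are outside the scope of this sketch")
    directions = [(s, t) for s, t, _ in ordered]
    # A flip occurs whenever a contact's direction differs from the previous one.
    return sum(1 for prev, cur in zip(directions, directions[1:]) if prev != cur)


# Strictly alternating contacts at times 1..5 (T = 5) flip T - 1 = 4 times.
alternating = [(1, 2, 1), (2, 1, 2), (1, 2, 3), (2, 1, 4), (1, 2, 5)]
print(direction_changes(alternating))  # 4

# Repeated contacts in one direction never flip.
print(direction_changes([(1, 2, 1), (1, 2, 2), (1, 2, 3)]))  # 0
```

With one contact per time step, a strictly alternating sequence attains T − 1 flips, which matches the maximum value of β₁ stated in the conjecture.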
It might be tempting to consider a slightly different temporal network representation that allows "diagonal" arcs of the sort that seem necessary to support nontrivial path homology for p > 1. For example, we might replace a "temporal" arc followed by a "spatial" arc with a single "diagonal" arc. Such a representation indeed gives rise to nontrivial path homology for p > 1. However, this representation is neither robust nor meaningful unless the timestamps can actually differentiate between back-and-forth contacts that are merely close together in time and those that are exactly simultaneous, since these two cases can give different homologies. Meanwhile, if exact simultaneity is possible, the temporal network must actually inhabit discrete (if granular) time, and in this event the temporal digraph representation is essentially the same as the layered representation discussed earlier, albeit with self-transitions automatically included. Taking the other observations of this section into account, we therefore see that the temporal digraph representation and its siblings either do not yield substantive insight, or require still further nuance to apply fruitfully.

Comparison with other activity measures
Thus far, we have performed extensive analysis of the dependence of path homology on the chosen representation of a temporal network. For completeness, we now compare path homology against two popular network measures (density and clustering coefficient) to show that path homology captures essentially different features of network data from these classical measures.
Recall that the density of a network is the ratio of the number of actual connections to the number of possible connections. For a digraph D = (V, A), the density is given by |A| / (|V| · (|V| − 1)). Also recall that the clustering coefficient measures how often the neighbors of a vertex are themselves neighbors. For the directed setting, we appeal to the notion of clustering coefficient developed in [?]. Following [?], we temporarily write A to denote the adjacency matrix (instead of the set of arcs), d_j to denote the total (i.e., in- plus out-) degree of vertex j, and d_j^↔ to denote the number of vertices that have arcs both to and from j. Then the (directed) clustering coefficient, denoted C, is given by (6).
Overall, we find that density and clustering coefficient are generally uncorrelated with path homology Betti numbers (cf. Figs. 19, 21, and 22). There is one key exception to this rule (cf. Fig. 20): C and β₂ are fairly strongly correlated when mutual dyad motifs W_n are present either in large numbers or for n large. The reason for this is simple: C essentially measures the number of triangles, and in the event above, there are many triangles to be found.
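As an illustration of the two measures, the following NumPy sketch computes the density |A| / (|V| · (|V| − 1)) and a directed clustering coefficient written in the notation above. The clustering formula used here, the vertex average of ((A + Aᵀ)³)_jj / (2 [d_j (d_j − 1) − 2 d_j^↔]), is a standard directed clustering coefficient that fits this notation. We assume, but have not verified, that it agrees with the coefficient used in this section. The function names are hypothetical.

```python
import numpy as np


def density(adj):
    """Density |A| / (|V| (|V| - 1)) of a loop-free digraph (0/1 adjacency matrix)."""
    a = np.asarray(adj, dtype=float)
    n = a.shape[0]
    return a.sum() / (n * (n - 1))


def directed_clustering(adj):
    """Vertex average of ((A + A^T)^3)_jj / (2 (d_j (d_j - 1) - 2 d_j^<->)).

    Assumed formula (see the lead-in); vertices with a zero denominator
    contribute a local coefficient of 0.
    """
    a = np.asarray(adj, dtype=float)
    sym = a + a.T
    # Closed 3-walks through each vertex in the symmetrized multigraph.
    triangles = np.diagonal(np.linalg.matrix_power(sym, 3))
    d_tot = a.sum(axis=0) + a.sum(axis=1)  # in-degree plus out-degree
    d_bi = np.diagonal(a @ a)              # neighbors joined by arcs in both directions
    denom = 2.0 * (d_tot * (d_tot - 1.0) - 2.0 * d_bi)
    local = np.divide(triangles, denom, out=np.zeros_like(triangles), where=denom > 0)
    return local.mean()


# Three vertices with all six arcs present: every triangle type is closed.
mutual_triangle = np.ones((3, 3)) - np.eye(3)
print(density(mutual_triangle), directed_clustering(mutual_triangle))  # 1.0 1.0

# Directed 3-cycle 0 -> 1 -> 2 -> 0.
cycle = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
print(density(cycle), directed_clustering(cycle))  # 0.5 0.5
```

Both quantities are scalar summaries of a single digraph, so evaluating them on each window yields time series that can be set beside the Betti numbers.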
However, C and β₂ are totally uncorrelated in layered representations of temporal networks for the equally simple reason that, by construction, the digraphs involved never have triangles. Meanwhile, in each temporal network considered here, the density and Betti numbers are also essentially uncorrelated (not shown here except for the MathOverflow network, which is broadly phenomenologically representative in this respect).
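The triangle-free claim can be checked numerically. The sketch below assumes that a layered digraph has arcs only from one layer to the next; this is our reading of "by construction" here, not a definition taken from this section. Under that assumption the underlying undirected graph is bipartite (even versus odd layers), so it has no closed walks of length three and any triangle-based clustering coefficient vanishes. The helper names and the arc probability `p` are hypothetical.

```python
import numpy as np


def layered_adjacency(layer_sizes, rng, p=0.7):
    """Random digraph whose arcs go only from layer k to layer k + 1 (assumed structure)."""
    offsets = np.concatenate(([0], np.cumsum(layer_sizes)))
    n = int(offsets[-1])
    adj = np.zeros((n, n), dtype=int)
    for k in range(len(layer_sizes) - 1):
        rows = slice(int(offsets[k]), int(offsets[k + 1]))
        cols = slice(int(offsets[k + 1]), int(offsets[k + 2]))
        # Keep each possible arc between consecutive layers with probability p.
        adj[rows, cols] = rng.random((layer_sizes[k], layer_sizes[k + 1])) < p
    return adj


def closed_3_walks(adj):
    """Trace of (A + A^T)^3: zero exactly when the underlying graph has no triangles."""
    sym = adj + adj.T
    return int(np.trace(np.linalg.matrix_power(sym, 3)))


rng = np.random.default_rng(0)
layered = layered_adjacency([3, 4, 3, 2], rng)
print(closed_3_walks(layered))  # 0, regardless of which arcs were sampled
```

The result does not depend on the random seed, since the argument uses only the layer structure.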
While a comprehensive comparison to network measures beyond density and clustering coefficient is out of the scope of the current paper, we make a few additional remarks about the general landscape of such statistical network measures before concluding this section. A reader more comfortable with, e.g., network centralities or motif analysis may wonder why we have not sought to compare them to the techniques of this paper. The reason is that centralities and motif analysis (along with density and clustering coefficient) are intrinsically quantitative measures focused on the statistical properties of nodes, edges, and subgraphs in toto, while path homology is an intrinsically qualitative measure focused on generic characteristics of individual subgraphs that may or may not be present in data. Meanwhile, clustering or community detection in digraphs also answers a fundamentally different question (viz., when/where are interconnected regions?) than we address (viz., when/where are certain directionally coordinated activities?).

Remarks
As a generalization of simplicial homology, path homology holds the promise to advance the state of the art in applications of algebraic topology to network science. In this work, we have addressed two critical bottlenecks in the application of path homology: (1) fast computation, and (2) intuition for the phenomena that it captures. Specifically, we have developed an efficient implementation (Yutin 2020) and reported on an exhaustive library of computational examples that we have in turn used to explain the subgraphs in real-world networks that drive path homology activity. Looking to the future, our work could be a precursor to follow-up works that combine network dictionary learning (Lyu et al. 2020; Xu 2020; Vincent-Cuaz et al. 2021) with topological signatures (Gómez and Mémoli 2021).
Fig. 19 Clustering coefficient (6) and density for the windowed representation of the MathOverflow temporal network, with instances of β₂ > 0 circled.
[Figure caption, beginning truncated] … (6) and (right) density versus … for the windowed representation of the email temporal network. The strong correlation between β₂ and C is entirely due to the particular structure of the mutual dyad motifs that contribute to each activity measure. In other temporal network representations, or when these motifs are not dominant, these activity measures will not remain correlated (cf. Figs. 19 and 21).
[Figure caption, beginning truncated] … (6) and (right) density for the windowed representation of the Facebook temporal network. The Betti number is essentially uncorrelated to either traditional activity measure.
[Figure caption, beginning truncated] … (6) and (right) density for the layered representation of the MathOverflow temporal network. The clustering coefficient is always trivial for such representations and the density is essentially uncorrelated to β₁.
Toward a comprehensive analysis of real-world directed contact networks (DCNs), we employed three different views of the data: window-aggregated, layered, and temporal digraph representations. From our experiments it is evident that the structure and even existence of path homology-carrying subgraphs can vary greatly depending on the network representation. However, one key takeaway is that the bulk dynamics indicated by the time series of Betti numbers are more robust to the particulars of the representation, especially in the case of window-aggregated and layered representations. We also note that although the computational requirements for path homology scale exponentially with homological dimension, even the two-dimensional case can highlight salient network structure and behavior. By using the window-aggregated or layered representations, path homology can be successfully brought to bear in this regard, illuminating both the subgraphs with nontrivial path homology and the temporal networks themselves.
We also observed that although temporal digraph representations have an obvious advantage of efficiently representing temporally coherent paths, currently they do not appear capable of yielding nontrivial path homology in higher dimensions. The possibility of developing a similar but robust and meaningful temporal network representation that can encode temporal coherence while also admitting nontrivial path homology in higher dimensions is an interesting avenue for future work. Overall, our findings suggest that different network representations of DCNs yield different perspectives of the data (see Singh et al. (2007); Lum et al. (2013) for related approaches), and considering families of path homology profiles built on top of a variety of network representations (cf. ) may provide the most comprehensive understanding of the data.