 Research
 Open Access
Multiplex graph matching matched filters
Applied Network Science volume 7, Article number: 29 (2022)
Abstract
We consider the problem of detecting a noisy induced multiplex template network in a larger multiplex background network. Our approach, which extends the graph matching matched filter framework of Sussman et al. (IEEE Trans Pattern Anal Mach Intell 42(11):2887–2900, 2019) to the multiplex setting, utilizes a multiplex analogue of the classical graph matching problem to use the template as a matched filter for efficiently searching the background for candidate template matches. The effectiveness of our approach is demonstrated both theoretically and empirically, with particular attention paid to the potential benefits of considering multiple channels in the multiplex framework.
Introduction and background
Multilayer networks have proven to be useful models for capturing complex relational data where multiple types of relations are potentially present between vertices in the network (Boccaletti et al. 2014; Kivelä et al. 2014). Classical, single layer (or monoplex) networks have proven popular for representing complex interactions (i.e., edges) amongst a collection of objects of interest (vertices). In the multilayer setting, there are multiple vertex layers and edges within a layer can encode different relationship/interaction types. Multilayer graphs are often distinguished by the structure across layers; in the present work we consider multiplex networks where interlayer relations can only exist amongst vertices representing the same object across the layers. For example, in connectomes (i.e., brain networks) different edge layers/modalities can represent different synapse types between neurons (e.g., chemical versus electrical gap junction synapses amongst the neurons in the C. elegans connectome, White et al. 1986); in social networks different edge modalities can capture relationships/friendships in different social network platforms (Goga et al. 2015); in scholarly networks, different edge modalities can capture coauthorship across multiple classification categories (Ng et al. 2011). Inasmuch as there is complementary signal across the layers in the multilayer network (which is the case in many applications, see for example Mucha et al. 2010; Kivelä et al. 2014; Chen et al. 2016), leveraging the signal across the different layers of the network can lead to better, more robust performance than working within any single network modality. In this paper, we address the question of how we can practically (and theoretically) leverage the signal across multiple network layers in the inference setting of noisy subgraph detection.
Formally, the noisy subgraph detection problem can be stated as follows. For integer \(n>0\), let \(\mathcal {G}_n\) be the space of all n-vertex labeled networks; note that unless otherwise specified, all graphs considered will be unweighted. Given a dissimilarity \(d:\mathcal {G}_m\times \mathcal {G}_m\mapsto \mathbb {R}\) (so that d is nonnegative, \(d(g,g)=0\) for all \(g\in \mathcal {G}_m\), and small (resp., large) values of d are indicative of highly similar (resp., dissimilar) networks), a template subgraph \(g\in \mathcal {G}_m\), and a background graph \(h\in \mathcal {G}_n\) with \(n>m\), the goal is to find the subgraphs \(g'\in \mathcal {G}_m\) of h minimizing \(d(g,g')\). Common dissimilarities d considered in the literature include variants of graph edit distance (Ebsch 2020) and the graph matching distance (Chan and Cheney 2020; Sussman et al. 2019) (see “The graph matching distance” section for detail). The \(\min _{g'\subset h}d(g, g')=0\) setting (where \(g'\subset h\) denotes \(g'\) is a subgraph of h) is closely related to the subgraph isomorphism problem—given a template g, determine if an isomorphic copy of g exists in the larger network h and find the isomorphic copy (or copies) if it exists. This NP-complete problem has been the subject of voluminous research, with notable results in the setting where g is small or structured. Common approaches include those based on efficient tree search (Ullmann 1976), color coding (Alon et al. 1995, 2008), graph homomorphisms (Fomin et al. 2012), rule-based/filter-based matchings (Cordella et al. 2004; Moorman et al. 2018), constraint satisfaction (Larrosa and Valiente 2002; Zampelli et al. 2010), among others; for a survey of the literature circa 2012, see Lee et al. (2012).
The potential to have \(\min _{ g'\subset h}d(g,g')>0\) (this is what is meant by “noisy”; note this is also referred to as inexact matching in the literature) for a given template separates the noisy subgraph detection task from the classical subgraph isomorphism problem, as this accounts for the reality that relatively large, complex subgraph templates may only errorfully occur in the larger background network. These errors may be due to missing edges/vertices in the template or background, and arise in a variety of real data settings (Priebe et al. 2015). When g is large and complex and/or \(\min _{g'\subset h}d(g,g')>0\), there are relatively fewer methods in the literature for attacking this problem; see, for example, Tu et al. (2020), Sussman et al. (2019), Du et al. (2017), DePiero and Krout (2003), Lladós et al. (2001). Moreover, for many d, in the setting of complex templates g there is often a single element (up to vertex reordering) in \(\mathop {\mathrm{arg}\,\mathrm{min}}\nolimits _{ g'\subset h}d(g,g')\), and the noisy subgraph detection problem reduces to detecting this unique minimizer.
The inference task we consider in this manuscript is the problem of detecting (possibly multiple copies of) a noisy multiplex template network in a multiplex background network. Letting \(\mathcal {M}^c_n\) be the set of all c-layer, n-vertex multiplex networks (see Definition 1), we can define the problem formally as follows. Given a dissimilarity \(d:\mathcal {M}^c_m\times \mathcal {M}^c_m\mapsto \mathbb {R}\) (so that d is nonnegative, \(d(\mathbf{g} ,\mathbf{g} )=0\) for all \(\mathbf{g} \in \mathcal {M}^c_m\), and small (resp., large) values of d are indicative of highly similar (resp., dissimilar) networks), a template subgraph \(\mathbf{g} \in \mathcal {M}^c_m\), and a background graph \(\mathbf{h} \in \mathcal {M}^c_n\) with \(n>m\), the goal is to find the subgraphs \(\mathbf{g} '\in \mathcal {M}^c_m\) of \(\mathbf{h}\) minimizing \(d(\mathbf{g} ,\mathbf{g} ')\). With the rise of network modeling of complex, multilayer data, there has been a recent emphasis on multiplex inference methods. This is, indeed, the case with the noisy multiplex subgraph detection problem, as there have been numerous recent papers dedicated to solving this problem; see, for example, Yang and Liu (2016), Takes et al. (2017), Kopylov and Xu (2019), Kopylov et al. (2020), Moorman et al. (2021, 2018). This multilayer subgraph detection problem presents a host of additional challenges versus its single layer counterpart, including how to adapt monoplex dissimilarities to the multilayer setting (see Kivelä and Porter 2017 for the challenges in defining the general multilayer graph isomorphism), and how to appropriately leverage the signal across the multiple layers rather than more streamlined approaches that collapse the network to a single layer.
Our approach to approximately solving the multiplex noisy subgraph problem adapts the graph matching matched filter (GMMF) method proposed in Sussman et al. (2019) to the multiplex setting. The basic idea of the GMMF is to use the template graph \(\mathbf{g}\) to scan the background for similarly (according to d) structured subgraphs in \(\mathbf{h}\); this is similar in spirit to the matched filters for pattern detection in image analysis (e.g., Chaudhuri et al. 1989) and the convolutional filters in image processing and convolutional neural networks (Krizhevsky et al. 2012; O’Shea and Nash 2015). In our Multiplex GMMF (which we denote MGMMF), the scanning mechanism determining how the template is used to search the background is a multiplex adaptation of the Frank-Wolfe (Frank and Wolfe 1956) based FAQ algorithm of Vogelstein et al. (2014) (see “Multiplex FAQ” section for more detail). For the details of MGMMF, see “Multiplex graph matching matched filters” section. In the process of developing the MGMMF, we introduce a multiplex graph matching problem (see “The multiplex graph matching distance” section) that allows us to appropriately define a dissimilarity d that seeks to minimize the number of disagreements between \(\mathbf{g}\) and \(\mathbf{g} '\) (i.e., optimal induced subgraph match) or maximize the number of common edges across \(\mathbf{g}\) and \(\mathbf{g} '\) (i.e., optimal subgraph match).
In addition to the algorithmic development above, we further posit a statistical framework for embedding a noisy copy of \(\mathbf{g}\) into \(\mathbf{h}\) that allows us to study an analogue of multiplex graph matchability; see “Multiplex template matchability” section. In this model, we demonstrate the theoretical appropriateness of multiplex matched filters for detecting the noisy induced subgraph as well as the benefit of considering multiple channels for improved graph deanonymization. We further demonstrate these theoretical results in a number of simulation and real data settings. In “Experiments” section, we apply our multiplex matched filter approach in detecting a hidden template in a multiplex social media network and we compare MGMMF to other multiplex graph matching algorithms on a template discovery task in a large semantic property graph.
Notation: The following notation will be used throughout. For an integer \(n>0\), we will define \([n]:=\{1,2,\ldots ,n\}\), let \(\vec {1}_n\in \mathbb {R}^{n}\) (resp. \(I_n\in \mathbb {R}^{n\times n}\)) be the vector (resp. diagonal matrix) with all entries identically set to 1, let \(J_n\) be the \(n\times n\) matrix with all diagonal entries identically set to 0 and all offdiagonal entries set to 1. Let \(\mathbf {0}_{k,\ell }\) (resp., \(\mathbf {0}_{k}\)) be the \(k\times \ell\) (resp., \(k\times k\)) matrix with all entries identically set to 0. The set of \(n\times n\) permutation matrices is denoted by \(\Pi _n\). For a graph \(g\in \mathcal {G}_n\), we let V(g) denote the vertex set of g and E(g) denote the edge set of g. A graph \(g'\) is a subgraph of g (denoted \(g'\subset g\)) if \(V(g')\subset V(g)\) and \(E(g')\subset E(g)\).
The graph matching distance
Both the single and multilayer noisy subgraph detection problems defined above rely heavily on a suitable definition of graph dissimilarity d to measure the goodness of fit of the recovered subgraph match. A common dissimilarity used in the literature is the graph matching distance \(d_{\text {GM}}\) defined as follows: Let \(g,h\in \mathcal {G}_n\) (recalling \(\mathcal {G}_n\) is the space of n-vertex, labeled graphs; here these are assumed to be unweighted) and let A be the adjacency matrix for g, and B the adjacency matrix for h; we then define

$$d_{\text {GM}}(g,h)=\min _{P\in \Pi _n}\,\Vert AP-PB\Vert _{F},$$
where \(\Pi _n\) is the set of \(n\times n\) permutation matrices. Note that \(d_{\text {GM}}\) is a pseudo-distance (i.e., \(d_{\text {GM}}(g,g)=0\) for all \(g\in \mathcal {G}_n\), \(d_{\text {GM}}\) is symmetric, and satisfies the triangle inequality) as it is clear that \(d_{\text {GM}}(g,h)=0\) can hold for \(g\ne h\). See Bento and Ioannidis (2019) for a study of a number of related graph matching (pseudo-)distances.
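To make the definition concrete, \(d_{\text {GM}}\) can be evaluated exactly for very small graphs by enumerating all permutation matrices. The following sketch (our own illustration with names of our choosing, not code from the paper) also exhibits the pseudo-distance property: a 4-cycle and a relabelled copy of it are at distance 0 even though their adjacency matrices differ.

```python
# Brute-force evaluation of d_GM(g, h) = min_{P in Pi_n} ||AP - PB||_F;
# only feasible for very small n, intended purely as an illustration.
import itertools
import numpy as np

def d_gm(A: np.ndarray, B: np.ndarray) -> float:
    """Exact d_GM for n-vertex graphs given adjacency matrices A and B."""
    n = A.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(n)):
        P = np.eye(n)[list(perm)]   # permutation matrix for this ordering
        best = min(best, np.linalg.norm(A @ P - P @ B, "fro"))
    return best

# A 4-cycle and a relabelled 4-cycle: d_GM = 0 although the matrices differ,
# illustrating that d_GM is only a pseudo-distance.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
relabel = np.eye(4)[[2, 0, 3, 1]]
B = relabel @ A @ relabel.T
```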
There is an immense literature devoted to solving the related graph matching problem: Given graphs \(g,h\in \mathcal {G}_n\) (with respective adjacency matrices A and B), find \(P\in \Pi _n\) such that \(d_{\text {GM}}(g,h)=\Vert AP-PB\Vert _{F}\); see Conte et al. (2004), Foggia et al. (2014), Emmert-Streib et al. (2016), Yan et al. (2016) for a thorough review of the current graph matching literature. Briefly, the graph matching literature is composed of two (intertwined) main thrusts: algorithmic development, with popular methods including optimization/relaxation-based approaches (e.g., Zaslavskiy et al. 2009; Umeyama 1988; Vogelstein et al. 2014; Klau 2009; Zhang and Tong 2016), tree search (e.g., Pinheiro et al. 2016), deep learning-based approaches (e.g., Zanfir and Sminchisescu 2018), representation-learning approaches (e.g., Heimann et al. 2018; Du et al. 2020), spectral approaches (e.g., Feizi et al. 2015; Fan et al. 2020; Egozi et al. 2012; Singh et al. 2008), and structured factorization-based approaches (e.g., Zhou and De la Torre 2012; Zhang et al. 2019), to name but a few. The second thrust of the graph matching literature is focused on graph de-anonymization/graph matchability. Stated succinctly, the problem of graph matchability is as follows: given a latent alignment P between two networks g and h (with respective adjacency matrices A and B), is it the case that \(d_{\text {GM}}(g,h)=\Vert AP-PB\Vert _{F}\); i.e., will the optimal “graph matching” permutation recover the latent alignment? The latent alignment between g and h is often achieved by correlating the networks g and h edgewise (so that h can be thought of as a noisy version of g), and the matchability literature is often focused on deriving sharp phase transitions for when \(d_{\text {GM}}(g,h)\ne \Vert AP-PB\Vert _{F}\) in terms of decaying correlation across networks; see, for example, Ding et al. (2018), Kazemi et al. (2015a, b), Pedarsani and Grossglauser (2011), Onaran et al.
(2016), Lyzinski and Sussman (2020), Cullina and Kiyavash (2017, 2016), Barak et al. (2018), Cullina et al. (2018) for a litany of graph matchability results in the single layer setting.
In the noisy subgraph detection problem outlined above, where \(g\in \mathcal {G}_m\) and \(h\in \mathcal {G}_n\) for \(n>m\) (with respective adjacency matrices A and B), there are a number of ways of adapting \(d_{\text {GM}}\) to the setting where the graphs are different sizes (see, for example, Appendix F of Bento and Ioannidis 2019). If we desire our dissimilarity to penalize extraneous structure (i.e., non-edges mapped to edges) and reward both common edges and common non-edges, we can seek a subgraph \(g'\) (with adjacency matrix \(B'\)) of h that minimizes \(\min _{P\in \Pi _m}\Vert AP - PB'\Vert _F\). This is effectively minimizing \(d_{\text {GM}}(g,\cdot )\) over subgraphs of h of order m, which amounts to finding the closest (according to \(d_{\text {GM}}\)) induced subgraph of h to g. Equivalently, this can be cast as seeking an element P in

$$\Pi _{m,n}:=\{P\in \{0,1\}^{m\times n}\,:\,PP^T=I_m\}$$

minimizing \(\Vert A-PBP^T\Vert _{F}\) (see Remark 1 below for detail). If we desire our dissimilarity to only reward recovered structure, we can seek a subgraph \(g'\) (with adjacency matrix \(B'\)) of h that maximizes \(\max _{P\in \Pi _m}\text {trace}(AP (B')^{T}P^T)\). This is effectively minimizing

$$d^{(2)}_{\text {GM}}(g,g'):=-\max _{P\in \Pi _m}\text {trace}(AP(B')^{T}P^T)$$

over subgraphs \(g'\) of h of order m. Equivalently, this can be cast as seeking an element P in \(\Pi _{m,n}\) maximizing \(\text {trace}(APB^{T}P^T)\) (see Remark 1 below for detail), which amounts to finding the closest (closest in that it has the most common edges) subgraph of h to g. Note that these two distances/dissimilarities can be obtained from \(d_{\text {GM}}\) directly by appropriately weighting and padding g and h; see Fishkind et al. (2015), Sussman et al. (2019) and Remark 1.
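For small instances, the minimization over \(\Pi _{m,n}\) can likewise be carried out by brute force over injections of the template vertices into the background. The sketch below (our own, with hypothetical helper names) makes the induced-subgraph criterion \(\Vert A-PBP^T\Vert _F\) concrete: for \(P\in \Pi _{m,n}\), the product \(PBP^T\) is simply the submatrix of B induced by the m selected background vertices.

```python
# Brute-force search over Pi_{m,n}; feasible only for small m and n, and
# intended only as an illustration of the induced-subgraph criterion.
import itertools
import numpy as np

def best_induced_match(A, B):
    """Try every injection of the m template vertices into the n background
    vertices; return the minimal ||A - P B P^T||_F and the best vertex map."""
    m, n = A.shape[0], B.shape[0]
    best, best_map = np.inf, None
    for sub in itertools.permutations(range(n), m):  # ordered injections [m] -> [n]
        diff = np.linalg.norm(A - B[np.ix_(sub, sub)], "fro")
        if diff < best:
            best, best_map = diff, list(sub)
    return best, best_map

# Template: a triangle. Background: 5 vertices whose only triangle is {1, 2, 4}.
A = np.ones((3, 3)) - np.eye(3)
B = np.zeros((5, 5))
for (u, v) in [(1, 2), (2, 4), (1, 4), (0, 3)]:
    B[u, v] = B[v, u] = 1.0
```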
Remark 1
Let \(g\in \mathcal {G}_m\) with adjacency matrix A and \(h\in \mathcal {G}_n\) with adjacency matrix B. In Fishkind et al. (2019), the method for finding the closest (according to \(d_{\text {GM}}\)) induced subgraph of h to g was to match \((2A-\vec {1}_m \vec {1}_m^T)\oplus \mathbf {0}_{n-m}\) to \(2B-\vec {1}_n \vec {1}_n^T\) using \(d_{\text {GM}}\). For \(Q\in \Pi _{n}\), write

$$Q=\begin{pmatrix} P\\ R \end{pmatrix},$$

where \(P\in \Pi _{m,n}\) consists of the first m rows of Q. Then, matching \((2A-\vec {1}_m \vec {1}_m^T)\oplus \mathbf {0}_{n-m}\) to \(2B-\vec {1}_n \vec {1}_n^T\) is equivalent to maximizing

$$\text {trace}\left( \left( (2A-\vec {1}_m \vec {1}_m^T)\oplus \mathbf {0}_{n-m}\right) Q\left( 2B-\vec {1}_n \vec {1}_n^T\right) ^TQ^T\right) ,\qquad (1)$$

and finding the optimal Q for Eq. (1) is equivalent to finding the optimal \(P\in \Pi _{m,n}\) minimizing

$$\Vert A-PBP^T\Vert _F^2.$$

Moreover, the above derivations show us that for \(P\in \Pi _{m,n}\), minimizing \(\Vert A-PBP^T\Vert _F^2\) is not equivalent to maximizing \(\text {trace}(APB^TP^T)\). Indeed, in Fishkind et al. (2019), the best fitting (non-induced) subgraph of h to g is found by maximizing (over \(Q\in \Pi _n\))

$$\text {trace}\left( (A\oplus \mathbf {0}_{n-m})\, QB^TQ^T\right) ,$$

which is easily seen to be equivalent to maximizing

$$\text {trace}(APB^TP^T)$$

over \(P\in \Pi _{m,n}\) as claimed.
Remark 2
The graph matching distance can be easily adapted to the weighted graph setting. Letting \(\mathcal {W}_n\) be the set of all n-vertex graphs with nonnegative edge weights, we can associate \(g=(V,E,W)\in \mathcal {W}_n\) with a weighted adjacency matrix A so that \(A_{i,j}=\mathbbm {1}\{\{i,j\}\in E\}W(\{i,j\})\). For \(g,h\in \mathcal {W}_n\) with respective weighted adjacency matrices A and B, we define \(d_{\text {GM}}(g,h)=\min _{P\in \Pi _n}\,\Vert AP-PB\Vert _{F}.\) Extensions to the setting where \(g\in \mathcal {W}_m\) and \(h\in \mathcal {W}_n\) are then achieved as above.
Multiplex networks
Before lifting the graph matching distance/problem to the multiplex setting, we first need to define precisely what we mean by a multiplex graph. At the core of this paper, we view the template as being of equal or lower order than the background. Moreover, our definition of multiplex networks should ideally allow for differing graph orders across the multiple layers within a single graph. To allow for these expected data nuances in the multiplex setting, we consider the following multiplex graph model; see Kivelä et al. (2014) for a thorough overview of this and other multiplex network formulations.
Definition 1
The c-tuple \(\mathbf{g} =(g_1,g_2,\dots ,g_c)\) is an n-vertex multiplex network if for each \(i=1,2,\dots ,c,\) we have that \(g_i\in \mathcal {G}_{n_i}=\{n_i\)-vertex labeled graphs\(\}\), and the vertex sets \((V_i=V(g_i))_{i=1}^{c}\) further satisfy the following:

(i)
For each \(i\in [c]\), we have that \(V(g_i)\subseteq [n]\);

(ii)
\(\bigcup \limits _{i=1}^{c}V(g_i)=[n]\);

(iii)
For each nonempty \(\mathcal {I}\subsetneq [c]\), we have that \((\cup _{i\in \mathcal {I}}V(g_i))\cap (\cup _{j\notin \mathcal {I}}V(g_j))\ne \emptyset\);

(iv)
The layers are a priori node aligned; i.e., vertices sharing the same label across layers correspond to the same entity in the network.
Note that each vertex \(v\in [n]\) need not appear in each channel \(i\in [c]\); however, we do require that the graph layers cannot be partitioned into two sets of vertex-disjoint networks. We will denote the set of c-layer, n-vertex multiplex networks by \(\mathcal {M}_n^{c}\). We note here that in the subsequent sections we will describe the MGMMF and its associated primitives in the setting of unweighted multiplex networks. This is done to streamline exposition, and analogues in the weighted graph setting can be defined accordingly.
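The conditions of Definition 1 can be checked mechanically for a candidate multiplex network given its per-layer vertex sets. The following sketch (our own helper with 0-based labels, exponential in the number of layers c and intended only for illustration) tests conditions (i)–(iii):

```python
# Validity check for Definition 1; vertex_sets[i] = V(g_i) as a set of labels
# from {0, ..., n-1}. This is our own illustrative helper, not the paper's code.
from itertools import combinations

def is_valid_multiplex(vertex_sets, n):
    c = len(vertex_sets)
    # (i) every layer's vertices are drawn from the common label set [n]
    if not all(V <= set(range(n)) for V in vertex_sets):
        return False
    # (ii) every label appears in at least one layer
    if set().union(*vertex_sets) != set(range(n)):
        return False
    # (iii) the layers cannot be split into two vertex-disjoint groups
    for k in range(1, c):
        for I in combinations(range(c), k):
            left = set().union(*(vertex_sets[i] for i in I))
            right = set().union(*(vertex_sets[j] for j in range(c) if j not in I))
            if not (left & right):
                return False
    return True
```

For example, two layers on \(\{0,1\}\) and \(\{1,2\}\) form a valid 3-vertex multiplex network (they share vertex 1), while two vertex-disjoint layers do not.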
The multiplex graph matching distance
As mentioned previously, in the single layer setting the two subgraph matching distances—the ones based on \(d_{\text {GM}}\) and \(d^{(2)}_{\text {GM}}\)—can be obtained from \(d_{\text {GM}}\) via appropriate padding schemes applied to the template g and background h. This will be the approach we take in order to define the analogues of these distances in the multiplex setting, as this will allow us to circumvent the difficulties of differing numbers of vertices across graphs and across the layers of a given graph. To this end, we consider the following pair of padding schemes (adapted here from Fishkind et al. 2019; Sussman et al. 2019). Letting \(\mathbf{g} \in \mathcal {M}_m^{c}\) and \(\mathbf{h} \in \mathcal {M}_n^{c}\) with \(m\le n\), we define

i.
(Naive Padding) For each \(i\in [c]\), define the adjacency matrices \(\widetilde{A}_i\in \mathbb {R}^{m\times m}\) and \(\widetilde{B}_i\in \mathbb {R}^{n\times n}\) via
$$\begin{aligned} {\widetilde{A}}_i(u,v)&= {\left\{ \begin{array}{ll} 1 & \text {if } u,v\in V(g_i) \text { and } \{u,v\}\in E(g_i); \\ 0 & \text {if } u,v\in V(g_i) \text { and } \{u,v\}\notin E(g_i); \\ 0 & \text {if } u \text { or } v\in [m]\setminus V(g_i); \end{array}\right. }\\ {\widetilde{B}}_i(u,v)&= {\left\{ \begin{array}{ll} 1 & \text {if } u,v\in V(h_i) \text { and } \{u,v\}\in E(h_i); \\ 0 & \text {if } u,v\in V(h_i) \text { and } \{u,v\}\notin E(h_i); \\ 0 & \text {if } u \text { or } v\in [n]\setminus V(h_i); \end{array}\right. } \end{aligned}$$

Denote \({\widetilde{\mathbf{ A }}}=(\widetilde{A}_1,\widetilde{A}_2,\ldots ,\widetilde{A}_c)\) and \({\widetilde{\mathbf{B }}}=(\widetilde{B}_1,\widetilde{B}_2,\ldots ,\widetilde{B}_c)\). This padding scheme effectively appends to each graph layer the isolated vertices needed to make the number of vertices in all layers equal (m in \(\mathbf {g}\) and n in \(\mathbf {h}\)).

ii.
(Centered Padding) For each \(i\in [c]\), define the weighted adjacency matrices \(\widehat{A}_i\in \mathbb {R}^{m\times m}\) and \(\widehat{B}_i\in \mathbb {R}^{n\times n}\) via
$$\begin{aligned} \widehat{A}_i(u,v)&= {\left\{ \begin{array}{ll} 1 & \text {if } u,v\in V(g_i) \text { and } \{u,v\}\in E(g_i); \\ -1 & \text {if } u,v\in V(g_i) \text { and } \{u,v\}\notin E(g_i); \\ 0 & \text {if } u \text { or } v\in [m]\setminus V(g_i); \end{array}\right. }\\ \widehat{B}_i(u,v)&= {\left\{ \begin{array}{ll} 1 & \text {if } u,v\in V(h_i) \text { and } \{u,v\}\in E(h_i); \\ -1 & \text {if } u,v\in V(h_i) \text { and } \{u,v\}\notin E(h_i); \\ 0 & \text {if } u \text { or } v\in [n]\setminus V(h_i); \end{array}\right. } \end{aligned}\qquad (2)$$

Denote \({\widehat{\mathbf{A }}}=(\widehat{A}_1,\widehat{A}_2,\ldots ,\widehat{A}_c)\) and \({\widehat{\mathbf{B }}}=(\widehat{B}_1,\widehat{B}_2,\ldots ,\widehat{B}_c)\). This padding scheme makes the number of vertices in all layers equal, and additionally penalizes matching edges to observed non-edges.
For an example of the two padding schemes applied to a graph in \(\mathcal {M}_8^2\), see Figure 1.
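As a concrete sketch, both schemes can be implemented one layer at a time. This is our own illustration (0-based vertex labels, hypothetical helper name), not the authors' code:

```python
# Naive vs. centered padding for a single layer of a multiplex network.
import numpy as np

def pad_layer(edges, vertices, n, centered=False):
    """Return the n x n padded adjacency matrix of a layer with vertex set
    `vertices` (a subset of {0, ..., n-1}) and undirected edge list `edges`."""
    M = np.zeros((n, n))
    if centered:
        # observed non-edges get weight -1; unobserved vertex pairs stay 0
        for u in vertices:
            for v in vertices:
                if u != v:
                    M[u, v] = -1.0
    for (u, v) in edges:
        M[u, v] = M[v, u] = 1.0
    return M

# Layer on vertices {0, 1, 2} of a 4-vertex multiplex graph, single edge {0, 1}:
A_naive = pad_layer([(0, 1)], {0, 1, 2}, 4, centered=False)
A_cent  = pad_layer([(0, 1)], {0, 1, 2}, 4, centered=True)
```

Note how vertex 3 (absent from the layer) has an all-zero row and column under both schemes, while the observed non-edges \(\{0,2\}\) and \(\{1,2\}\) are 0 under naive padding and \(-1\) under centered padding.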
We are now ready to define the multiplex analogues of our subgraph matching dissimilarities. Let \(\mathbf {g}\in \mathcal {M}^c_m\) (with naive and centered paddings given by \(\widetilde{\mathbf {A}}\) and \(\widehat{\mathbf {A}}\) resp.) and \(\mathbf {h}\in \mathcal {M}^c_n\) with \(m<n\) (with naive and centered paddings given by \(\widetilde{\mathbf {B}}\) and \(\widehat{\mathbf {B}}\) resp.). The multiplex analogue of \(d^{(2)}_{\text {GM}}\), which only rewards recovered common edge structure, then seeks a subgraph \(\mathbf {g'}\in \mathcal {M}^c_m\) of \(\mathbf {h}\) (with corresponding naive and centered paddings given by \(\widetilde{\mathbf {B}}'\) and \(\widehat{\mathbf {B}}'\) resp.) that minimizes

$$d^{(2)}_{\text {MGM}}(\mathbf {g},\mathbf {g}')=-\max _{P\in \Pi _m}\sum _{i=1}^c\text {trace}\big (\widetilde{A}_iP(\widetilde{B}_i')^TP^T\big ).$$
For matrices \(A\in \mathbb {R}^{m\times m}\) and \(B\in \mathbb {R}^{(n-m)\times (n-m)}\), we consider the matrix direct sum \(A\oplus B\in \mathbb {R}^{n\times n}\) defined via

$$A\oplus B=\begin{pmatrix} A & \mathbf {0}_{m,n-m}\\ \mathbf {0}_{n-m,m} & B \end{pmatrix},$$

where we recall for integral \(k,\ell >0\), \(\mathbf{0} _{k,\ell }\) (resp., \(\mathbf{0} _{k}\)) is the \(k\times \ell\) (resp., \(k\times k\)) matrix of all 0’s. Finding the m-vertex subgraphs of \(\mathbf {h}\) minimizing \(d^{(2)}_{\text {MGM}}(\mathbf {g},\cdot )\) is equivalent to finding all P in

$$\mathop {\mathrm{arg}\,\mathrm{max}}\limits _{P\in \Pi _{m,n}}\sum _{i=1}^c\text {trace}\big (\widetilde{A}_iP\widetilde{B}_i^TP^T\big ),\qquad (3)$$
as each P in Eq. (3) matches \(\mathbf {g}\) to a subgraph of \(\mathbf {h}\) minimizing \(d^{(2)}_{\text {MGM}}\). This formulation in Eq. (3) effectively seeks to maximize the number of common edges between the multiplex template and multiplex background, where all edges across all channels are weighted equally (see Fishkind et al. 2019; Bento and Ioannidis 2019 for the monoplex analogue).
The multiplex analogue of the dissimilarity that also penalizes extraneously recovered structure seeks an induced subgraph \(\mathbf {g'}\in \mathcal {M}^c_m\) of \(\mathbf {h}\) that minimizes the centered matching dissimilarity

$$d^{(c)}_{\text {MGM}}(\mathbf {g},\mathbf {g}')=-\max _{P\in \Pi _m}\sum _{i=1}^c\text {trace}\big (\widehat{A}_iP(\widehat{B}_i')^TP^T\big ).$$

Finding the m-vertex subgraphs of \(\mathbf {h}\) minimizing \(d^{(c)}_{\text {MGM}}(\mathbf {g},\cdot )\) is equivalent to finding all P in

$$\mathop {\mathrm{arg}\,\mathrm{max}}\limits _{P\in \Pi _{m,n}}\sum _{i=1}^c\text {trace}\big (\widehat{A}_iP\widehat{B}_i^TP^T\big ),\qquad (4)$$
as each P in Eq. (4) matches \(\mathbf {g}\) to a subgraph of \(\mathbf {h}\) minimizing \(d^{(c)}_{\text {MGM}}(\mathbf {g},\cdot )\). If for each i, we have that \(V(h_i)=[n]\supset [m_i]=V(g_i)\), then the formulation in Eq. (4) effectively seeks to minimize the number of disagreements (edge mapped to non-edge and vice versa) induced between the background and the matched subgraphs in the template, where all disagreements across all channels are weighted equally. Given the interpretation of Eqs. (3) and (4), the appropriate padding scheme to deploy in practice depends on the underlying problem assumptions and setting. In the sequel, we will refer to the optimization in Eq. (3) as the naive multiplex graph matching problem and to the optimization in Eq. (4) as the centered multiplex graph matching problem.
Remark 3
Our formulation of the Multiplex GMP is (assuming channels of equal order across the adjacency matrix lists \(\mathbf{A}\) and \(\mathbf{B}\))

$$\min _{P\in \Pi _n}\sum _{i=1}^c\Vert A_iP-PB_i\Vert _F^2,$$

rather than a formulation weighting the matching in each channel via

$$\min _{P\in \Pi _n}\sum _{i=1}^c\lambda _i\Vert A_iP-PB_i\Vert _F^2$$

for \(\lambda _i>0\). Note that we implemented the Multiplex GMP for different values of \(\lambda\), though in our experiments below we found that matching results with \(\lambda _i=1\) work suitably well while being more robust than other ad hoc weighting strategies. As such, we only present the results with \(\lambda _i=1\), though finding optimal weights for a given graph sequence is an important learning task in its own right and is the subject of present research. Moreover, this choice weights each edge in each template channel equally, which may be desirable. In the case that one or more channels are more informative or of higher import than the others, choosing appropriate \(\lambda\)’s to overweight the matching in those channels may be desirable.
Multiplex graph matching matched filters
Our approach to approximately solving Eqs. (3) and (4) uses the template \(\mathbf{g}\) as a filter to search the background \(\mathbf{h}\) for approximate matches. Similar to how convolutional and matched filters are used in computer vision tasks, we use the template to scan the background for matching structure, though the mechanism for scanning is not exhaustive search but rather an adaptation of the Frank-Wolfe based matching algorithm FAQ of Vogelstein et al. (2014). Below we provide the details of our multiplex FAQ algorithm, as well as how it is implemented towards our goal of building effective multiplex graph matching matched filters.
Multiplex FAQ
One of the principal motivations for our additive multiplex graph distance formulation is that it allows us to adapt gradient descent based search algorithms to the multiplex setting with ease. One such algorithm that has proven useful in the literature is the FAQ algorithm of Vogelstein et al. (2014). This algorithm—based on the Frank-Wolfe method for constrained gradient descent applied to an indefinite relaxation of the graph matching problem—proceeds as follows (in the monoplex setting): Given graphs \(g,h\in \mathcal {G}_n\) with adjacency matrices A, B, a threshold \(\epsilon\), and initialization \(P^{(0)}\in \mathcal {D}_n\) (where \(\mathcal {D}_n\) is the set of doubly stochastic \(n\times n\) matrices, i.e., the convex hull of \(\Pi _n\)):

1.
While \(\Vert P^{(t)}-P^{(t-1)}\Vert _F>\epsilon\) at step \(t\ge 1\), repeat the following iterative loop:

i.
Letting \(f(P)=-\text {trace}(APB^{T}P^T)\), compute \(\nabla f(P^{(t)})=-A^TP^{(t)}B-AP^{(t)}B^T\).

ii.
Find the optimal search direction for gradient descent by finding
$$Q^{(t)}=\mathop {\mathrm{arg}\,\mathrm{min}}\limits _{Q\in \mathcal {D}_n}\text {trace}(\nabla f(P^{(t)})^TQ);$$note that this is an instantiation of the famous linear assignment problem (LAP) and can be solved via the Hungarian method in \(O(n^3)\) time (Jonker and Volgenant 1987).

iii.
Set \(\alpha ^*\) equal to the minimizer of \(f(\alpha P^{(t)}+(1-\alpha )Q^{(t)})\). This line search in the direction of \(Q^{(t)}\) can be done efficiently, as it amounts to optimizing a simple quadratic function of \(\alpha\).

iv.
Set \(P^{(t+1)}=\alpha ^* P^{(t)}+(1-\alpha ^*)Q^{(t)}\).


2.
Project the (possibly) interior point solution \(P^{(\text {final})}\) to the set of permutation matrices by solving the following LAP
$$\begin{aligned} P^*=\mathop {\mathrm{arg}\,\mathrm{min}}\limits _{P\in \Pi _n}\text {trace}\left( \nabla f(P^{(\text {final})})^T P\right) . \end{aligned}$$
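For concreteness, the loop above can be sketched as follows. This is our own simplified illustration (symmetric unweighted graphs of equal order, barycenter initialization \(P^{(0)}=J/n\), and SciPy's linear_sum_assignment standing in for the LAP solver), not the authors' reference implementation:

```python
# Frank-Wolfe/FAQ sketch: minimize f(P) = -trace(A P B^T P^T) over doubly
# stochastic P, then project onto the permutation matrices.
import numpy as np
from scipy.optimize import linear_sum_assignment

def faq(A, B, max_iter=50, eps=1e-6):
    n = A.shape[0]
    P = np.full((n, n), 1.0 / n)               # barycenter initialization
    for _ in range(max_iter):
        grad = -A.T @ P @ B - A @ P @ B.T      # gradient of f at the iterate
        r, c = linear_sum_assignment(grad)     # LAP: argmin_Q trace(grad^T Q)
        Q = np.zeros((n, n)); Q[r, c] = 1.0
        # exact line search: f(alpha*P + (1-alpha)*Q) is quadratic in alpha
        D = P - Q
        a = -np.trace(A @ D @ B.T @ D.T)
        b = -np.trace(A @ D @ B.T @ Q.T) - np.trace(A @ Q @ B.T @ D.T)
        if a > 0:
            alpha = min(1.0, max(0.0, -b / (2 * a)))
        else:                                  # minimum attained at an endpoint
            alpha = 1.0 if a + b < 0 else 0.0
        P_next = alpha * P + (1 - alpha) * Q
        if np.linalg.norm(P_next - P, "fro") < eps:
            P = P_next
            break
        P = P_next
    grad = -A.T @ P @ B - A @ P @ B.T          # final projection onto Pi_n
    r, c = linear_sum_assignment(grad)
    P_star = np.zeros((n, n)); P_star[r, c] = 1.0
    return P_star
```

The returned matrix is always a permutation matrix, though (as with any local method applied to an indefinite relaxation) it need not be the global optimum.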
The FAQ algorithm is theoretically principled (Lyzinski et al. 2015), performs well in a number of realdata alignment tasks (see Vogelstein et al. 2014 for a comparison of FAQ to other graph matching approaches), and is flexible enough to easily handle weighted and directed graphs, seeded vertices (Fishkind et al. 2019), matching graphs of different orders (Fishkind et al. 2019; Sussman et al. 2019), and, as we will see below, matching multiplex networks.
The above algorithm extends easily to the multiplex setting as follows. Let \(\mathbf{A} =(A_1,\ldots ,A_c)\) and \(\mathbf{A}' =(A'_1,\ldots ,A'_c)\) be the (possibly padded) adjacency matrix vectors of two multiplex graphs in \(\mathcal {M}^c_n\), where each \(A_i\) and \(A_i'\) is assumed to be an \(n\times n\) matrix. We consider minimizing the objective function

$$f_M(P)=-\sum _{i=1}^c\text {trace}\big (A_iP(A_i')^TP^T\big )$$

using the same Frank-Wolfe procedure outlined above. To wit, given a threshold \(\epsilon\) and initialization \(P^{(0)}\in \mathcal {D}_n\), we proceed via

1.
While \(\Vert P^{(t)}-P^{(t-1)}\Vert _F>\epsilon\) at step \(t\ge 1\), repeat the following iterative loop:

i.
Compute \(\nabla f_M(P^{(t)})=\sum _{i=1}^c\left( -A_i^TP^{(t)}A_i'-A_iP^{(t)}(A_i')^T\right)\).

ii.
Set \(Q^{(t)}=\mathop {\mathrm{arg}\,\mathrm{min}}\limits _{Q\in \mathcal {D}_n}\text {trace}(\nabla f_M(P^{(t)})^TQ)\)

iii.
Set \(\alpha ^*\) equal to the minimizer of \(f_M(\alpha P^{(t)}+(1-\alpha )Q^{(t)})\).

iv.
Set \(P^{(t+1)}=\alpha ^* P^{(t)}+(1-\alpha ^*)Q^{(t)}\).


2.
Lastly compute \(P^*=\mathop {\mathrm{arg}\,\mathrm{min}}\limits _{P\in \Pi _n}\text {trace}(\nabla f_M(P^{(\text {final})})^T P).\) Return the approximate solution \(P^*\).
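The multiplex loop admits an analogous sketch with the summed gradient. Again, this is our own simplified illustration (with a coarse grid line search standing in for the exact quadratic minimization of step iii), not the authors' implementation:

```python
# Multiplex FAQ sketch: minimize f_M(P) = -sum_i trace(A_i P B_i^T P^T)
# over doubly stochastic P, then project onto the permutation matrices.
import numpy as np
from scipy.optimize import linear_sum_assignment

def mfaq(As, Bs, max_iter=50, eps=1e-6):
    """As, Bs: lists of c (padded) n x n adjacency matrices, one per channel."""
    n = As[0].shape[0]
    f = lambda P: -sum(np.trace(A @ P @ B.T @ P.T) for A, B in zip(As, Bs))
    grad = lambda P: sum(-A.T @ P @ B - A @ P @ B.T for A, B in zip(As, Bs))
    P = np.full((n, n), 1.0 / n)                # barycenter initialization
    for _ in range(max_iter):
        r, c = linear_sum_assignment(grad(P))   # search direction via a LAP
        Q = np.zeros((n, n)); Q[r, c] = 1.0
        # coarse grid line search over alpha in [0, 1] (sketch only)
        alpha = min(np.linspace(0.0, 1.0, 101),
                    key=lambda a: f(a * P + (1 - a) * Q))
        P_next = alpha * P + (1 - alpha) * Q
        if np.linalg.norm(P_next - P, "fro") < eps:
            P = P_next
            break
        P = P_next
    r, c = linear_sum_assignment(grad(P))       # final projection onto Pi_n
    P_star = np.zeros((n, n)); P_star[r, c] = 1.0
    return P_star
```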
This multiplex FAQ algorithm (denoted MFAQ) ideally aims to find the global minimizer of \(f_M(P)\). We will see in the next section how these ideas can be leveraged towards building our multiplex graph matching matched filter.
Remark 4
One of the advantages of FAQ and MFAQ is the ease with which they allow seeded vertices to be incorporated (Fishkind et al. 2019). In the setting of graph alignment, seeded vertices are those whose correspondence across networks is a priori known. This is incorporated into MFAQ as follows. Without loss of generality, letting the first \(s<n\) vertices of \(\mathbf{g },\mathbf{g }'\in \mathcal {M}^c_n\) (with respective adjacency matrix lists \(\mathbf{A}\) and \(\mathbf{A} '\)) be seeded via the identity mapping, we can consider incorporating these seeds via:

i.
A hard seeded approach: We seek to align \(\mathbf{g }\) and \(\mathbf{g }'\) by finding
$$\begin{aligned} \mathop {\mathrm{arg}\,\mathrm{min}}\limits _{P\in \Pi _{n-s}}\sum _{i=1}^c\Vert A_i(I_s\oplus P) - (I_s\oplus P)A_i'\Vert ^2_F. \end{aligned}$$This effectively enforces that the alignment across seeded vertices holds in any optimal matching across \(\mathbf{g}\) and \(\mathbf{g}'\), thus reducing the dimension of the search space of the graph matching problem. In this case, MFAQ proceeds as outlined above with the objective function
$$\begin{aligned} f_M(P)=-\sum _{i=1}^c\text {trace}(A_i(I_s\oplus P)(A_i')^T (I_s\oplus P)^T), \end{aligned}$$with the computation of the gradient and search steps proceeding analogously.

ii.
A soft seeded approach: We seek to align \(\mathbf{g}\) and \(\mathbf{g}'\) by finding
$$\begin{aligned} \mathop {\mathrm{arg}\,\mathrm{min}}\limits _{P\in \Pi _n}\sum _{i=1}^c\Vert A_iP - PA_i'\Vert ^2_F, \end{aligned}$$and initializing our optimization algorithm at a P of the form \(I_s\oplus Q\) (where the form of Q may be dictated by whether the algorithm relaxes the feasible region). The alignment across seeded vertices is encoded into the initialization, though it is not necessarily enforced in an optimal matching across \(\mathbf{g}\) and \(\mathbf{g}'\). This is more appropriate in settings where the seeds are potentially noisy or errorful (Fang et al. 2018).
In both the soft and hard seeded approaches, the performance of FAQ and MFAQ is often dramatically improved even with the inclusion of relatively few seeds (Fishkind et al. 2019; Patsolic et al. 2014).
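As an illustration of the hard-seeded objective above, the direct-sum structure \(I_s\oplus P\) can be assembled and evaluated as follows (a minimal sketch, not the package implementation):

```python
import numpy as np


def hard_seeded_objective(As, Bs, P, s):
    """Sketch of the hard-seeded multiplex GM objective: the first s
    vertices are pinned to each other via the identity, so the free
    permutation P ranges only over the remaining n - s vertices."""
    n = As[0].shape[0]
    Q = np.zeros((n, n))
    Q[:s, :s] = np.eye(s)  # I_s block: the seeds, fixed by the identity
    Q[s:, s:] = P          # the I_s (+) P direct sum
    return sum(np.linalg.norm(A @ Q - Q @ B, "fro") ** 2
               for A, B in zip(As, Bs))
```

When the second multiplex graph is an exact relabeling that respects the seeds, the objective is zero at the true sub-permutation.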
Multiplex graph matching matched filters
In the setting of multiplex noisy subgraph detection, there may be many subgraphs in \(\mathbf{h}\) optimally matching to \(\mathbf{g}\). To account for this (and the presence of spurious local minima for the indefinite \(f_M(D)\) for D in \(\mathcal {D}_n\)), the multiplex graph matching matched filter (MGMMF) algorithm proceeds as follows: Use multiple random restarts of the MFAQ algorithm applied to either Eq. (3) or (4) (depending on the desired structure to recover) to find multiple candidate matches for the template \(\mathbf{g}\) in \(\mathbf{h}\). This approach was developed in the monoplex setting (using FAQ) in Sussman et al. (2019), with our contribution herein being the extension to the multiplex setting.
Given multiplex graphs \(\mathbf{g} \in \mathcal {M}^c_m\) and \(\mathbf{h} \in \mathcal {M}^c_n\) with \(m\le n\), an appropriate padding scheme (either naive or centered), a tolerance \(\epsilon \in \mathbb {R}_{>0}\), a random initialization parameter \(\gamma\), and the number of restarts N, MGMMF proceeds as follows:

1.
Pad \(\mathbf{g}\) and \(\mathbf{h}\) accordingly. In the naive (resp., centered) padding regime, the padded adjacency matrix list of \(\mathbf{g}\) is denoted via \(\widetilde{\mathbf{A }}\) (resp., \(\widehat{\mathbf{A }}\)), and the padded \(\mathbf{h}\) via \(\widetilde{\mathbf{B }}\) (resp., \(\widehat{\mathbf{B }}\)).

2.
For \(k=1,2,\ldots ,N\), repeat the following

i.
Create a random initialization for the MFAQ algorithm via \(P^{(0)}\leftarrow \alpha \vec {1}_n\vec {1}_n^T/n+(1-\alpha ) P\) where \(P\sim \text {Unif}(\Pi _n)\), \(\alpha \sim \text {Unif}[0,\gamma ]\).
Note that this is one possible random initialization scheme (the random jitter of the barycenter \(\vec {1}_n\vec {1}_n^T/n\) of \(\mathcal {D}_n\)).

ii.
In the naive (resp., centered) padding regime, find a candidate match for \(\mathbf{g}\) via \(P^*_k=\texttt {MFAQ}(\widetilde{\mathbf{A }}\oplus \mathbf {0}_{n-m},\widetilde{\mathbf{B }},P^{(0)},\epsilon )\) (resp., \(P^*_k=\texttt {MFAQ}(\widehat{\mathbf{A }}\oplus \mathbf {0}_{n-m},\widehat{\mathbf{B }},P^{(0)},\epsilon )\))

3.
Rank the matchings \(\{P^*_1,P^*_2,\ldots ,P^*_N\}\) by increasing value of the multiplex graph matching objective function, Eq. (3) or (4), depending on the padding regime selected. This ranked list provides a candidate list of matchings/detected subgraphs for \(\mathbf{g}\) in \(\mathbf{h}\).
Note that code implementing the above MGMMF and MFAQ procedures can be downloaded as part of our R package, iGraphMatch, which is available on CRAN or can be downloaded at https://github.com/dpmcsuss/iGraphMatch.
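The matched-filter loop itself is short; the following sketch (again illustrative rather than the iGraphMatch implementation) takes the single-restart matcher as a black-box argument and implements the naive padding, jittered-barycenter restarts, and ranking steps:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def mgmmf(As, Bs, matcher, n_restarts=20, gamma=0.5, rng=None):
    """Sketch of the MGMMF loop: naive-pad the template channels to the
    background's order, run `matcher` (e.g. MFAQ) from randomly jittered
    barycenter starts as in Step 2.i, and rank the candidate matches by
    the multiplex GM objective.  `matcher` is a hypothetical callable
    (padded_As, Bs, P0) -> permutation matrix."""
    rng = np.random.default_rng(rng)
    m, n = As[0].shape[0], Bs[0].shape[0]
    # naive padding: adjoin n - m isolated vertices to each template channel
    padded = [np.pad(A, (0, n - m)) for A in As]

    def objective(P):
        return sum(np.linalg.norm(A @ P - P @ B, "fro") ** 2
                   for A, B in zip(padded, Bs))

    bary = np.full((n, n), 1.0 / n)
    candidates = []
    for _ in range(n_restarts):
        alpha = rng.uniform(0.0, gamma)
        P = np.eye(n)[rng.permutation(n)]      # P ~ Unif(Pi_n)
        P0 = alpha * bary + (1.0 - alpha) * P  # jittered barycenter start
        Pk = matcher(padded, Bs, P0)
        candidates.append((objective(Pk), Pk))
    candidates.sort(key=lambda t: t[0])        # best candidate match first
    return candidates
```

In practice the returned ranked list is inspected top-down for candidate template matches.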
Remark 5
Effectively, the MGMMF algorithm uses the multiplex template and a multiplex matching algorithm \(\mathcal {A}\) (MFAQ in our implementation) to search \(\Pi _n\) for relevant regions of \(\mathbf{h}\) that align well to \(\mathbf{g}\). The multiple restarts in Step 2 of the procedure are needed in the case of \(\mathcal {A}=\texttt {MFAQ}\) to account for multiple possible globally best matches and the fact that the objective function is relaxed to an indefinite quadratic program with myriad local minima in the feasible region. For approximate combinatorial \(\mathcal {A}\), the restarts may be appropriate as well, while for continuous, convex relaxation algorithms (see, for example, Bento and Ioannidis 2019), this step may not be necessary if we believe there is a single global minimum.
Multiplex template matchability
In order to theoretically understand the efficacy of our MGMMF approach, we consider the following probabilistic framework (note that real-data experimentation demonstrating the empirical success of MGMMF is presented in “Experiments” section). Considering a random graph model for both \(\mathbf{g}\) and \(\mathbf{h}\) in which a stochastically noisy copy of \(\mathbf{g}\) has been embedded into \(\mathbf{h}\), we seek to understand to what degree an oracle algorithm (i.e., one that can exactly solve Eqs. (3) and (4)) can identify the noisy copy of \(\mathbf{g}\) in \(\mathbf{h}\). This offers an important suitability test for MGMMF: if even oracle approaches are not able to uncover probabilistically errorful copies of \(\mathbf{g}\), then practically MGMMF will most likely not be suitable for finding optimal subgraph matches in real-data, high-noise settings.
Moreover, to understand this problem further, we develop novel results on multiplex graph matchability (for \(\mathbf{g}\) and \(\mathbf{h}\) of the same order) that may be of independent interest, as well as generative multiplex error models in which to explore this question. The layout of this section is as follows. We develop a pair of error models, the ME and MS models, in “Background/template error models” section. Within each error model, we tackle the question of template matchability in “Template matchability” section. Interspersed with these matchability results are simulations further illuminating the corresponding matching theory.
Background/template error models
We first consider a random graph model in the monoplex setting, in which a stochastically noisy copy of a template g has been embedded into the background h. Next, we leverage this model in the multiplex setting by proposing a pair of generative error models wherein the template \(\mathbf{g}\) is embedded as an errorful subgraph into the background \(\mathbf{h}\).
Definition 2
(See Arroyo et al. 2018) Consider a graph g with \(V(g)\subset [n]\). Let the centered, padded adjacency matrix (as in Eq. (2)) of g be denoted \(\widehat{A}\in \mathbb {R}^{n\times n}\). Let \(E\in [0,1]^{n\times n}\) be a symmetric, hollow (hollow indicating that the diagonal entries of E are set to 0) matrix. The graph-valued random variable \(\mathcal {E}(g)\) with vertex set equal to V(g) and random centered, padded adjacency matrix \(\widehat{A}_{g,E}\), which models passing g through an errorful channel E, is defined as follows. For each \(\{i,j\}\in \left( {\begin{array}{c}[n]\\ 2\end{array}}\right)\),
$$\begin{aligned} \widehat{A}_{g,E}(i,j)={\left\{ \begin{array}{ll} \widehat{A}(i,j) &{} \text { if } X(i,j)=0,\\ -\widehat{A}(i,j) &{} \text { if } X(i,j)=1, \end{array}\right. } \end{aligned}$$
where \(X(i,j){\mathop {\sim }\limits ^{ind.}}Bern(E(i,j))\).
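Under a sign-flip reading of Definition 2 (flipping the sign of a centered \(\pm 1\) entry toggles edge and non-edge, while padded zero entries are unaffected), the errorful channel can be simulated as follows; this is an illustrative sketch, not a definitive implementation:

```python
import numpy as np


def error_channel(A_hat, E, rng=None):
    """Sketch of passing a centered adjacency matrix (off-diagonal entries
    in {-1, 0, +1}) through an errorful channel E: independently flip the
    sign of entry {i, j} with probability E(i, j), symmetrically, leaving
    the (hollow) diagonal untouched."""
    rng = np.random.default_rng(rng)
    n = A_hat.shape[0]
    X = (rng.random((n, n)) < E).astype(float)
    X = np.triu(X, 1)          # one Bernoulli draw per unordered pair {i, j}
    X = X + X.T                # symmetric, hollow flip indicators
    return A_hat * (1.0 - 2.0 * X)
```

With \(E=0\) the graph passes through unchanged; with \(E\) identically 1, every edge and non-edge is toggled.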
The two generative models we then consider for lifting Definition 2 to the multiplex setting are defined below. Each models a particular setting in which multilayer networks can appear in real-data settings.

i.
(Single Channel Source, Multiplex Error; abbreviated ME) There is a single nonrandom background source graph \(h\in \mathcal {G}_n\) and a nonrandom source template \(g=h[m]\in \mathcal {G}_m\), and two multichannel errorful filters \(\mathbf{E} ^{(1)}=(E^{(1)}_1,\ldots ,E^{(1)}_c)\), with each \(E^{(1)}_i\) acting on h, and \(\mathbf{E} ^{(2)}=(E^{(2)}_1,\ldots ,E^{(2)}_c)\), with each \(E^{(2)}_i\) acting on g. We observe \(\mathbf{H} =(\mathcal {E}^{(1)}_1(h),\ldots ,\mathcal {E}^{(1)}_c(h))\) as the multiplex background and \(\mathbf{G} =(\mathcal {E}^{(2)}_1(g),\ldots ,\mathcal {E}^{(2)}_c(g))\) as the multiplex template. By assumption, the errorful filters act independently across channels within \(\mathbf{G}\) and \(\mathbf{H}\), and independently across \(\mathbf{G}\) and \(\mathbf{H}\). In this model, by construction each \(V(G_i)=[m]\) (where \(G_i=\mathcal {E}^{(2)}_i(g)\)) and each \(V(H_i)=[n]\) (where \(H_i=\mathcal {E}^{(1)}_i(h)\)). This situation is used to model settings in which the multiplex layers can be thought of as errorful realizations of a single true but unknown network, and the edges across layers are of the same fundamental kind; for example, multiplex connectomes in which each layer represents a DT-MRI scan for a patient at a different point in time (see, for example, Zuo et al. 2014; Kiar et al. 2018).

ii.
(Single Channel Errors, Multiplex Source; abbreviated MS) The nonrandom background and nonrandom template source graphs are multiplex. To wit, let \(\mathbf{g} \in \mathcal {M}^c_m\) and \(\mathbf{h} \in \mathcal {M}^c_n\) satisfy the following: letting \(\widehat{\mathbf{C }}\) and \(\widehat{\mathbf{D }}\) be the centered paddings of \(\mathbf{g}\) and \(\mathbf{h}\) respectively, we assume that for each \(i\in [c]\), \(\widehat{C}_i=\widehat{D}_i[m]\) (i.e., \(\widehat{C}_i\)—the padded adjacency matrix of \(g_i\)—is the \(m\times m\) principal submatrix of \(\widehat{D}_i\)—the padded adjacency matrix of \(h_i\)). In this model, there are two multichannel errorful filters: \(\mathbf{E} ^{(1)}=(E^{(1)}_1,\ldots ,E^{(1)}_c)\) and \(\mathbf{E} ^{(2)}=(E^{(2)}_1,\ldots ,E^{(2)}_c)\). For each \(i\in [c]\), \(E^{(1)}_i\in \mathbb {R}^{n\times n}\) acts on \(h_i\), and \(E^{(2)}_i\in \mathbb {R}^{m\times m}\) acts on \(g_i\). We observe
$$\begin{aligned} \mathbf{H} =(\mathcal {E}^{(1)}_1(h_1),\mathcal {E}^{(1)}_2(h_2)\ldots ,\mathcal {E}^{(1)}_c(h_c)) \end{aligned}$$as the multiplex background, and
$$\begin{aligned} \mathbf{G} =(\mathcal {E}^{(2)}_1(g_1),\mathcal {E}^{(2)}_2(g_2),\ldots ,\mathcal {E}^{(2)}_c(g_c)) \end{aligned}$$as the multiplex template. As above, the errorful filters act independently across channels within \(\mathbf{G}\) and \(\mathbf{H}\), and independently across \(\mathbf{G}\) and \(\mathbf{H}\). Note that if the template (resp., background) channels have nonidentical vertex sets, then this will be preserved in the errorful template (resp., background). This situation is used to model settings in which we errorfully observe a given multiplex background and template. This is the case in many applications where the layers in \(\mathbf{G}\) and \(\mathbf{H}\) represent different edge types; e.g., a multiplex social network where the layers represent different friendship modalities (Magnani and Rossi 2011) or a multilayer connectome where the layers represent different synaptic types (Chen et al. 2016).
It may be convenient to view g and h (resp., \(\mathbf{g}\) and \(\mathbf{h}\)) as realizations from graph-valued random variables in the ME (resp., MS) model. In this case, we will assume the actions of the errorful filters on g and h (resp., \(\mathbf{g}\) and \(\mathbf{h}\)) are also independent of the random g and h (resp., \(\mathbf{g}\) and \(\mathbf{h}\)).
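A sketch of sampling from the ME model, using the same sign-flip reading of Definition 2 (the helper below is hypothetical, for illustration only):

```python
import numpy as np


def sample_me_model(h_hat, m, E_bg, E_tmpl, rng=None):
    """Sketch of the ME model: a single centered source background h_hat
    yields c background channels, and its leading m x m principal submatrix
    (the source template g = h[m]) yields c template channels, each obtained
    by independently sign-flipping entry {i, j} with that channel's error
    probability.  E_bg and E_tmpl are lists of error matrices."""
    rng = np.random.default_rng(rng)

    def flip(M, E):
        n = M.shape[0]
        X = (rng.random((n, n)) < E).astype(float)
        X = np.triu(X, 1)
        X = X + X.T                    # symmetric, hollow flips
        return M * (1.0 - 2.0 * X)

    H = [flip(h_hat, E) for E in E_bg]          # multiplex background
    G = [flip(h_hat[:m, :m], E) for E in E_tmpl]  # multiplex template
    return G, H
```

All channels are drawn independently, matching the independence assumptions of the model.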
Considering the models above, in order for our MGMMF approach to possibly recover the true errorful induced subgraph of \(\mathbf{h}\) corresponding to \(\mathbf{g}\), we need for the global minimum of the Multiplex GMP to be in \(\mathcal {P}_{m,n}:=\{I_m\oplus P: P\in \Pi _{nm}\}\). This is the multiplex analogue of template matchability, i.e., uncovering conditions under which an oracle graph matching will recover a true but unknown latent vertex alignment. Here, that alignment is represented by \(\mathbf{g}\) being an errorful version of \(\mathbf{h} [m]\).
Template matchability
In this section, we will explore the benefit of considering multiplex versus monoplex networks when considering template matchability in both the MS and ME models. We note here that while the proof mechanism behind the theory underlying the matchability results in the multiplex setting differs only slightly from that used in the monoplex setting of Sussman et al. (2019), we stress that the end results are novel and demonstrate the utility of considering multiple channels. Recall below that for integer \(n>0\), \(J_n\) is the \(n\times n\) matrix with all diagonal entries set to 0 and all off-diagonal entries set to 1.
In the MS model, let \(\mathbf{g} \in \mathcal {M}_m^c\) and \(\mathbf{h} \in \mathcal {M}_n^c\) be the respective template and background source graphs, with centered, padded adjacency matrices given respectively by \(\widehat{\mathbf{C }}\) and \(\widehat{\mathbf{D }}\) satisfying \(\widehat{C}_i=\widehat{D}_i[m]\) for all \(i\in [c]\). Assume that the errorful filters satisfy, for each \(i\in [c]\), \(E_i^{(1)}=q_i J_n\) and \(E_i^{(2)}=s_i J_m\) (where \(s_i=s_i(n)\) and \(q_i=q_i(n)\) are allowed to vary with n). If \(c=1\) and \(s_1=q_1=1/2\), then the observed background and template are effectively independent ER(n, 1/2) and ER(m, 1/2) networks, respectively. It is immediate then that the optimal permutation aligning the background to the template will almost surely not be in \(\mathcal {P}_{m,n}\). If \(c>1\), can strongly correlated signal present in less noisy channels be used to counteract the excess error caused by noisier channels?
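The \(c=1\), \(s_1=q_1=1/2\) claim follows from a one-line computation: with flip probability 1/2,

```latex
\begin{aligned}
\mathbb{P}(\text{observed edge}\mid \text{source edge})
  &= 1-q_1 = \tfrac{1}{2},\\
\mathbb{P}(\text{observed edge}\mid \text{no source edge})
  &= q_1 = \tfrac{1}{2},
\end{aligned}
```

so the conditional law of each observed edge does not depend on the source graph; the observed background is therefore an ER(n, 1/2) graph independent of h, and the same computation with \(s_1\) applies to the template.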
The next result tackles this from a template matchability perspective. The main result in the following theorem (Theorem 1) is that in the correlated model outlined above, if there are sufficiently many positively correlated channels relative to the number of negatively correlated channels, then under mild assumptions on the growth of m, the subgraph of \(\mathbf{H}\) minimizing the centered graph matching dissimilarity \(d^{(c)}_{\text {MGM}}(\mathbf {G},\cdot )\) is, with high probability, uniquely the embedded noisy copy of \(\mathbf{G}\) in \(\mathbf{H}\); note that the proof of Theorem 1 can be found in “Appendix 1.1”. While an oracle procedure for finding the minimizer of \(d^{(c)}_{\text {MGM}}(\mathbf {G},\cdot )\) is unknown, this result gives credence to the matched filter approach, as the global optimizing alignment sought by the multiplex matched filter will (with high probability) be the right alignment.
Theorem 1
In the MS model framework, suppose that there exist constants \(\alpha <1\), \(\beta >0\), and \(n_0\in \mathbb {Z}_{>0}\) such that for all \(n>n_0\), \(m=m(n)\) satisfies \(m^{\alpha }>\beta \log n\). Suppose that for \(p_i=p=p(n)\) we have that each \(h_i{\mathop {\sim }\limits ^{ind.}}\)ER(n, p), and that \(s_i(n)=s(n)=s<1/2\). For \(c_1\) channels, let \(q_i=q<1/2\), for \(c_2\) channels, let \(q_i=1/2\), and for \(c_3=c-c_1-c_2\) channels let \(q_i=1-q>1/2\) (where \(c_1=c_1(n)\), \(c_2=c_2(n)\), \(c_3=c_3(n)\), \(s=s(n)\) and \(q=q(n)\) are allowed to vary with n). Then there exist constants \(\gamma ,\xi >0\), and \(n_2\in \mathbb {Z}_{>0}\) such that if for all \(n>n_2\),
then for \(n>\max (n_0,n_2)\), we have
If s, q, c, \(c_1\) and \(c_2\) are fixed constants that do not vary with n, we need only require \(c_1>c_3\) rather than Condition (5).
To derive analogous results to Theorem 1 in the ME model, we consider the following setting. Letting \(H\sim ER(n,p=p(n))\in \mathcal {G}_n\) with \(p\le 1/2\), and \(G=H[m]\) be the respective background and template source graphs, we again assume that there exist constants \(\alpha \le 1,\) \(\beta >0\), and \(n_0\in \mathbb {Z}_{>0}\) such that for all \(n>n_0\), \(m=m(n)\) satisfies \(m^\alpha \ge \beta \log n\). Further assume that for each \(i\in [c=c(n)]\) the errorful filters satisfy,
In the ME setting, we then have the following theorem whose proof is presented in “Appendix 1.2”. The result in Theorem 2 is similar to that of Theorem 1: In the correlated ME model outlined above, if there are sufficiently many positively correlated channels (those in \(c_3\) and \(c_4\)) relative to the number of negatively correlated channels (those in \(c_1\) and \(c_2\)), then under mild assumptions the subgraph of \(\mathbf{H}\) minimizing the centered graph matching dissimilarity \(d^{(c)}_{\text {MGM}}(\mathbf {G},\cdot )\) is again, with high probability, uniquely the embedded noisy copy of \(\mathbf{G}\) in \(\mathbf{H}\).
Theorem 2
With setup as above, suppose that \(\alpha <1\). For
where \(e_1=e_1(n)\) and \(e_2=e_2(n)\) can vary in n and \(c=c_1+c_2+c_3+c_4\). Then there exist constants \(\gamma ,\xi >0\), and \(n_2\in \mathbb {Z}_{>0}\) such that if for all \(n>n_2\)
then for \(n>\max (n_0,n_2)\),
If \(e_1\), \(e_2\), and c are fixed in n, we need only require \(c_1+c_2<c_3+c_4\) for the bound in Eq. (9) to hold for sufficiently large n.
As mentioned above, in Theorem 2, the channels in \(c_3\) and \(c_4\) represent those in which the template and background are positively correlated, with those in \(c_1\) and \(c_2\) representing those in which the template and background are negatively correlated. We can then interpret Eq. (8) as asserting that sufficiently many positively correlated channels can overcome the signal loss in negatively correlated channels. This is a unique benefit of multiple layers versus the monoplex setting, as in the single layer setting there is no universal way to overcome the negative impact of anticorrelation on graph matchability. In the following sections, we further empirically explore the effect of negatively correlated/high noise channels on multiplex matchability.
Strength in numbers
Consider \(c_2=c_3=0\) in Theorem 1. Condition (6) then reduces to
and sparsity in the background (i.e., \(p\approx 0\)) and large noise values (i.e., close to 1/2) for s and q can be mitigated by choosing an appropriately large c; effectively, multiple channels can amplify the weak signal present in each individual channel.
We explore this further in the following experiment. We will look at two different cases, first considering when \(m=n\) and then when \(m<n\). First, we consider \(n=m=100\) (to mitigate possible effects of template order on matching accuracy), and we let \(\mathbf{G} ,\mathbf{H} \in \mathcal {M}^{c}_{100}\) for c ranging over \(\{1,2,\ldots ,10\}\). For each \(i\in [c]\), we have that \((G_i,H_i)\sim \mathrm {ER}(100,0.5,\rho )\) (so that \(G_i\) and \(H_i\) are marginally ER(100,0.5) and edges across graphs are independent except that for each \(\{j,k\}\in \left( {\begin{array}{c}[100]\\ 2\end{array}}\right)\), we have that corr\((\mathbbm {1}\{\{j,k\}\in E(G_i)\},\mathbbm {1}\{\{j,k\}\in E(H_i)\})=\rho\)). Within this model, the channels are endowed with a natural vertex correspondence across \(G_i\) and \(H_i\), namely the identity mapping as identically labeled edges are correlated while edges with different vertex labels are independent. Note that in the \(h_i\sim ER(n,p_i)\) MS model setting, we have that
so that the correlation between edges in \(G_i\) and \(H_i\) can be made positive or negative with judiciously chosen \(s_i\) and \(q_i\). Considering \(\rho\) varying over \(\{0.1,0.2,0.3,0.4,0.5\}\), we match \(\mathbf{G}\) and \(\mathbf{H}\) using MFAQ with \(s=10\) hard-seeded vertices (see Remark 4). Results are plotted in Figure 2. In the figure, we plot the mean matching accuracy (i.e., the fraction of vertices whose latent alignment is recovered correctly) of MFAQ versus c, averaged over 2000 Monte Carlo replicates. For each choice of parameters, we also plot (via the partially transparent points) the accuracy distribution corresponding to the MC replicates. In red (resp., olive, green, blue, purple) we plot the results for \(\rho =0.1\) (resp., \(\rho =0.2\), \(\rho =0.3\), \(\rho =0.4\), \(\rho =0.5\)). From Figure 2, we see the expected relationship: in low correlation settings where MFAQ is unable to align the monoplex graphs, this can often be overcome by considering \(c>1\). Indeed, in all cases save \(\rho =0.1\), perfect matching is achieved using \(c\ge 8\) channels.
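The correlated pair \((G_i,H_i)\sim \mathrm{ER}(n,p,\rho)\) used in these experiments can be sampled with the standard conditional construction, sketched below; both edge indicators are marginally Bern(p) and have Pearson correlation \(\rho\):

```python
import numpy as np


def correlated_er_pair(n, p, rho, rng=None):
    """Sample (G, H) ~ ER(n, p, rho): both graphs are marginally
    Erdos-Renyi(n, p), corresponding edge indicators have Pearson
    correlation rho, and all other edge pairs are independent."""
    rng = np.random.default_rng(rng)
    iu = np.triu_indices(n, 1)
    k = iu[0].size
    h = (rng.random(k) < p).astype(float)
    # conditional edge probabilities that preserve the Bern(p) marginal
    # and give the indicator pair correlation rho
    p_if_edge = p + rho * (1.0 - p)
    p_if_none = p * (1.0 - rho)
    u = rng.random(k)
    g = np.where(h == 1.0, u < p_if_edge, u < p_if_none).astype(float)
    G = np.zeros((n, n))
    H = np.zeros((n, n))
    G[iu] = g
    H[iu] = h
    return G + G.T, H + H.T
```

A quick covariance check confirms the construction: \(\mathrm{Cov} = p(p+\rho(1-p)) - p^2 = \rho\, p(1-p)\), so the correlation is exactly \(\rho\).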
Next, we look at the case when \(m<n\). In addition to examining the effect of multiple channels when weak signal is present across channels, we wish to compare the effect of different padding schemes (Naive vs Centered) in terms of the matching accuracy. Here, we are following the MS model with the background (resp., template) channels independent and marginally distributed as ER(n, 1/2) (resp., ER(m, 1/2)). Here the template is correlated edgewise to the induced subgraph of the first m vertices in the background (as in the MS model), with edge correlation \(\rho\). We analyze the padding schemes’ effectiveness by first varying the values of the correlation \(\rho \in \{0.1,0.2,0.3,0.4,0.5\}\) while keeping \(n=500\), and \(m=100\) constant (see Figure 3) and second, by varying the background size \(n\in \{100,500,1000,2000\}\) while the template size \(m=100\) and the correlation \(\rho =0.5\) remain constant (see Figure 4). Letting the Naive (resp. Centered) padded template/background pair have adjacency matrix lists \((\widetilde{\mathbf{G }},\widetilde{\mathbf{H }})\) (resp. \((\widehat{\mathbf{G }},\widehat{\mathbf{H }})\)), in each figure we consider matching the template to the background using MFAQ (with 10 hard seeds) while varying c over \(\{1,2,\ldots ,10\}\). Results are plotted in Figures 3 and 4. As in Figure 2, we plot the mean matching accuracy (i.e., the fraction of template vertices whose true latent alignment is recovered correctly in the background) of MFAQ versus c, here averaged over 100 MC replicates. For each choice of parameters, we also plot (via the partially transparent points) the accuracy distribution corresponding to the MC replicates. In Figure 3, in red (resp., olive, green, blue, purple) we plot the results for \(\rho =0.1\) (resp., \(\rho =0.2\), \(\rho =0.3\), \(\rho =0.4\), \(\rho =0.5\)). In Figure 4, in red (resp., green, blue, purple) we plot the results for \(n=100\) (resp., \(n=500\), \(n=1000\), \(n=2000\)).
All figures demonstrate that even though the MFAQ algorithm is unable to align the multiplex template to its correlated subgraph in the background when \(c=1\), this can often be overcome by considering \(c>1\). Moreover, Figures 3 and 4 show that the Centered Padding scheme achieves better matching accuracy between channels than the Naive Padding scheme, and that the matching accuracy increases as the correlation increases for both padding schemes. Finally, we observe the unsurprising phenomenon in Figure 4 that the matching accuracy decreases as the ratio between the template size and the background size decreases (indeed, larger backgrounds are harder to search!).
The good outweighs the bad
In this section, we explore the ability of the signal in “good” channels to overcome the obfuscating effect of “bad” channels. To wit, consider Condition (6) with \(c_3>0\). We see that if there are enough channels (i.e., \(c_1\) is sufficiently large) with positive correlation (\(s_i,q_i<1/2\)), then the template and background remain matchable even in the presence of (potentially) multiple negatively correlated channels. We explore this further in the following experiment. As in the previous “Strength in numbers” section, we study the obfuscating effect of negatively correlated channels for both the \(m=n\) and \(m<n\) cases.
In Figure 5, we first consider \(n=m=100\) (left panel), and let \(\mathbf{G} ,\mathbf{H} \in \mathcal {M}^{10}_{100}\) (i.e., \(c=10\)), where for \(i\in [10]\) we have that \((G_i,H_i)\sim \mathrm {ER}(100,0.5,\rho )\). Under the same setting (right panel), we let \(m=100,\,n=500\) and we apply Naive Padding to \((\mathbf{G} ,\mathbf{H} )\in (\mathcal {M}^{10}_{100},\mathcal {M}^{10}_{500})\).
Considering \(\rho\) to be either \(\rho =r\) (for \(c_g\) channels) or \(\rho =-r\) (for \(c_b=c-c_g\) channels), where r varies in \(\{0.1,0.2,0.3,0.4,0.5\}\), we plot the matching accuracy (averaged over 2000 (left panel) and 100 (right panel) Monte Carlo replicates) obtained by MFAQ (with 10 seeds) versus \(c_b\) in Figure 5. For each choice of parameters, we also plot (via the partially transparent points) the accuracy achieved by each MC replicate. In red (resp., olive, green, blue, purple) we plot the results for \(r=0.1\) (resp., \(r=0.2\), \(r=0.3\), \(r=0.4\), \(r=0.5\)). From the figure, we see the expected relationship: matching at higher levels of \(\rho\) yields better accuracy, and more robustness to channels with negative correlation. Further, we notice that the matching accuracy in the right panel (i.e., \(m<n\)) is not as good as in the left panel (i.e., \(m=n\)). To see whether this is universal, or merely an artifact of the padding scheme, we explore the \(m<n\) case further in Figures 6 and 7.
To study the effect of different padding schemes (Naive vs Centered) on matching accuracy, we analyze the schemes’ effectiveness first by varying the values of the correlation \(r\in \{0.3,0.4,0.5,0.6,0.7\}\) while keeping n and m constant (see Figure 6) and second by varying the number of background vertices \(n\in \{100,500,1000,2000\}\) while the template size \(m=100\) remains the same (see Figure 7). Using the Naive (resp. Centered) padding scheme, we consider matching \(\widetilde{\mathbf{G }}\) and \(\widetilde{\mathbf{H }}\) (resp. \(\widehat{\mathbf{G }}\) and \(\widehat{\mathbf{H }}\)) for \(c_b\) ranging over \(\{1,2,\ldots ,10\}\). Utilizing \(s=10\) seeds, we match using MFAQ; results are plotted in Figures 6 and 7. As in Figure 5, we plot the mean matching accuracy (i.e., the fraction of vertices whose latent alignment is recovered correctly) of MFAQ versus \(c_b\), averaged over 100 MC replicates. For each choice of parameters, we also plot (via the partially transparent points) the accuracy distribution corresponding to the MC replicates. In Figure 6, in red (resp., olive, green, blue, purple) we plot the results for \(r=0.3\) (resp., \(r=0.4\), \(r=0.5\), \(r=0.6\), \(r=0.7\)). In Figure 7, in red (resp., green, blue, purple) we plot the results for \(n=100\) (resp., \(n=500\), \(n=1000\), \(n=2000\)). From Figures 5 and 6, we observe that matching at higher levels of \(\rho\) yields better accuracy and more robustness to channels with negative correlation, even though the negatively correlated channels are more anticorrelated for higher r. Moreover, Figures 6 and 7 show that the Centered Padding scheme achieves better performance in terms of matching accuracy than the Naive Padding scheme, though the improvement is not as dramatic as that previously seen.
Experiments
Our previous simulations explored the effect of multiple channels on multiplex matchability. We next consider the performance of our multiplex matched filter approach in detecting a hidden template in a multilayer social media network from Magnani and Rossi (2011). The background network contains 3 aligned channels representing user activity in FriendFeed, Twitter and YouTube (where the YouTube and Twitter channels were generated via FriendFeed, which aggregates user information across these platforms). In total, there are 6,407 unique vertices across the three channels, with the channel-specific networks satisfying:
Channel  Vertices  Edges

FriendFeed  5,540  31,921
Twitter  5,702  42,327
YouTube  663  614
Given a 35-vertex multiplex template \(\mathbf{g}\) (embedded into the background) created by Pacific Northwest National Laboratory for the DARPA MAA program, we ran our MGMMF algorithm to attempt to recover the embedded template in \(\mathbf{h}\); results are summarized below.
In our first experiment, we considered running “cold-start” MGMMF; that is, no prior information (in the form of seeds, hard or soft) is utilized by the algorithm. We consider padding the graph via the Naive Padding and Centered Padding regimes of “The multiplex graph matching distance” section, and for each padding regime, we ran MGMMF with \(N=100\) random restarts. Numeric results are summarized in Table 1 (with the best recovered background signals also plotted in Figure 8). While the best recovered signal in the Naive Padding regime captures all but two template edges, this is at the expense of many extraneous background edges that do not appear in the template. On the other hand, the Centered Padding regime recovers most of the template edges (across the three channels) with minimal extraneous edges in the recovered signal.
The MFAQ algorithmic primitive used in our implementation of MGMMF is most effective when it can leverage a priori available matching data in the form of seeded vertices. Seeds can either come in the form of hard seeds (a priori known 1–to–1 matches; here that would translate to template vertices whose exact match is known in the background) or soft seeds (where a soft seeded vertex v in \(\mathbf{g}\) has an a priori known distribution over possible matches in \(\mathbf{h}\); here this would translate into template vertices with a list of candidate matches in the background). While hard seeds are costly and often unavailable in practice, there are many scalable procedures in the literature for automatically generating soft seed matches. Here, we use as a soft-seeding the output of Moorman et al. (2018, 2021), a filtering approach for finding all subgraphs of the background network homomorphic to the template.
For each node in the template, the output of Moorman et al. (2018, 2021) produces a multiset of candidate matches in the background, where each candidate match corresponds to a template copy contained in the background as a subgraph (not necessarily as an induced subgraph). We convert the candidate matches into probabilities by simply converting the multiset to a count vector and normalizing the count vector to sum to 1. We then consider the normalized count vectors as rows of a stochastic matrix; this stochastic matrix provides MFAQ with a soft-seeded initialization. Considering random restarts as perturbations (akin to Step 2.i of MGMMF as presented in “Multiplex graph matching matched filters” section) of the soft-seeding (conditioned on retaining nonnegative entries), we ran MGMMF using a generalization of the Centered Padding regime, which is defined as follows: For each \(i\in [c]\), define the weighted adjacency matrices \(\breve{A}_i\in \mathbb {R}^{m\times m}\) and \(\breve{B}_i\in \mathbb {R}^{n\times n}\) via
where we vary w from 0 to 1. Note that \(w=0\) yields Naive Padding, and \(w=1\) yields Centered Padding. Optimal performance in the present experiment was achieved with \(w=0.25\), in which case \(N=4000\) random restarts yielded an induced subgraph in the background that was exactly isomorphic to the template network.
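One natural reading of this interpolated padding (an assumption on our part, since we do not reproduce the display equation here) is the convex combination \((1-w)\cdot\)Naive \(+\ w\cdot\)Centered, so that within the template's m vertices an edge contributes \(+1\) and a non-edge \(-w\), with padded entries 0. A sketch under that assumption:

```python
import numpy as np


def interpolated_padding(A, n, w):
    """Hypothetical sketch of the w-interpolated padding: the convex
    combination of naive padding (A itself) and centered padding
    (2A - hollow J), padded with zeros to order n.  w = 0 recovers
    Naive Padding and w = 1 recovers Centered Padding."""
    m = A.shape[0]
    hollow_J = np.ones((m, m)) - np.eye(m)
    # (1 - w) * A + w * (2A - hollow_J)  ==  (1 + w) * A - w * hollow_J
    core = (1.0 - w) * A + w * (2.0 * A - hollow_J)
    out = np.zeros((n, n))
    out[:m, :m] = core
    return out
```

Any scheme matching the stated endpoints at \(w=0\) and \(w=1\) and linear in w coincides with this one.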
MGMMF on semantic property graphs
For our second example, we consider the semantic property graph released by Pacific Northwest National Laboratory as part of the MAA-AIDA Data Release V2.1.2 via the DARPA MAA program (Ebsch 2020). In this dataset, the background network is a knowledge graph constructed from a variety of documents (e.g., newspaper articles) by DARPA’s AIDA program. At a high level, the graph encodes the real-world relationships between a variety of entities (people, locations, major events, etc.) that can be automatically extracted from a variety of data sources. Practically, the graph is a richly featured network on the order of 100K nodes. Node properties/features include entity name; rdf:type (corresponding to a structured ontology of types defined by the AIDA ontology); textValue; linkTarget; start time; among others. Edge properties/features include edge name/id; rdf:type; argument (values given to edges of a given rdf:type); among others. Note that many nodes and edges do not have values for all properties. The provided templates, themselves richly featured knowledge graphs, are on the order of 10s of vertices (ranging in size from 33 nodes/40 edges down to 11 nodes/11 edges); for each of three template types, there are 6 variants with varying error levels, including one variant (version “A”) that is perfectly/isomorphically embedded into the background.
The principal challenge in applying our MGMMF methodology to such richly featured data is sensibly incorporating the rich, structured features into our multiplex network framework. Towards this end, we adopted the following approach. We incorporated the vertex features/properties into a penalty term in the objective function, encoding the features into a vertex–to–vertex similarity matrix S (this is possible provided that similarities are easily computed within each vertex covariate, which is the case here). Edge features were used to divide the knowledge graph into multiple overlapping channels in a multiplex network. One channel was assigned to each unique (E(rdf:type), E(argument)) pair in the template, and we divided background edges amongst the channels via a measure of similarity between the edge’s (E(rdf:type), E(argument)) pair and that of the channel. Note that this yields many, often overlapping, channels in the multiplex graph. Spatiotemporal constraints can be coded into separate channels in the multiplex graph, one channel per spatiotemporal constraint in the template. Each constraint (e.g., action A must occur between x and y days after action B) yields an edge filter, with only edges that could potentially satisfy the constraint being added to that constraint channel. Lastly, numeric edge features are used to further weight the edges in the background/template. The final objective function we used in our MGMMF approach was then of the form
Performance of MGMMF and other competing approaches (as scored by the GED scoring metric of Ebsch et al. 2020) is presented in Figure 9. (Note that Tu et al. (2020) also present a filtering method that uses the clean "A" template for training and achieves essentially the best score on all template versions; we did not include those scores for comparison in our figure.)
Overall, the challenge was to detect noisy matches to each of 18 provided templates: 3 distinct template types, each with 6 variants encompassing different amounts of template noise; see Figure 10 for a pair of example templates and the MGMMF-recovered signals. One template of each type was constructed to have a perfect isomorphic match in the background, while the noisy templates were designed for inexact/fuzzy/noisy matching. All approaches identified the isomorphic match in the background for version "A" of all three template types, while our approach achieved its best relative results on the larger, more complex template (Template 1). For Template 1, we obtained the best (or nearly the best) score on 3/6 versions; see Table 2 for details.
Templates 2 and 3 were essentially tree-like structures (with 13 nodes/15 edges and 11 nodes/11 edges, respectively), and we suspect that the filtering-based approaches of Tu et al. (2020) and Kopylov et al. (2020) are more suitable to this problem type. Indeed, MGMMF is designed for larger/more complex templates, though our performance (especially compared to the non-filtering GFinder of Liu et al. 2019) is encouraging on these instantiations, especially on version "B" of the templates.
Discussion
In this paper, we presented a framework for finding noisy copies of a multiplex template in a large multiplex background network. Our strategy, which extends the approach of Sussman et al. (2019) to the multiplex setting, uses a novel formulation of multiplex graph matching combined with multiple random restarts to search the background for locally optimal matches to the template. The effectiveness of the resulting algorithm, named MGMMF, is demonstrated both theoretically (in a novel multiplex correlated network framework) and empirically on both real and simulated data sets. To formalize the MGMMF approach, we provide a natural extension of the classical graph matching problem to multiplex networks that is easily amenable to matching graphs of different orders (both across networks and channels). We also extend the Frank-Wolfe based FAQ algorithm of Vogelstein et al. (2014) to the multiplex setting.
There are a number of extensions and open questions that arose during the course of this work. Natural theoretical extensions include lifting Theorems 1 and 2 to non-edge-independent models (note that certain localized dependencies amongst edges can easily be handled in the McDiarmid proof framework, while globally dependent errors provide a more significant challenge); formulating the analogues of Theorems 1 and 2 in the weighted, attributed graph settings; and considering the theoretical properties of various continuous relaxations of the multiplex GM problem akin to Aflalo et al. (2015), Lyzinski et al. (2015), Bento and Ioannidis (2019). A key methodological question in multiplex graph matching was touched upon in Remark 3; indeed, we expect the question of how to weight the matching across channels to be essential when applying these methods to topologically diverse and weighted networks. If the order of magnitude of the edge weights varies across channels, then it is easy to see a GM algorithm aligning the channels with large edge weights at the expense of alignment accuracy in the other channels. Judiciously choosing \((\lambda _i)\) would allow the signal in channels with smaller edge weights to be better leveraged towards the overall matching.
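As one illustration of such a channel weighting, the \(\lambda_i\) could be chosen inversely proportional to each channel's Frobenius norm, so that a heavily weighted channel does not dominate the multiplex objective. This is a sketch of one possible scheme under our own assumptions, not a prescription from the paper; `channel_weights` and `weighted_objective` are illustrative names.

```python
import numpy as np

def channel_weights(adjacencies, eps=1e-12):
    """One candidate choice of lambda_i: inverse Frobenius norm per channel,
    equalizing the magnitude contribution of each channel."""
    return [1.0 / max(np.linalg.norm(A, 'fro'), eps) for A in adjacencies]

def weighted_objective(As, Bs, P, lambdas):
    """Weighted multiplex GM objective: sum_i lambda_i * trace(A_i P B_i^T P^T)."""
    return sum(lam * np.trace(A @ P @ B.T @ P.T)
               for lam, A, B in zip(lambdas, As, Bs))
```

With this choice, a channel whose edge weights are 100 times larger than another's receives a weight 100 times smaller, so both channels can contribute signal to the matching.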
While the largest network we consider in this work has \(\approx 100{,}000\) vertices, scaling this approach to very large networks is essential. By utilizing efficient data structures for sparse, low-rank matrices and a careful implementation of the LAP subroutine of \(\texttt {MFAQ}\) (step 2.ii. in the presented MGMMF algorithm), we are able to match O(10)-vertex templates to 20K-vertex background graphs in \(<10\) s per restart with our base MGMMF code (available in iGraphMatch) implemented in R on a standard laptop. Further work to scale MGMMF by leveraging both efficient data structures and scalable approximate LAP solvers is currently underway.
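For intuition, the LAP step referenced above (maximizing the linearization \(\langle \nabla f(P), Q\rangle\) over permutation matrices \(Q\)) can be sketched as follows. A production implementation would use a fast solver such as Jonker-Volgenant (Jonker and Volgenant 1987); here, purely for illustration on tiny examples, we brute-force over all permutations, and the function names are our own.

```python
import itertools
import numpy as np

def multiplex_gradient(As, Bs, P, lambdas):
    """Gradient of sum_i lambda_i * trace(A_i P B_i^T P^T) with respect to P."""
    return sum(lam * (A @ P @ B.T + A.T @ P @ B)
               for lam, A, B in zip(lambdas, As, Bs))

def best_permutation(grad):
    """Solve max_Q <grad, Q> over permutation matrices by brute force
    (feasible only for tiny n; a real LAP solver replaces this loop)."""
    n = grad.shape[0]
    best, best_val = None, -np.inf
    for perm in itertools.permutations(range(n)):
        val = sum(grad[i, perm[i]] for i in range(n))
        if val > best_val:
            best_val, best = val, perm
    Q = np.zeros((n, n))
    for i, j in enumerate(best):
        Q[i, j] = 1.0
    return Q
```

Each Frank-Wolfe iteration computes the gradient at the current doubly stochastic iterate, solves this LAP for a search direction, and line-searches between the two; the brute-force loop is exactly the piece that efficient sparse LAP solvers accelerate.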
Availability of data and materials
The background graphs for the 3-channel social network in the "Experiments" section are available at http://multilayer.it.uu.se/datasets.html. The data from the "DARPA MAA analysis" section are not publicly available, and the obtained results for the outside algorithms appear in the cited papers. The code implementing the MGMMF and MFAQ procedures can be downloaded as part of our R package, iGraphMatch, which is available on CRAN or can be downloaded at https://github.com/dpmcsuss/iGraphMatch.
Abbreviations
 GM(P): Graph Matching (Problem)
 nMGMP: Naive Multiplex Graph Matching Problem
 cMGMP: Centered Multiplex Graph Matching Problem
 MGMMF: Multiplex Graph Matching Matched Filters
 MFAQ: Multiplex Fast Approximate Quadratic graph matching algorithm
 ME: (Single Channel Source) Multiplex Error
 MS: (Single Channel Errors) Multiplex Source
 GED: Graph Edit Distance
 LAP: Linear Assignment Program
 DARPA: Defense Advanced Research Projects Agency
 MAA: Modeling Adversarial Activity DARPA program
 AIDA: Active Interpretation of Disparate Alternatives DARPA program
References
Aflalo Y, Bronstein A, Kimmel R (2015) On convex relaxation of graph isomorphism. Proc Natl Acad Sci 112(10):2942–2947
Alon N, Dao P, Hajirasouliha I, Hormozdiari F, Sahinalp SC (2008) Biomolecular network motif counting and discovery by color coding. Bioinformatics 24(13):i241–i249
Alon N, Yuster R, Zwick U (1995) Color-coding. J ACM (JACM) 42(4):844–856
Arroyo J, Sussman DL, Priebe CE, Lyzinski V (2018) Maximum likelihood estimation and graph matching in errorfully observed networks. arXiv preprint arXiv:1812.10519
Barak B, Chou C, Lei Z, Schramm T, Sheng Y (2018) (nearly) efficient algorithms for the graph matching problem on correlated random graphs. arXiv preprint arXiv:1805.02349
Bento J, Ioannidis S (2019) A family of tractable graph metrics. Appl Netw Sci 4(1):1–27
Boccaletti S, Bianconi G, Criado R, Del Genio CI, Gómez-Gardenes J, Romance M, Sendina-Nadal I, Wang Z, Zanin M (2014) The structure and dynamics of multilayer networks. Phys Rep 544(1):1–122
Chan SC, Cheney J (2020) Flexible graph matching and graph edit distance using answer set programming. In: International symposium on practical aspects of declarative languages, pp 20–36. Springer
Chaudhuri S, Chatterjee S, Katz N, Nelson M, Goldbaum M (1989) Detection of blood vessels in retinal images using twodimensional matched filters. IEEE Trans Med Imaging 8(3):263–269
Chen L, Vogelstein JT, Lyzinski V, Priebe CE (2016) A joint graph inference case study: the C. elegans chemical and electrical connectomes. Worm 5(2)
Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. Int J Pattern Recognit Artif Intell 18(03):265–298
Cordella LP, Foggia P, Sansone C, Vento M (2004) A (sub) graph isomorphism algorithm for matching large graphs. IEEE Trans Pattern Anal Mach Intell 26(10):1367–1372
Cullina D, Kiyavash N (2016) Improved achievability and converse bounds for Erdős–Rényi graph matching. In: ACM SIGMETRICS performance evaluation review, 44(1), pp 63–72. ACM
Cullina D, Kiyavash N (2017) Exact alignment recovery for correlated Erdős–Rényi graphs. arXiv preprint arXiv:1711.06783
Cullina D, Kiyavash N, Mittal P, Poor HV (2018) Partial recovery of Erdős–Rényi graph alignment via \(k\)-core alignment. arXiv preprint arXiv:1809.03553
DePiero F, Krout D (2003) An algorithm using length-r paths to approximate subgraph isomorphism. Pattern Recognit Lett 24(1–3):33–46
Ding J, Ma Z, Wu Y, Xu J (2018) Efficient random graph matching via degree profiles. arXiv preprint arXiv:1811.07821
Du B, Zhang S, Cao N, Tong H (2017) First: Fast interactive attributed subgraph matching. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1447–1456
Du X, Yan J, Zhang R, Zha H (2020) Cross-network skip-gram embedding for joint network alignment and link prediction. IEEE Trans Knowl Data Eng
Ebsch CL, Cottam JA, Heller NC, Deshmukh RD, Chin G (2020) Using graph edit distance for noisy subgraph matching of semantic property graphs. In: 2020 IEEE international conference on big data (big data), pp 2520–2525. IEEE
Egozi A, Keller Y, Guterman H (2012) A probabilistic approach to spectral graph matching. IEEE Trans Pattern Anal Mach Intell 35(1):18–27
Emmert-Streib F, Dehmer M, Shi Y (2016) Fifty years of graph matching, network alignment and network comparison. Inf Sci 346–347:180–197
Fan Z, Mao C, Wu Y, Xu J (2020) Spectral graph matching and regularized quadratic relaxations: algorithm and theory. In: International conference on machine learning, pp 2985–2995. PMLR
Fang F, Sussman DL, Lyzinski V (2018) Tractable graph matching via soft seeding. arXiv preprint arXiv:1807.09299
Feizi S, Quon G, Medard M, Kellis M, Jadbabaie A (2015) Spectral alignment of networks
Fishkind DE, Lyzinski V, Pao H, Chen L, Priebe CE (2015) Vertex nomination schemes for membership prediction. Ann Appl Stat 9(3):1510–1532
Fishkind DE, Adali S, Patsolic HG, Meng L, Singh D, Lyzinski V, Priebe CE (2019) Seeded graph matching. Pattern Recognit 87:203–215
Foggia P, Percannella G, Vento M (2014) Graph matching and learning in pattern recognition in the last 10 years. Int J Pattern Recognit Artif Intell 28(01):1450001
Fomin FV, Lokshtanov D, Raman V, Saurabh S, Rao BVR (2012) Faster algorithms for finding and counting subgraphs. J Comput Syst Sci 78(3):698–706
Frank M, Wolfe P (1956) An algorithm for quadratic programming. Nav Res Logist Q 3(1–2):95–110
Goga O, Loiseau P, Sommer R, Teixeira R, Gummadi KP (2015) On the reliability of profile matching across large online social networks. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1799–1808. ACM
Heimann M, Shen H, Safavi T, Koutra D (2018) Regal: representation learning-based graph alignment. In: Proceedings of the 27th ACM international conference on information and knowledge management, pp 117–126
Jonker R, Volgenant A (1987) A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing 38(4):325–340
Kazemi E, Hassani SH, Grossglauser M (2015a) Growing a graph matching from a handful of seeds. Proc VLDB Endow 8(10):1010–1021
Kazemi E, Yartseva L, Grossglauser M (2015b) When can two unlabeled networks be aligned under partial overlap? In: 2015 53rd Annual allerton conference on communication, control, and computing (Allerton), pp 33–42. IEEE
Kiar G, Bridgeford EW, Roncal WRG, Chandrashekhar V, Mhembere D, Ryman S, Zuo X, Margulies DS, Craddock RC, Priebe CE, Jung R, Calhoun VD, Caffo B, Burns R, Milham MP, Vogelstein JT (2018) A highthroughput pipeline identifies robust connectomes but troublesome variability. bioRxiv, p 188706
Kivelä M, Arenas A, Barthelemy M, Gleeson JP, Moreno Y, Porter MA (2014) Multilayer networks. J Complex Netw 2(3):203–271
Kivelä M, Porter MA (2017) Isomorphisms in multilayer networks. IEEE Trans Netw Sci Eng
Klau GW (2009) A new graphbased method for pairwise global network alignment. BMC Bioinf 10(1):S59
Kopylov A, Xu J (2019) Filtering strategies for inexact subgraph matching on noisy multiplex networks. In: 2019 IEEE international conference on big data (big data), pp 4906–4912. IEEE
Kopylov A, Xu J, Ni K, Roach S, Lu TC (2020) Semantic guided filtering strategy for besteffort subgraph matching in knowledge graphs. In: 2020 IEEE international conference on big data (big data), pp 2539–2545. IEEE
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25
Larrosa J, Valiente G (2002) Constraint satisfaction algorithms for graph pattern matching. Math Struct Comput Sci 12(4):403–422
Lee J, Han W, Kasperovics R, Lee J (2012) An in-depth comparison of subgraph isomorphism algorithms in graph databases. Proc VLDB Endow 6(2):133–144
Liu L, Du B, Tong H (2019) Gfinder: approximate attributed subgraph matching. In: 2019 IEEE international conference on big data (big data), pp 513–522. IEEE
Lladós J, Martí E, Villanueva JJ (2001) Symbol recognition by error-tolerant subgraph matching between region adjacency graphs. IEEE Trans Pattern Anal Mach Intell 23(10):1137–1143
Lyzinski V, Fishkind D, Fiori M, Vogelstein JT, Priebe CE, Sapiro G (2015) Graph matching: relax at your own risk. IEEE Trans Pattern Anal Mach Intell 38(1):60–73
Lyzinski V, Sussman DL (2020) Matchability of heterogeneous networks pairs. Inf Inference A J IMA 9(4):749–783
Magnani M, Rossi L (2011) The mlmodel for multilayer social networks. In: 2011 International conference on advances in social networks analysis and mining, pp 5–12. IEEE
McDiarmid C (1989) On the method of bounded differences. Surv Combin 141(1):148–188
Moorman JD, Chen Q, Tu TK, Boyd ZM, Bertozzi AL (2018) Filtering methods for subgraph matching on multiplex networks. In: 2018 IEEE international conference on big data (big data), pp 3980–3985. IEEE
Moorman JD, Tu TK, Chen Q, He X, Bertozzi AL (2021) Subgraph matching on multiplex networks. IEEE Trans Netw Sci Eng 8(2):1367–1384
Mucha PJ, Richardson T, Macon K, Porter MA, Onnela J (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980):876–878
Ng MK, Li X, Ye Y (2011) Multirank: co-ranking for objects and relations in multi-relational data. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1217–1225. ACM
Onaran E, Garg S, Erkip E (2016) Optimal de-anonymization in random graphs with community structure. In: 2016 50th Asilomar conference on signals, systems and computers, pp 709–713. IEEE
O’Shea K, Nash R (2015) An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458
Patsolic H, Adali S, Vogelstein JT, Park Y, Priebe CE, Li G, Lyzinski V (2014) Seeded graph matching via joint optimization of fidelity and commensurability. arXiv preprint arXiv:1401.3813
Pedarsani P, Grossglauser M (2011) On the privacy of anonymized networks. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 1235–1243. ACM
Pinheiro MA, Kybic J, Fua P (2016) Geometric graph matching using Monte Carlo tree search. IEEE Trans Pattern Anal Mach Intell 39(11):2171–2185
Priebe CE, Sussman DL, Tang M, Vogelstein JT (2015) Statistical inference on errorfully observed graphs. J Comput Graph Stat 24(4):930–953
Singh R, Xu J, Berger B (2008) Global alignment of multiple protein interaction networks with application to functional orthology detection. Proc Natl Acad Sci 105(35):12763–12768
Sussman DL, Park Y, Priebe CE, Lyzinski V (2019) Matched filters for noisy induced subgraph detection. IEEE Trans Pattern Anal Mach Intell 42(11):2887–2900
Takes FW, Kosters WA, Witte B (2017) Detecting motifs in multiplex corporate networks. In: International workshop on complex networks and their applications, pp 502–515. Springer
Tu TK, Moorman JD, Yang D, Chen Q, Bertozzi AL (2020) Inexact attributed subgraph matching. In: 2020 IEEE international conference on big data (big data), pp 2575–2582. IEEE
Ullmann JR (1976) An algorithm for subgraph isomorphism. J ACM (JACM) 23(1):31–42
Umeyama S (1988) An eigendecomposition approach to weighted graph matching problems. IEEE Trans Pattern Anal Mach Intell 10(5):695–703
Vogelstein JT, Conroy JM, Lyzinski V, Podrazik LJ, Kratzer SG, Harley ET, Fishkind DE, Vogelstein RT, Priebe CE (2014) Fast approximate quadratic programming for graph matching. PLoS ONE 10(4):1
White JG, Southgate E, Thomson JN, Brenner S (1986) The structure of the nervous system of the nematode Caenorhabditis elegans. Philos Trans R Soc Lond B Biol Sci 314(1165):1–340
Yan J, Yin X, Lin W, Deng C, Zha H, Yang X (2016) A short survey of recent advances in graph matching. In: Proceedings of the 2016 ACM on international conference on multimedia retrieval, pp 167–174. ACM
Yang B, Liu J (2016) Mining multiplex structural patterns from complex networks. In: Wisdom Web of Things, pp 275–301. Springer
Zampelli S, Deville Y, Solnon C (2010) Solving subgraph isomorphism problems with constraint programming. Constraints 15(3):327–353
Zanfir A, Sminchisescu C (2018) Deep learning of graph matching. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2684–2693
Zaslavskiy M, Bach F, Vert JP (2009) A path following algorithm for the graph matching problem. IEEE Trans Pattern Anal Mach Intell 31(12):2227–2242
Zhang S, Tong H (2016) Final: Fast attributed network alignment. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1345–1354
Zhang Z, Xiang Y, Wu L, Xue B, Nehorai A (2019) Kergm: Kernelized graph matching. Adv Neural Inf Process Syst 32
Zhou F, De la Torre F (2012) Factorized graph matching. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 127–134. IEEE
Zuo XN, Anderson JS, Bellec P, Birn RM, Biswal BB, Blautzik J, Breitner J, Buckner RL, Calhoun VD, Castellanos FX et al (2014) An open science resource for establishing reliability and reproducibility in functional connectomics. Sci Data 1(1):1–13
Acknowledgements
Not applicable.
Funding
Dr. Sussman’s contribution to this work was partially supported by a grant from MIT Lincoln Labs and the Department of Defense. This material is based on research sponsored by the Air Force Research Laboratory and DARPA under agreement numbers FA8750-18-2-0035 and FA8750-20-2-1001. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Research Laboratory and DARPA or the U.S. Government.
Author information
Authors and Affiliations
Contributions
Daniel L. Sussman, Carey E. Priebe, Vince Lyzinski, and Konstantinos Pantazis conceived of the method and developed the theory. Konstantinos Pantazis, Youngser Park, Daniel L. Sussman, Vince Lyzinski and Zhirui Li worked on writing and implementing the algorithm and performing relevant experiments. Konstantinos Pantazis, Daniel L. Sussman and Vince Lyzinski wrote and curated the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
Herein we collect details of our auxiliary algorithms and proofs of our main results.
Proof of Theorem 1
For each \(m<n\), denote the set of permutations in \(\Pi _n\) that permute exactly k labels of [m] by \(\Pi _{n,m,k}\). For each \(P\in \Pi _n\) (with associated permutation \(\sigma _p\)), define
and for each \(i\in [c]\), define
Let the set of the \(c_1\) low-noise channels (\(q<1/2\)) be denoted \(\mathcal {C}_1\), and the set of the \(c_3\) high-noise channels (\(q>1/2\)) be denoted \(\mathcal {C}_3\). If there exists an \(n_1>0\) such that for all \(n>n_1\), we have that for all \(k\in \{1,2,3,\ldots ,m(n)\}\) and all \(P\in \Pi _{n,m,k}\),
then letting \(\widehat{\mathbf{A }}\) and \(\widehat{\mathbf{B }}\) be the padded, centered adjacency matrices of \(\mathbf{G}\) and \(\mathbf{H}\) respectively (the errorful \(\mathbf{g}\) and \(\mathbf{h}\)), for \(n>\mathfrak {n}=\max (n_0,n_1)\) we have that
(see “Appendix 1.1.1” for proof of this bound).
While we expect Eq. (14) to hold in more general settings, below we show that it holds with high probability in the present Erdős–Rényi setting under our given theorem assumptions. Here, \(\Delta _P^{(i,2)}\) is empty, and a simple application of McDiarmid’s inequality yields that for a fixed \(P\in \Pi _{n,m,k}\), we have
with probability at least
Indeed, if each \(h_i\sim\) ER(n, p), then
so that \(\mathbb {E}(\Delta _P^{(i,1)})=2p(1-p)\Delta _P\). Also, \(\Delta _P^{(i,1)}\) is then a function of at most \(2\Delta _P\) independent Bernoulli random variables, and changing the value of any one of these can change the value of \(\Delta _P^{(i,1)}\) by at most 2. McDiarmid’s inequality then yields the desired result
by setting \(t= p(1-p)\Delta _P\).
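For concreteness, this McDiarmid step can be written out as follows; this is a sketch reconstructed from the stated quantities (at most \(2\Delta_P\) independent variables, bounded differences of 2), and the constants in the paper's displayed bound may differ:

```latex
% Delta_P^{(i,1)} depends on at most 2*Delta_P independent Bernoulli
% variables, each with bounded difference 2, so McDiarmid gives
\mathbb{P}\Big(\big|\Delta_P^{(i,1)} - \mathbb{E}\,\Delta_P^{(i,1)}\big| \ge t\Big)
  \le 2\exp\!\left(-\frac{2t^2}{2\Delta_P \cdot 2^2}\right)
  = 2\exp\!\left(-\frac{t^2}{4\Delta_P}\right),
% and substituting t = p(1-p)*Delta_P yields
\mathbb{P}\Big(\big|\Delta_P^{(i,1)} - 2p(1-p)\Delta_P\big| \ge p(1-p)\Delta_P\Big)
  \le 2\exp\!\left(-\frac{p^2(1-p)^2\Delta_P}{4}\right).
```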
Note that if \(m>6\), then as \(P\in \Pi _{n,m,k}\), \(mk/3\le \Delta _P\le mk\). Therefore, if \(m>6\) then for n sufficiently large, we have that with probability at least (letting \(\xi\) in Assumption (5) be appropriately defined)
the event
occurs. Now, we have that
where the second inequality holds by assumption. We see then that Eq. (14) holds with high probability with an appropriate choice of \(\gamma\) (independent of k).
To finish the proof, we proceed as follows. For each \(P\in \Pi _n\) define
Let the equivalence relation “\(\sim\)” on \(\Pi _{n,m,k}\) be defined via \(P\sim Q\) if there exists a \(U\in \mathcal {P}_{n,m}\) such that \(PU=Q\). Note that if \(P\sim Q\) then
and so \(X_P=X_Q\). Let \(\Pi _{n,m,k}^*\) be a fixed (but arbitrarily chosen) set composed of one member of each equivalence class according to “\(\sim\),” and note that \(\Pi _{n,m,k}^*\) is at most \(m^{2k}n^{2k}\). From Eq. (18), for \(n>n_2\) we have that for each \(P\in \Pi _{n,m,k}^*\),
Denote the event bound in Eq. (20) via \(\mathcal {E}_{n,P}\). For \(n>\max (n_2,n_0)\), we then have
as desired.
Proof of Eq. (15)
Recall, for each \(P\in \Pi _n\), we defined
Assuming that \(P\in \Pi _{n,m,k}\), then \(\Delta _P\le mk\). Note that \(X_P\) is a function of (at most) \(3c \Delta _P\) independent Bernoulli random variables, and changing any one of these Bernoulli random variables can change the value of \(X_P\) by at most 8. McDiarmid’s inequality (McDiarmid 1989) then implies that for any \(t\ge 0\),
Note that if \(\widehat{C}_i(j,k), \widehat{D}_i(j,k), \widehat{D}_i(\sigma _p(j),\sigma _p(k))\in \{-1,1\}\) then
Define
so that \(\Delta _P^{(i,0)}=\Delta _P^{(i,1)}+\Delta _P^{(i,2)}+\Delta _P^{(i,3)}\). We then have
Note that if \(P,Q\in \Pi _{n,m,k}\), then \(X_P=X_Q\) if \(\sigma _p(j)=\sigma _q(j)\) for all \(j\in [m]\); i.e., if there exists a \(U\in \mathcal {P}_{n,m}\) such that \(PU=Q\). Note that this defines an equivalence relation on \(\Pi _{n,m,k}\), which we will denote by “\(\sim\),” and let \(\Pi _{n,m,k}^*\) be a fixed (but arbitrarily chosen) set composed of one member of each equivalence class according to “\(\sim\).” Note that \(|\Pi _{n,m,k}^*|\) is at most \(m^{2k}n^{2k}\). Letting \(t=\mathbb {E}(X_P)\) in Eq. (21), we have that if \(n>\mathfrak {n}=\max (n_0,n_1)\)
as desired.
Proof of Theorem 2
For each \(P\in \Pi _n\), define
where \(\Delta _P\) is defined as in Eq. (13). Suppose that there exists an \(n_1>0\) such that for all \(n>n_1\), we have that for all \(k\in [m=m(n)]\) and all \(P\in \Pi _{n,m,k}\) (where \(\gamma\) is an appropriately chosen constant)
then
where the proof of the bound in Eq. (24) uses “Appendix 1.3” and then the proof follows mutatis mutandis from the proof in “Appendix 1.1.1”.
In the present ER setting, as in Eq. (16), we then have that for \(j=1,2\),
with probability at least
Note that if \(m>6\), then \(mk/3\le \Delta _P\le mk\), so that with probability at least \(1-4e^{-8\log n}\) (this lower bounds Eq. (25) for judiciously chosen \(\xi\) in Eq. (7)),
for both \(j=1,2\). To show Eq. (23) holds here with sufficiently high probability, note that the above bounds yield
where the last inequality follows from Eq. (8). Proving Eq. (9) then follows mutatis mutandis from the proof of the analogue in Theorem 1 (which is spelled out in detail in “Appendix 1.1”), and details are omitted.
Proof details for Eq. (24)
For \(P\in \Pi _n\), define
For \(P\in \Pi _{n,m,k}\), we then have that \(X_P\) defined in Eq. (19) satisfies
Showing that this yields the desired bound combines Eq. (23) with the (nearly identical) proof technique of “Appendix 1.1.1”; details are omitted.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pantazis, K., Sussman, D.L., Park, Y. et al. Multiplex graph matching matched filters. Appl Netw Sci 7, 29 (2022). https://doi.org/10.1007/s41109-022-00464-0
DOI: https://doi.org/10.1007/s41109-022-00464-0
Keywords
 Multiplex graphs
 Graph matching
 Correlation network models
 Matched filters