Bipattern mining of attributed networks
Applied Network Science volume 4, Article number: 37 (2019)
Abstract
Applying closed pattern mining to attributed two-mode networks requires two conditions. First, as two-mode networks contain two kinds of vertices, each described with its own attribute set, we have to consider patterns made of two components, which we call bipatterns. The occurrences of a bipattern form an extension made of a pair of vertex subsets. Second, Formal Concept Analysis and Closed Pattern Mining were recently applied to networks by reducing the extensions of patterns to their cores, according to some core definition. We therefore need appropriate core definitions for two-mode networks, from which closed bipatterns are defined. We describe in this article a general framework to define closed bipattern mining. We also show that this methodology applies as well to cores of directed and undirected networks in which each vertex subset is associated with a specific role. We illustrate the methodology first on a two-mode network of epistemological data, then on a directed advice network of lawyers and finally on an undirected bibliographical network.
Introduction
The first motivation of this article is to extend the Closed Pattern Mining (CPM) and Formal Concept Analysis (FCA) methodologies in order to investigate attributed two-mode networks. Note that there is no difference between the two methodologies in that they enumerate the same closed patterns; however, FCA is also interested in the organization of this result as a conceptual structure. The present work follows previous work in which CPM and FCA were applied to undirected and directed graphs. In what follows we recall the notions on which CPM of attributed networks relies. We then discuss the necessity of defining bipatterns in order to mine two-mode networks.
Most of the work in social and complex network analysis considers unlabelled and undirected networks and is concerned with what may be said about the topological structure of the network. Various ways have been proposed to extract interesting subgraphs. In particular, in the core-periphery model the network is made of a core subgraph, i.e. a dense subgraph whose vertices are highly connected, together with its periphery, made of vertices highly connected to the core but poorly interconnected (Borgatti and Everett 2000). The first formal core definition was the k-core subgraph, which is the greatest subnetwork whose vertices all have degree at least k in the subnetwork (Seidman 1983). By changing the topological property we obtain various core definitions within the generalized cores framework proposed by V. Batagelj (Batagelj and Zaversnik 2011).
Various recent works on complex network analysis take into account information provided as labels on vertices or edges. The network is then called a labelled or attributed network. Recently an approach has been presented that extends CPM and FCA to mine attributed graphs. For that purpose, the vertex subset in which an attribute pattern occurs is reduced to its core subset using some interior operator (Soldano and Santini 2014). Closed patterns computed by applying an interior operator are abstract closed patterns, for which enumeration algorithms exist (Soldano and Ventos 2011). They are called core closed patterns when the interior operator relies on a core definition (Soldano and Santini 2014; Soldano et al. 2017a).
Now, two-mode networks are made of two vertex sets representing in general two kinds of entities, for instance actors and movies, together with edges relating entities of each kind, as for instance “G. Clooney acted in Ocean’s Eleven”. Until recently they were mostly investigated by extracting single-mode networks, relating for instance actors to actors who participated in the same movies. However, Borgatti and Everett (1997) advocated the direct investigation of two-mode networks, and a core definition for two-mode networks has recently been proposed by Cerinsek and Batagelj (2015). Applying core closed pattern mining to such two-mode networks nevertheless requires extending the methodology. The difficulty is that when such a network is attributed, each kind of vertex is described according to its own attribute set. This means that we have to consider patterns made of two attribute subsets, which we call bipatterns, and which select two interconnected vertex subsets that we call the support set pair. This allows us for instance to require actors to be American and movies to be recent, while only considering vertices of a subnetwork in which each actor played in at least 2 movies and each movie is linked to at least 3 actors. Interestingly, such bipatterns may also be defined in the directed case when considering subgraphs in which a single pattern is associated with each of the in and out vertex roles. Finally, we will see that the methodology we propose may also apply to undirected networks provided we can dynamically define two different roles in the network, here considering on the one hand high-degree nodes and on the other hand their neighbours.
Note that in order to properly define bipattern mining we also need to extract cores from subgraphs induced by vertex subset pairs. This also means defining cores made of two vertex subsets, which goes beyond the generalized cores definition.
On the computational side, we adapt the general core extraction algorithm to our new core definitions and we propose a closed bipattern enumeration algorithm that we have implemented within the minerLC software (Footnote 1). We have tested the resulting program on three networks. The first network is an epistemological two-mode network relating deep sea exploration campaigns to their participants (Bary 2018). The second network is a lawyers network in which directed links represent lawyers asking for advice from other lawyers (Lazega 2001); it was previously used to illustrate closed pattern mining of attributed directed networks (Soldano et al. 2017b). The third one is an undirected coauthoring bibliographical network investigated in (Galbrun et al. 2014).
Finally, there may be a large number of bipatterns to extract from directed and undirected networks when compared to single patterns: any pair of core closed single patterns is a candidate core closed bipattern. We therefore propose to focus on bipatterns whose two components, which are expressed in the same pattern language, are different enough. For that purpose we define a homogeneity measure and select inhomogeneous bipatterns.
This work was presented in a workshop article (Soldano et al. 2018) in which the bipattern methodology was first introduced for two-mode and directed networks. The present article also introduces the star-satellite core definition for undirected networks and discusses the bipatterns extracted and selected from a bibliographical network, exhibiting in particular some cooperation and competition examples in the pattern mining research domain. Overall, the main contributions of this work may be summarized as follows:

A general definition of closed bipattern mining.

A general algorithm for closed bipattern enumeration.

A new definition of the core of a network as a pair of vertex subsets.

A general algorithm to extract such new cores.

A definition of homogeneity for bipatterns.

The methodology of core closed bipattern mining of attributed networks, including core definitions designed respectively for two-mode, directed and undirected networks.
“Related work” section discusses related work. “Preliminaries” section gives preliminary definitions and results on core Closed Pattern Mining. In “Biconcept lattices and abstract closed bipatterns” section we introduce abstract closed bipattern mining and abstract biconcept lattices. In “Cores as subset pairs and core closed bipattern mining” section we extend the definition of cores in order to obtain two-component cores and consequently define core closed bipattern mining. In “Core definitions: twoMode, directed and undirected networks” section we introduce such two-component cores for two-mode, directed and undirected networks. In “Computing the interior of (X_{1},X_{2}) and enumerating abstract closed bipatterns” section we provide algorithms to compute two-component cores and to enumerate the associated closed bipatterns. Finally, in “Experiments” section we present the results obtained on the three networks mentioned above and discuss the scalability of this pattern mining methodology.
Related work
Analyzing attributed graphs has led to various ways of extracting cohesive subgraphs. First, various pattern mining works investigated mining patterns as pairs of constraints on topology and labels, and ranked them according to interestingness measures (Mougel et al. 2012; Silva et al. 2012). This includes abstract closed pattern mining mentioned above as well as work coming from the subgroup discovery field in which selection and pruning of interesting patterns is performed during enumeration (Atzmueller et al. 2016). A second way consists in extending community detection algorithms by taking into account both topology and attribute information. Various definitions of hybrid objective functions and efficient ways to find optimal solutions have been proposed. In most cases the result is a set of non-overlapping communities (Baroni et al. 2017; Sánchez et al. 2015; Combe et al. 2015). The overlapping case has been addressed by soft clustering schemes (Xu et al. 2012), by hard clustering of the edge set (Galbrun et al. 2014) or by building generative models in such a way that a node may freely belong to several communities (Yang et al. 2013). Finally, network embedding algorithms have been proposed to learn an appropriate representation of nodes as vectors, and then apply standard clustering methods (Gao and Huang 2018).
In all these approaches, when considering the relationship between attributes and nodes, the latter have a unique role. This is obviously not appropriate for two-mode networks, while in single-mode networks allowing nodes to have different roles within a group may lead to a more flexible way to define cohesive subnetworks. What we propose here, beyond the extension of the core closed pattern methodology to bipatterns, is a first step in revisiting the various methodologies mentioned above.
Regarding core definitions, recent works have proposed definitions designed to investigate directed networks. In particular a core definition has been proposed in Giatsidis et al. (2013) to investigate collaboration within directed networks. The requirement is then that both indegrees and outdegrees of vertices have to be higher than thresholds; therefore all nodes in the core are required to have both the out and in roles. A different kind of core is related to the Hub-Authority idea, which considers that a vertex may be prominent in a network according to only one or both of its out or in roles (Kleinberg 1999). The HA-core has been recently defined in order to express this idea (Soldano et al. 2017b).
Preliminaries
Abstract closed pattern mining and concept lattices
The closed pattern mining and Formal Concept Analysis (FCA) frameworks consider the occurrences of patterns in a set of objects V. The pattern language L is partially ordered in such a way that if q^{′}≥q, i.e. q^{′} is more specific than q, then whenever q^{′} occurs in object v, q also occurs in v. The set of occurrences ext(q) of a pattern q, i.e. the object subset in which q occurs, is called its support set or its extension in V. The purpose common to Closed Pattern Mining and FCA is then to represent, in a condensed way, the set of definable subsets of V, i.e. subsets which are pattern support sets.
Enumerating the definable subsets of V comes down to enumerating the equivalence classes of patterns, where two patterns are equivalent when they have the same support set. Whenever the pattern language is a finite lattice there is a unique most specific pattern in each class. Recall that a lattice is such that any pair a,b of elements has both a join a∨b and a meet a∧b. The meet a∧b is the unique greatest lower bound of a and b, i.e. a∧b≤a, a∧b≤b and there is no c>a∧b which is a lower bound of both a and b. In a dual way the join a∨b is the least upper bound of a and b. When considering an equivalence class of patterns, the most specific element of the class is its greatest element, i.e. the join of all its elements. This most specific pattern then represents what is common to all the objects of the shared support set.
In the case of powersets, the order is the inclusion order ⊆ and join and meet respectively are set theoretical union and intersection. In standard FCA the pattern language L is the powerset 2^{I} of a set I of binary attributes and the extensional space is the powerset 2^{V} of the object set V. However, for our purpose of defining and mining bipatterns we need a more general presentation. First, we define below closure operators together with their dual interior operators.
Definition 1
Let S be an ordered set and f:S→S a self map such that for any x,y∈S, f is monotone, i.e. x≤y implies f(x)≤f(y) and idempotent, i.e. f(f(x))=f(x), then if f(x)≥x, f is called a closure operator while if f(x)≤x, f is called an interior operator.
Formal Concept Analysis goes beyond enumeration of closed patterns: FCA considers knowledge discovery as the process of discovering the ordering structure of the data to analyse. It relies primarily on the Galois connection (Footnote 2) between the pattern language and the powerset of objects:
Proposition 1
Let (L,≤) be a lattice called the pattern language, V be a set of objects and d:V→L be an operator that describes the object x as an element d(x) of L. Let ext(q)={x∈V∣q≤d(x)} be the subset of objects in which pattern q occurs. Then

\(\text {int}(V^{\prime })= \bigwedge _{x \in V^{\prime }} d(x)\) is the greatest element of L which occurs in V^{′}

(int,ext) define a Galois connection on (2^{V},L)
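As an illustration of Proposition 1 in the standard FCA setting, the following minimal Python sketch takes L to be the powerset of a set of binary items. The object descriptions in D, the item set ITEMS and all function names are hypothetical and chosen only for illustration; they do not come from the article. The check in is_galois_connection assumes the usual antitone formulation, i.e. V′⊆ext(q) if and only if q≤int(V′).

```python
from itertools import chain, combinations

# Hypothetical toy data: each object x is described by an itemset d(x).
ITEMS = frozenset("abc")
D = {1: frozenset("ab"), 2: frozenset("abc"), 3: frozenset("a"), 4: frozenset("bc")}

def ext(q):
    """Objects in which pattern q occurs, i.e. q is included in d(x)."""
    return frozenset(x for x, d in D.items() if frozenset(q) <= d)

def intent(objs):
    """Greatest pattern occurring in every object of objs (meet of the d(x))."""
    result = set(ITEMS)  # the meet over an empty family is the top pattern
    for x in objs:
        result &= D[x]
    return frozenset(result)

def closure(q):
    """Most specific pattern with the same support set as q."""
    return intent(ext(q))

def powerset(s):
    s = list(s)
    subsets = chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
    return [frozenset(c) for c in subsets]

def is_galois_connection():
    """Exhaustively check: objs <= ext(q)  iff  q <= intent(objs)."""
    return all((objs <= ext(q)) == (q <= intent(objs))
               for objs in powerset(D) for q in powerset(ITEMS))

print(sorted(ext("c")), sorted(closure("c")))  # [2, 4] ['b', 'c']
```

On this toy data, pattern c occurs in objects 2 and 4, whose descriptions share bc, so bc is the closed pattern of the class of c.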
In what follows we will use interior operators to define the general framework of abstract concept lattices. First we recall a general result (Pernelle et al. 2002; Soldano and Ventos 2011) together with a corollary defining abstract concept lattices:
Proposition 2
Let X and L be two lattices, (int,ext) be a Galois connection on (X,L) and p be an interior operator on X. Let A=p[X] be the image of X under p, then (int,p∘ext) is a Galois connection on (A,L).
Corollary 1
i) f=int∘p∘ext is a closure operator on L

ii) h=p∘ext∘int is a closure operator on A

iii) The set of the (e,c) pairs where c=f(c)=int(e) and e=h(e)=p∘ext(c) forms a lattice, ordered following A.
Such a pair (e,c) is called a concept: e is its (abstract) extent while c is its intent, i.e. the abstract closed pattern whose abstract support set p∘ext(c) is e. As the new equivalence relation is coarser, i.e. ext(q)=ext(q^{′}) implies p∘ext(q)=p∘ext(q^{′}), there are fewer abstract closed patterns than closed patterns.
Abstract closed pattern mining is illustrated in Example 5 of Appendix 2.
Cores and closed pattern mining of attributed networks
Now, consider the object set as the vertex set V of some graph whose vertices are each labelled by a description in a pattern language. Defining the essential part of a graph, i.e. its core subgraph, relies on all vertices satisfying some boolean property. Let G=(V,E) be a graph. A core property P is defined as a mapping P:V×2^{V}→{true,false} where P(v,X) is true whenever vertex v satisfies some condition within the subgraph G_{X} induced by the vertex subset X. The core subgraph of a graph (V,E) is then defined as the subgraph \(G_{V^{\prime }}\) induced by the largest vertex subset V^{′}, also called its core, whose vertices v all have property P(v,V^{′}).
To define a core, we need P to be such that there does exist a largest vertex subset with property P. This is true whenever P is monotone, i.e. for any x∈X_{1}⊆X_{2}, P(x,X_{1}) implies P(x,X_{2}) (Batagelj and Zaversnik 2011; Soldano and Santini 2014). The following result then allows abstract FCA to be applied to graphs:
Proposition 3
The operator that reduces a vertex subset V^{′} of a graph G to the core of the subgraph \(G_{V^{\prime }}\) is an interior operator on 2^{V}.
As a result, abstract concept lattices together with closure operators are defined in such a way that each extent p∘ext(c) is a core while the associated intent c is the most specific pattern that occurs in this core. Abstract closed pattern mining has been applied to undirected networks (Soldano et al. 2017a) as well as directed networks (Soldano et al. 2017b).
Example 1
We consider the small attributed graph displayed in Fig. 1 and the 2-core property, which states that in a core subgraph all vertices have degree at least 2. The support set ext(a) of pattern a is then 123457. The 2-core of the subgraph induced by ext(a) is 123: when adding to 123 any vertex v among 457, the degree of v in G_{123v} is strictly less than 2. Therefore, p∘ext(a)=123 is the core support set of a. The corresponding core closed pattern is then int(123)=ab∩ab∩ab, i.e. the greatest pattern common to the vertices of this 2-core.
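The computation of Example 1 can be sketched in a few lines of Python. Since Fig. 1 is not reproduced here, the edge list and labels below are hypothetical: they were chosen only to agree with the facts stated in the example (ext(a)=123457, 2-core 123, vertices 1, 2 and 3 labelled ab), and the actual graph of Fig. 1 may differ. The function names are likewise illustrative.

```python
# Hypothetical graph and labels, chosen to agree with the facts of Example 1.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
LABELS = {1: "ab", 2: "ab", 3: "ab", 4: "a", 5: "a", 6: "b", 7: "a"}
ITEMS = "ab"

ADJ = {v: set() for v in LABELS}
for u, v in EDGES:
    ADJ[u].add(v)
    ADJ[v].add(u)

def ext(q):
    """Vertices whose label contains every item of pattern q."""
    return {v for v, d in LABELS.items() if set(q) <= set(d)}

def k_core(vertices, k):
    """Interior operator p: largest subset whose induced degrees are all >= k."""
    s = set(vertices)
    while True:
        keep = {v for v in s if len(ADJ[v] & s) >= k}
        if keep == s:
            return s
        s = keep

def intent(vertices):
    """Greatest pattern common to the given vertices."""
    result = set(ITEMS)
    for v in vertices:
        result &= set(LABELS[v])
    return result

def core_closure(q, k=2):
    """f = int o p o ext: the core closed pattern associated with q."""
    return intent(k_core(ext(q), k))

print(sorted(k_core(ext("a"), 2)), sorted(core_closure("a")))  # [1, 2, 3] ['a', 'b']
```

Vertices 7, 5 and then 4 are peeled off in turn, which is why the interior is computed as a fixed point rather than by a single degree filter.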
Summary
We have briefly presented standard closed pattern mining and FCA together with abstract closed pattern mining, in which the support set ext(q) of a pattern q is reduced to its abstract support set p∘ext(q) where p is an interior operator. The abstract closed pattern c associated with q is then the most specific pattern with the same abstract support set. We have then c=int∘p∘ext(q) where the intersection operator int intersects the descriptions of the objects in p∘ext(q). Then we have introduced core closed pattern mining, in which p reduces a vertex subset to the core of its induced subgraph. Any such core definition, including the well-known k-core, relies on a core property P such that P(v,S) holds for all vertices v of the core S. In order to be a core property, P is required to satisfy a monotonicity condition. Core closed pattern mining then consists in enumerating the set of core closed patterns in an attributed graph.
Biconcept lattices and abstract closed bipatterns
This section is motivated by the extension of core closed pattern mining to two-mode networks, i.e. networks in which each edge relates a vertex from a vertex set V_{1} to a vertex from a vertex set V_{2}. The vertices may then be described in two different pattern languages L_{1} and L_{2}. This requires extending the closed pattern mining and FCA methodology to patterns made of two components, which we call bipatterns. A way to properly define such bipatterns is to first extend the concept lattices of FCA. For that purpose, we need to consider lattice products, from which we obtain a new Galois connection.
Lattice products are also lattices according to the so-called Cartesian ordering:
Proposition 4
Let (X_{1},≤_{1},∨_{1},∧_{1}) and (X_{2},≤_{2},∨_{2},∧_{2}) be two lattices, and consider the cartesian product X=X_{1}×X_{2} together with the binary relation ≤ defined as (x_{1},x_{2})≤(y_{1},y_{2}) iff x_{1}≤_{1}y_{1} and x_{2}≤_{2}y_{2}. Then (X,≤,∨,∧) is a lattice with join and meet defined as:

(x_{1},x_{2})∨(y_{1},y_{2})=(x_{1}∨_{1}y_{1},x_{2}∨_{2}y_{2})

(x_{1},x_{2})∧(y_{1},y_{2})=(x_{1}∧_{1}y_{1},x_{2}∧_{2}y_{2})
We may then build a Galois connection on lattice products (see proof in Appendix 1):
Proposition 5
Let X=X_{1}×X_{2} and L=L_{1}×L_{2} be two lattice products, and let (int_{1},ext_{1}) and (int_{2},ext_{2}) be Galois connections on the respective lattice pairs (X_{1},L_{1}) and (X_{2},L_{2}). Consider the mappings int and ext on X and L such that:

int(x_{1},x_{2})=(int_{1}(x_{1}),int_{2}(x_{2}))

ext(l_{1},l_{2})=(ext_{1}(l_{1}),ext_{2}(l_{2}))
then (int,ext) define a Galois connection on (X,L).
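The argument behind Proposition 5 can be sketched as a short chain of equivalences. This is only an outline, written under the assumption that a Galois connection is understood in the usual antitone sense of FCA, i.e. x≤ext(l) if and only if l≤int(x); the full proof is the one given in Appendix 1.

```latex
% Sketch, assuming the antitone formulation: x <= ext(l) iff l <= int(x).
% The order on each product is the cartesian ordering of Proposition 4.
\begin{align*}
(x_{1},x_{2}) \le \mathrm{ext}(l_{1},l_{2})
  &\iff x_{1} \le \mathrm{ext}_{1}(l_{1}) \ \text{and}\ x_{2} \le \mathrm{ext}_{2}(l_{2}) \\
  &\iff l_{1} \le \mathrm{int}_{1}(x_{1}) \ \text{and}\ l_{2} \le \mathrm{int}_{2}(x_{2}) \\
  &\iff (l_{1},l_{2}) \le \mathrm{int}(x_{1},x_{2}).
\end{align*}
```

The first and last steps only unfold the componentwise definitions of ext, int and the product order, while the middle step applies the two given Galois connections separately.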
In what follows we consider two Galois connections as defined in Proposition 1 and use an interior operator to create the dependency between the two components of the extent which is necessary to represent cores of two-mode networks.
Proposition 2 states that applying an interior operator to a lattice involved in a Galois connection preserves the connection. The interior operator in the biconcept case applies to a pair of object subsets, i.e. has domain \(X=2^{V_{1}}\times 2^{V_{2}}\):
Definition 2
Let (int,ext) be the Galois connection on (X,L) as defined in Proposition 5 and let p be an interior operator on X. Then, the lattice of the Galois connection (int,p∘ext) on (p[X],L) is called an abstract biconcept lattice.
The intents of the biconcepts defined this way are what we call abstract closed bipatterns. As in abstract closed (single) pattern mining, each abstract closed bipattern c is the most specific bipattern among those sharing the abstract support set pair p∘ext(c), where p is an interior operator. However, bipattern occurrences are gathered in object subset pairs, while the closure and interior operators are self-maps on lattice products. Abstract closed bipattern mining is illustrated in Example 6 of Appendix 2.
In what follows we apply this methodology to attributed graphs and for that purpose we define such interior operators with respect to pairs of logical properties and use them to give a new definition of cores as vertex subset pairs.
Cores as subset pairs and core closed bipattern mining
In what follows we consider the subnetwork induced by a pair of vertex subsets (W_{1},W_{2}). When considering W_{1}⊆V_{1} and W_{2}⊆V_{2} we simply write (W_{1},W_{2})≤(V_{1},V_{2}) or call (W_{1},W_{2}) a subset pair of (V_{1},V_{2}). The following definition may be applied to a two-mode network (V_{1},V_{2},E) as well as to a single-mode network by considering V=V_{1}=V_{2}.
Definition 3
Let G=(V_{1},V_{2},E) be a network, the subnetwork induced by the subset pair (W_{1},W_{2}) is the network \(G_{(W_{1},W_{2})}=(W_{1},W_{2},E^{\prime })\) where E^{′} is the edge subset relating vertices from W_{1} to vertices from W_{2}.
To obtain an interior operator, we need to define monotone properties in this context:
Definition 4
\(P_{1}: V_{1}\times 2^{V_{1}} \times 2^{V_{2}} \rightarrow \{true,false\}\) is said to be monotone if and only if for any w∈V_{1} and any subset pairs (W_{1},W_{2}) and \((W_{1}^{\prime },W_{2}^{\prime })\geq (W_{1},W_{2})\), \(P_{1}(w,W_{1},W_{2})\ \text {implies}\ P_{1}\left (w,W_{1}^{\prime },W_{2}^{\prime }\right)\).
In the same way, P_{2} defined on \(V_{2}\times 2^{V_{1}} \times 2^{V_{2}}\) is monotone whenever for any \(w \in W_{2}, P_{2}(w,W_{1},W_{2})\ \text {implies}\ P_{2}\left (w,W_{1}^{\prime },W_{2}^{\prime }\right)\).
Cores will then be defined thanks to the following result (see proof in Appendix 1):
Proposition 6
Let (P_{1},P_{2}) be a pair of monotone properties, and (W_{1},W_{2}) be a subset pair of (V_{1},V_{2}). Then there exists a greatest subset pair (S_{1},S_{2})≤(W_{1},W_{2}) such that P_{1}(v_{1},S_{1},S_{2}) holds for all elements v_{1} of S_{1} and P_{2}(v_{2},S_{1},S_{2}) holds for all elements v_{2} of S_{2}.
We will further call this subset pair (S_{1},S_{2}) the core subset pair of (W_{1},W_{2}) and define core subgraphs accordingly:
Definition 5
Let G=(V_{1},V_{2},E) be a network, and (P_{1},P_{2}) be a pair of monotone properties. The subnetwork \(G_{(S_{1},S_{2})}\) induced by the core subset pair (S_{1},S_{2}) is called the core subnetwork of G.
We benefit then from a result similar to Proposition 3:
Proposition 7
The operator that reduces a subset pair (W_{1},W_{2})≤(V_{1},V_{2}) to its core subset pair (S_{1},S_{2}) is an interior operator on \(2^{V_{1}}\times 2^{V_{2}}\).
Summary
In the same way as in core closed pattern mining, given some bipattern q=(q_{1},q_{2}) we may compute its core support set pair p∘ext(q) where p is an interior operator. This interior operator relies on a pair of core properties that are each required to be monotone. The associated core closed bipattern c=(c_{1},c_{2}) is obtained by intersecting, componentwise, the vertex descriptions in p∘ext(q). Enumerating these core closed bipatterns defines the bipattern mining task. In the next section we consider various core definitions to apply bipattern mining to two-mode, directed and undirected attributed graphs. Note that cores are here vertex subset pairs, which extends the previous (single) core notion referred to in “Cores and closed pattern mining of attributed networks” section.
Core definitions: twoMode, directed and undirected networks
Twomode network cores
According to these new definitions, we first define the h-a BHA-core of a two-mode network:
Definition 6
The h-a BHA-core of the two-mode network G is defined through the following pair of core properties:

P_{1}(v,X_{1},X_{2}) holds if and only if the degree of v∈X_{1} in \(G_{(X_{1}, X_{2})}\) is at least h.

P_{2}(v,X_{1},X_{2}) holds if and only if the degree of v∈X_{2} in \(G_{(X_{1}, X_{2})}\) is at least a.
P_{1} and P_{2} are clearly monotone and therefore the h-a BHA-core is properly defined. This core definition is equivalent to the definition presented by Cerinsek and Batagelj (2015), in which the p-q BHA-core is called the (p,q)-core. We provide hereunder an example of an attributed two-mode network together with the set of closed bipatterns associated with its h-a BHA-cores.
Example 2
We consider the two-mode network pictured on the leftmost part of Fig. 2. The two vertex sets are V_{1}={l_{1},l_{2},l_{3}} and V_{2}={r_{1},r_{2},r_{3}}. Vertices of V_{1} are labelled by subsets of I_{1}={a,b,c,d} while vertices of V_{2} are labelled by subsets of I_{2}={w,x,y,z}.
The most general bipattern (∅,∅) occurs in the whole network. Its 2-2 BHA-core is displayed in the middle of Fig. 2 and is induced by (l_{1}l_{2},r_{1}r_{2}r_{3}). The corresponding closed bipattern is then int(l_{1}l_{2},r_{1}r_{2}r_{3})=(ab,wx). When adding attributes to this bipattern we obtain subnetworks whose 2-2 BHA-core is empty, except when adding y to wx. The corresponding bipattern (ab,wxy) occurs in (l_{1}l_{2}l_{3},r_{1}r_{3}), whose 2-2 BHA-core is displayed in the rightmost part of Fig. 2 and has vertex set pair (l_{1}l_{2},r_{1}r_{3}). This bipattern is closed, as no item can be added without losing some vertex. Furthermore, adding any item to (ab,wxy) results in an empty 2-2 BHA-core. The corresponding biconcept lattice is therefore the total ordering of the 3 biconcepts ((l_{1}l_{2},r_{1}r_{2}r_{3}),(ab,wx)), ((l_{1}l_{2},r_{1}r_{3}),(ab,wxy)) and ((∅,∅),(abcd,wxyz)). See also Fig. 4 for the search tree developed for this example by the algorithm we propose in “Bipattern enumeration” section.
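The core support set pairs and closed bipatterns of Example 2 can be recomputed with a short Python sketch. Fig. 2 is not reproduced here, so the edges and vertex labels below are hypothetical: they form one small network consistent with every fact stated in the example, and the network actually drawn in Fig. 2 may differ. The function names are illustrative as well.

```python
# Hypothetical two-mode network consistent with the facts stated in Example 2.
EDGES = {("l1", "r1"), ("l1", "r2"), ("l1", "r3"),
         ("l2", "r1"), ("l2", "r2"), ("l2", "r3"), ("l3", "r1")}
D1 = {"l1": set("abc"), "l2": set("abd"), "l3": set("ab")}
D2 = {"r1": set("wxy"), "r2": set("wxz"), "r3": set("wxy")}
I1, I2 = set("abcd"), set("wxyz")

def bha_core(x1, x2, h, a):
    """Greatest (S1, S2) <= (x1, x2) with degrees >= h in S1 and >= a in S2."""
    s1, s2 = set(x1), set(x2)
    while True:
        keep1 = {v for v in s1 if sum((v, r) in EDGES for r in s2) >= h}
        keep2 = {v for v in s2 if sum((l, v) in EDGES for l in s1) >= a}
        if keep1 == s1 and keep2 == s2:
            return s1, s2
        s1, s2 = keep1, keep2

def ext(q1, q2):
    """Support set pair of the bipattern (q1, q2)."""
    return ({v for v, d in D1.items() if q1 <= d},
            {v for v, d in D2.items() if q2 <= d})

def intent(s1, s2):
    """Componentwise intersection of the vertex descriptions."""
    return (I1.intersection(*(D1[v] for v in s1)),
            I2.intersection(*(D2[v] for v in s2)))

def core_closure(q1, q2, h=2, a=2):
    """Return the core closed bipattern of (q1, q2) and its core support set pair."""
    core = bha_core(*ext(q1, q2), h, a)
    return intent(*core), core

print(core_closure(set(), set()))
print(core_closure(set("ab"), set("wxy")))
```

With this data the two calls return (ab,wx) on (l_{1}l_{2},r_{1}r_{2}r_{3}) and (ab,wxy) on (l_{1}l_{2},r_{1}r_{3}), while any bipattern with an empty core is mapped to the bottom bipattern (abcd,wxyz).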
Now, let G=(V,E) be a single-mode network; we may still consider the subgraph induced by a pair of vertex subsets according to Definition 3. This leads to core definitions for undirected and directed networks in which vertices may have two roles.
Directed network cores : the hub and authority roles
In the directed case we reconsider the property pair of Definition 6 as a property on directed networks and obtain a BHA-core definition that extends the hub-authority core defined in Soldano et al. (2017b). We begin with a definition of the h-a BHA-core for directed networks:
Definition 7
The h-a BHA-core of the directed network G is defined through the following pair of core properties:

P_{1}(v,X_{1},X_{2}) holds if and only if the outdegree of v∈X_{1} in \(G_{(X_{1}, X_{2})}\) is at least h.

P_{2}(v,X_{1},X_{2}) holds if and only if the indegree of v∈X_{2} in \(G_{(X_{1}, X_{2})}\) is at least a.
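A minimal Python sketch of Definition 7 follows. The arc set is a hypothetical toy example and the function name is illustrative; the fixed-point loop is one simple way to obtain the greatest subset pair satisfying both properties, not necessarily the procedure used in the authors' software.

```python
# Hypothetical directed toy network: an arc (u, v) goes from u to v.
ARCS = {(1, 3), (1, 4), (2, 3), (2, 4), (5, 1)}

def directed_bha_core(x1, x2, h, a):
    """Greatest (S_H, S_A) <= (x1, x2) such that every vertex of S_H has
    outdegree >= h towards S_A and every vertex of S_A has indegree >= a
    from S_H, arcs being counted in the subnetwork induced by the pair."""
    s1, s2 = set(x1), set(x2)
    while True:
        keep1 = {v for v in s1 if sum((v, w) in ARCS for w in s2) >= h}
        keep2 = {v for v in s2 if sum((u, v) in ARCS for u in s1) >= a}
        if keep1 == s1 and keep2 == s2:
            return s1, s2
        s1, s2 = keep1, keep2

V = {1, 2, 3, 4, 5}
hubs, authorities = directed_bha_core(V, V, h=2, a=2)
print(sorted(hubs), sorted(authorities))  # [1, 2] [3, 4]
```

Here the same vertex set V is used for both components, so a vertex may end up with the hub role, the authority role, both, or neither: vertex 5 has a single outgoing arc and is dropped from the hubs, and vertex 1 has a single incoming arc and is dropped from the authorities.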
The BHA-core of directed networks extends the hub-authority (HA) core definition:
Proposition 8
Let G=(V,E) be a directed network, let (S_{H},S_{A}) be its h-a BHA-core, and let H∪A be its h-a HA-core, where H and A are its hub and authority vertex subsets. Let p_{BHA} and p_{HA} be the core operators respectively associated with the BHA-core and the HA-core. We have then:
for any vertex subset X.
Undirected network cores: the star and satellite roles
The previous section showed that bipattern mining can be applied to directed networks provided each bipattern component is associated with one of the in and out roles of the vertices. In what follows we extend the k-near-star core, which was defined on undirected networks (Soldano and Santini 2014), by exploiting the two roles it relies on. This new core is called the k-StSa core, referring to the “Star” and “Satellite” roles: a star vertex is required to have degree at least k while its neighbours have the satellite role. The k-StSa core subgraph is then the subgraph induced by its Star and Satellite vertex subsets as defined below:
Definition 8
The k-StSa core of the undirected network G is defined through the following pair of core properties:

P_{1}(v,X_{1},X_{2}) holds if and only if the degree of v∈X_{1} in \(G_{(X_{1}, X_{2})}\) is at least k.

P_{2}(v,X_{1},X_{2}) holds if and only if there exists some edge xv such that P_{1}(x,X_{1},X_{2}) holds.
In the corresponding core subset pair (St,Sa), St is called the star vertex subset and Sa the satellite vertex subset.
Star-satellite bipattern mining will be exemplified on an undirected bibliographical network in “StarSatellite bipatterns in a bibliographical network” section.
Computing the interior of (X_{1},X_{2}) and enumerating abstract closed bipatterns
Computing interiors
We present now the generic algorithm Interior that computes the interior p(X_{1},X_{2})=(S_{1},S_{2}) associated with the pair of monotone properties (P_{1},P_{2}). In the bipartite case, i.e. when V_{1}∩V_{2}=∅, the algorithm is basically a rewriting of the algorithm proposed in Cerinsek and Batagelj (2015). When considering X_{1}=X_{2}, Interior is similar to the algorithm proposed in Soldano et al. (2017b) to compute the directed HA-core. Let n be the number of vertices and m be the number of edges. The algorithm performs at most n iterations, while the inner loop needs \(\mathcal {O}(m)\) operations provided that evaluating the properties only requires access to the neighbourhood of each vertex. The overall complexity is then \(\mathcal {O}(m*n)\). A more efficient algorithm in \(\mathcal {O}(m * \max (\Delta, \log n))\), where Δ is the highest degree within the graph, is obtained by adapting the variant cited in Batagelj and Zaversnik (2011), which uses two heaps as data structures for the vertex subset associated with each mode.
The following example illustrates how Interior computes the StSa core of an undirected network:
Example 3
Let G=(V,E) be an undirected graph with V=12345 and E={12,13,23,34,45}. We consider its 3-StSa core. Execution of Interior(V,V) starts with S_{1}=S_{2}=12345 and results in the following iterations:

1. Z_{1}=12345 and Z_{2}=12345. Vertices 1245 are removed from S_{1} as their degree in G is less than 3, while 3 and 5 are removed from S_{2} as in G there is neither an edge x3 nor an edge x5 such that the degree of x is at least 3.

2. Z_{1}=3 and Z_{2}=124. No vertex is removed from S_{1}=Z_{1} as the degree of vertex 3 in G_{(3,124)} is still 3. In the same way no vertex is removed from S_{2}=Z_{2} as the undirected edges 31, 32, 34 are in G_{(3,124)}. As a result Z_{1}=S_{1} and Z_{2}=S_{2} and the iterations stop.
We note in this example that i) only one iteration is necessary to converge, which is always the case when computing k-StSa cores, and ii) St=3 and Sa=124 are disjoint, but this is not necessarily the case: for instance, after adding edges 46 and 47 to G we obtain St=34 and Sa=1234567, as 3 and 4 are both stars and neighbours of each other (Fig. 3).
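The iterations of Example 3 can be reproduced with the following Python sketch of a generic fixed-point computation. It is a naive version written for readability, corresponding to the simple iteration scheme rather than the heap-based variant, and the names interior and stsa_properties are illustrative rather than taken from minerLC. In each pass both properties are evaluated on the same pair (Z_{1},Z_{2}), as in the example.

```python
def interior(x1, x2, p1, p2):
    """Greatest (S1, S2) <= (x1, x2) such that p1 holds on S1 and p2 on S2.
    p1 and p2 are assumed to be monotone, which guarantees convergence
    to the core subset pair."""
    s1, s2 = set(x1), set(x2)
    while True:
        z1, z2 = frozenset(s1), frozenset(s2)
        s1 = {v for v in z1 if p1(v, z1, z2)}
        s2 = {v for v in z2 if p2(v, z1, z2)}
        if s1 == z1 and s2 == z2:
            return z1, z2

def stsa_properties(edges, k):
    """Build the (P1, P2) pair of the k-StSa core for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def p1(v, x1, x2):
        # star role: at least k neighbours among the candidate satellites
        return len(adj.get(v, set()) & x2) >= k

    def p2(v, x1, x2):
        # satellite role: adjacent to some candidate star satisfying p1
        return any(p1(x, x1, x2) for x in adj.get(v, set()) & x1)

    return p1, p2

# Graph of Example 3
V = {1, 2, 3, 4, 5}
E = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
stars, satellites = interior(V, V, *stsa_properties(E, 3))
print(sorted(stars), sorted(satellites))  # [3] [1, 2, 4]

# Same graph with edges 46 and 47 added
V2 = V | {6, 7}
E2 = E + [(4, 6), (4, 7)]
stars2, satellites2 = interior(V2, V2, *stsa_properties(E2, 3))
print(sorted(stars2), sorted(satellites2))  # [3, 4] [1, 2, 3, 4, 5, 6, 7]
```

Passing the properties as plain functions keeps interior independent of the core definition, so the same loop can serve for BHA-cores by swapping in degree-based properties.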
Bipattern enumeration
We focus now on abstract closed bipattern enumeration; building the biconcept lattice is therefore left as a postprocessing step. The enumeration follows the same process as abstract closed pattern enumeration, i.e. the efficient divide and conquer scheme described in Boley et al. (2010) as implemented in the minerLC software. The adaptation is straightforward: the closure operator is now f_{A}=int∘p∘ext where p is the interior operator as defined above. To perform enumeration of abstract closed bipatterns we specialize each abstract closed bipattern (q_{1},q_{2}) by adding either an element of I_{1} to q_{1} or an element of I_{2} to q_{2}.
The algorithm bipatterns is described below with the following notations:
Let q=(q_{1},q_{2}) be a bipattern. i) add(i,q) returns (q_{1}∪i,q_{2}) when i∈I_{1} and (q_{1},q_{2}∪i) when i∈I_{2}. ii) minus(I,q) returns the set of items which belong neither to the left part nor to the right part of the bipattern q=(q_{1},q_{2}), i.e. minus(I,q)=(I_{1}∖q_{1})∪(I_{2}∖q_{2}). iii) The exclusion pair list EL is a subset pair of (I_{1},I_{2}).
Example 4
We follow on from Example 2 and consider s=1 as the minimum support. The algorithm starts by computing the 22 HAcore G_{c} of the whole graph G. G and G_{c} are displayed respectively on the left and in the middle of Fig. 2. Function enum is then called with the core closed pattern q=int(vs(G_{c}))=int(l_{1}l_{2},r_{1}r_{2}r_{3})=(ab,wx). It first outputs the pair ((ab,wx),(l_{1}l_{2},r_{1}r_{2}r_{3})), and then adds to q in turn each item in minus(I,q)=(cd,yz):

add(c,q)=(abc,wx) selects a subgraph whose core is empty. As a result the branch is pruned, as smaller subgraphs would also result in an empty core.

add(d,q)=(abd,wx) also selects a subgraph whose core is empty.

add(y,q)=(ab,wxy) selects (l_{1}l_{2}l_{3},r_{1}r_{3}), whose core, displayed on the right of Fig. 2, has vertex set (l_{1}l_{2},r_{1}r_{3}). The core closed bipattern q_{x}=(ab,wxy) is computed and, as it has a null intersection with the empty list EL, leads to another recursive call of enum. This call will output the pair (q_{x},(l_{1}l_{2},r_{1}r_{3})), but there will be no deeper recursive calls as 22 HA structures with strictly fewer than four nodes are excluded. EL is then set to {y} prior to the next iteration.

add(z,q)=(ab,wxz) selects a subgraph whose core is empty.
As enum ends, bipatterns also ends. The two closed bipatterns that have been output are the most specific bipatterns that occur respectively in the 22 BHAcores displayed in the middle and on the right of Fig. 2. The search tree is represented in Fig. 4.
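The divide and conquer scheme with an exclusion list can be sketched in Python as follows. This is a minimal illustration, not the MinerLC implementation: the function names are hypothetical, the size of a core is assumed to be the total number of vertices in the pair, and every tried item is added to the exclusion set, as in the usual formulation of the scheme. Since the graph of Example 2 is not reproduced here, the sketch runs on the small data of Example 6 (Appendix 2), where the interior is the join of the abstraction elements lying below the extension.

```python
def enumerate_bipatterns(closure, items1, items2, min_size=1):
    """Yield each closed bipattern once, together with its core."""
    def size(core):
        return len(core[0]) + len(core[1])

    def enum(q, core, excluded):
        yield q, core
        excluded = set(excluded)  # private copy for this branch
        candidates = [(0, i) for i in sorted(items1 - q[0])]
        candidates += [(1, i) for i in sorted(items2 - q[1])]
        for side, i in candidates:
            if (side, i) in excluded:
                continue
            grown = (q[0] | {i}, q[1]) if side == 0 else (q[0], q[1] | {i})
            closed, sub_core = closure(grown)
            tagged = {(0, j) for j in closed[0]} | {(1, j) for j in closed[1]}
            # Prune small cores and closures that reintroduce an excluded item.
            if size(sub_core) >= min_size and not (tagged & excluded):
                yield from enum(closed, sub_core, excluded)
            excluded.add((side, i))  # later branches must not contain i

    top, core = closure((frozenset(), frozenset()))
    if size(core) >= min_size:
        yield from enum(top, core, set())


# Toy data taken from Example 6.
fs = frozenset
I1, I2 = fs("abc"), fs("wx")
D1 = {1: fs("ab"), 2: fs("b")}
D2 = {3: fs("wx"), 4: fs("x")}
GENERATORS = [(fs({1}), fs({4})), (fs({2}), fs({3}))]  # joins give the abstraction


def common(vertices, desc, items):
    """Intersection of the descriptions of `vertices` (all items if empty)."""
    out = items
    for v in vertices:
        out &= desc[v]
    return out


def closure(q):
    """Return (int(p(ext(q))), p(ext(q))) for the bipattern q."""
    e1 = fs(v for v, d in D1.items() if q[0] <= d)
    e2 = fs(v for v, d in D2.items() if q[1] <= d)
    c1, c2 = fs(), fs()
    for a1, a2 in GENERATORS:  # interior: join of the elements below (e1, e2)
        if a1 <= e1 and a2 <= e2:
            c1, c2 = c1 | a1, c2 | a2
    return (common(c1, D1, I1), common(c2, D2, I2)), (c1, c2)


result = [("".join(sorted(q1)), "".join(sorted(q2)))
          for (q1, q2), _ in enumerate_bipatterns(closure, I1, I2)]
print(result)  # [('b', 'x'), ('ab', 'x'), ('b', 'wx')]
```

Passing `closure` as a parameter keeps the enumerator independent of the core definition, so a graph-based interior could be plugged in without touching the search.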
Experiments
The first experiment concerns an original twomode network, the second concerns a wellknown directed social network available on the minerLC web page while the third one is an attributed undirected bibliographical network. The actual implementation, as part of the minerLC suite, relies on a preprocessing of the dataset that transforms the original network into a new network. Closed bipatterns are then represented as single patterns whose items are prefixed by a role. Note that in this section there is no comparison with other programs or methods, as the task of bipattern mining is new as far as we know. However regarding the second dataset, we display a single pattern core subgraph, obtained in a previous work, together with a bipattern core subgraph sharing some nodes with the former.
ha BHA bipatterns in a twomode network of epistemological data
We are currently investigating a twomode network concerning data related to a MNHNIRD program (called MUSORSTOM then Tropical DeepSea Benthos) of expeditions exploring the deepsea in the IndoWest Pacific region since 1976 (Bary 2018). In this network 596 edges relate 74 campaigns (V_{1}) to 268 participants (V_{2}). Campaigns are described by their date and location, the type of fishing gear (dredge, trawl), the objectives of the campaign as well as the species described during the campaign. Regarding participants, the attributes concern the location of the institution they belong to, their scientific domain as well as bibliometrics. We have in particular searched for biconcepts associated with 34 HA cores (subnetworks with participants in at least 3 campaigns with at least 4 participants in these campaigns). As an illustration Fig. 5 displays the respective 34 BHAcores S=(S_{1},S_{2}) and \(S^{\prime }=\left (S^{\prime }_{1},S^{\prime }_{2}\right)\) of two bipatterns q and q^{′}. The corresponding core subgraphs contain respectively |S_{1}|+|S_{2}|=80 vertices and \(|S^{\prime }_{1}|+|S^{\prime }_{2}|=76\) vertices. Vertices are displayed at their original position in the whole network according to a standard force directed drawing (Kobourov 2013). The differences between the extents are mainly in the left part of the network, i.e. the part that corresponds to campaigns before 2000, which means that the differences concern campaigns and participants which are strongly related within the original network.
ha BHA bipatterns in a Lawyer Advice directed network
This dataset concerns a network study of a corporate law partnership that was carried out from 1988 to 1991 in New England (Lazega 2001). It concerns 71 attorneys (partners and associates). The vertices 1 to 36 represent partners while vertices 37 to 71 represent associates, i.e. attorneys with a lower position in the firm. In the Advice network^{Footnote 3}, each attorney is described using various attributes, and 892 directed edges xy relate attorney x who goes to attorney y for basic professional advice. This network was investigated in Soldano et al. (2017b) applying the abstract closed pattern methodology using the HAcore definition. We use here the same attributed network as found on the minerLC web page (see above).
There may be many bipatterns when considering a single mode network, as their number is quadratic in the number of single patterns in the same network. We will focus on bipatterns associated with cores which are unlikely to appear as cores of single patterns. In this way, bipattern analysis is complementary to single pattern analysis. For that purpose we define the homogeneity of a bipattern as the Jaccard similarity of its components' support sets. Homogeneity is then 1 when q_{1}=q_{2} and 0 when q_{1} and q_{2} never both occur in the same vertex. We will then select bipatterns with low homogeneity.
Definition 9
(Homogeneity of a bipattern q=(q_{1},q_{2}))
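Homogeneity as described in the text, i.e. the Jaccard similarity of the support sets of the two components, can be sketched as follows. This is only an illustration under that reading: the names `support` and `homogeneity`, the toy vertex descriptions and the convention of returning 0 when both supports are empty are assumptions, not taken from the article.

```python
def support(pattern, descriptions):
    """Vertices whose description contains every item of the pattern."""
    return {v for v, items in descriptions.items() if pattern <= items}


def homogeneity(q1, q2, descriptions):
    """Jaccard similarity of the support sets of q1 and q2."""
    e1, e2 = support(q1, descriptions), support(q2, descriptions)
    union = e1 | e2
    if not union:
        return 0.0  # assumed convention when neither component occurs
    return len(e1 & e2) / len(union)


# Hypothetical single-mode network with four described vertices.
descriptions = {
    "v1": {"young", "junior"},
    "v2": {"young", "senior"},
    "v3": {"old", "senior"},
    "v4": {"old", "senior"},
}

h_same = homogeneity({"senior"}, {"senior"}, descriptions)
h_disjoint = homogeneity({"young"}, {"old"}, descriptions)
h_partial = homogeneity({"young"}, {"senior"}, descriptions)
print(h_same, h_disjoint, h_partial)  # 1.0 0.0 0.25
```

The first two values reproduce the limit cases stated above: identical components give 1, and components that never occur in the same vertex give 0.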
We apply our bipattern methodology using the 99 BHAcore which corresponds to a 99 HAcore as far as we have equal input vertex subsets W_{1}=W_{2}=W (see Proposition 8). As an example, we consider the following closed bipattern q=(q_{1},q_{2}) where
q_{1}={ 25<Age≤50, Seniority ≤ 25} and
q_{2}={ 30<Age≤65,5<Seniority}.
This bipattern is the abstract closed bipattern with the least homogeneity among the 82 abstract closed bipatterns. It represents a group of young lawyers seeking advice from older lawyers who have been in the firm for more than five years. We observe that 68 of the 71 vertices of the whole advice network satisfy what is common to q_{1} and q_{2}, i.e. satisfy q_{1}∩q_{2}={ 25<Age≤65}. Only 24 vertices among these 68 satisfy both patterns q_{1} and q_{2} resulting in homogeneity h(q)=0.368. The 99 BHAcore subgraph of q is displayed in Fig. 6. It is made of 33 vertices, 13 of which are in both the H and A vertex subsets. Note that the 99 HA core associated with the single abstract closed pattern { 25<Age≤65} is much larger: it contains 50 vertices with |H∩A|=23 and is also the 99 HAcore of the whole graph.
We also experimented with a weaker 66 BHAcore abstraction, resulting in 32010 abstract closed bipatterns among which 262 have homogeneity less than 0.1. There were in particular 7 bipatterns with null homogeneity, one of which represents lawyers from Boston whose law domain is litigation. In this bipattern 7 associate lawyers with age between 26 and 45 and seniority no more than 5 years go for advice to 7 older lawyers (both partners and associates) with age between 31 and 60 and seniority more than 6. The associated core subgraph is displayed on the right part of Fig. 7. This bipattern reflects the composition and cohesion of one of the relatively stable teams of lawyers on the litigation side in this Boston office. It shows the very special proximity in this team between, on the one hand, Partners 13, 21, 24 and 26 as well as senior Associates 38, 39 and 40 (in red) and on the other hand the more junior Associates (in blue) who seek advice from the former. A single Pattern 44 BHAcore, previously discussed in Soldano et al. (2017b), is displayed on the left of Fig. 7 and identifies an even stronger tie between these Partners and senior Associate 40 who, in 1991, was sought out for advice by the Partners themselves in breach of the unspoken status rule related to advice seeking (’You do not seek advice from others lower in the social pecking order’). In [13, page 107], blockmodelling clustered Associates 38 and 40 in these Partners’ position (Position One) as structurally equivalent to them, an exceptional status heterogeneity. A year later, still as exceptionally, Associate 40 (male) was made partner. More senior Associates 38 and 39 (both female) had to wait for longer (Associate 38 made it to partnership two years later). Based on the up or out rule, Associate 39 (who was not part of Position One to begin with) had to leave the firm. Inspection of this pattern and bipattern thus captures a very real process.
Finally we conducted experiments involving 44 BHAcores resulting in 293 490 bipatterns, found in a few minutes^{Footnote 4}, to be compared to the 930 single patterns observed in Soldano et al. (2017b).
StarSatellite bipatterns in a bibliographical network
We also investigated the coauthoring network DBLP.E extracted from the DBLP database. DBLP.E is part of a family of networks used in various experiments on graph mining (Galbrun et al. 2014). To build the vertex descriptions, first the terms in the titles of the author’s articles were gathered and stemmed. Stopwords as well as terms that occur with more than 60% of the authors were then removed. Finally, each researcher is labelled by the terms whose occurrence count is higher than one percent of the total volume of terms for that researcher. The network is the egonetwork of radius 2 of coauthors of George Karypis and has 721 authors connected by 1427 undirected coauthoring links. The maximum vertex degree is 68 and the average vertex degree is 3.95. Each vertex is described by a subset of labels among a set of 2782 labels and the average vertex description size is 23.9. We experimented with bipattern mining using 20StarSatellites cores. The core of the whole network is made of 17 stars among a total of 589 nodes in the core. Fig. 8 displays this core subgraph, in which blue nodes represent stars and red nodes represent satellites. Note that all blue nodes are also red nodes. This means that any star, i.e. an author with at least 20 coauthors, is also a satellite of, i.e. is connected to, another star.
We obtained 214 bipatterns among which we found in particular bipatterns representing single stars with all their satellites. Most of these bipatterns have the form (d(s),∅) where d(s) is the description of the star s and in which the satellites have no common label. When considering homogeneity as defined above, these single star bipatterns have low homogeneity. We also found bipatterns made of a single star with null homogeneity, meaning that the coauthors of this single star in the core subgraph have at least one common label they do not share with the star. Fig. 9 displays two such bipatterns sharing the same single star.
With low homogeneity we also have a bipattern representing a pair of coauthors, namely Jianyong Wang and Lizhu Zhou, who are both stars and satellites (since an edge relates them). Such a bipattern represents a close cooperation between two senior researchers. Conversely, we have a bipattern with two unconnected stars, namely Mohammed J. Zaki and Jianyong Wang, who share labels {cluster,data,databas,efficy,frequ,graph,mine,pattern} but no satellites, thus suggesting some competition on close subjects. The corresponding core subgraphs are displayed in Fig. 10.
Scalability
First note that we did not use any constraint on the size of the cores, i.e. we considered s=1 as a minimum size threshold. This is a rather general situation: the topological constraint associated with the core definition allows a better exploration of pattern occurrences, since strengthening the constraint, i.e. increasing h or a, decreases the number of closed patterns and therefore makes it possible to find infrequent patterns. Now, the first two networks in our experiments are rather small and dense networks whose vertices have a detailed description. Scalability of the enumeration depends on the cost of core computation as well as on the number of bipatterns to output. Core computation is efficient provided that the logical property P only depends on the neighbours of the considered vertex (Batagelj and Zaversnik 2011), and has been performed on very large networks. Regarding the closed pattern enumeration, our algorithm is based on an efficient topdown general algorithm (Boley et al. 2010) and the implementation uses data reduction techniques borrowed from Negrevergne et al. (2013). However the scalability, as mentioned above, depends on the number of bipatterns to generate. This number depends on the size of the pattern language, and bipattern mining means a pattern space whose size is the product of the sizes of the single pattern spaces. Note that though the vertices of the undirected network ICDM_E are described in a large language, each vertex is described with a small number of terms. As a consequence the number of bipatterns with different cores is limited and the enumeration stops after a few minutes (namely 470 s). We still have to experiment with bipattern mining on large attributed networks of hundreds of thousands of nodes and edges. The ICDM_E case shows that, provided that we consider strong enough core definitions, we may investigate a large network in a reasonable time.
In the general case there may be a large number of bipatterns to investigate (see, for instance, the 44 BHA experiments at the end of “ha BHA bipatterns in a Lawyer Advice directed network” section). Only considering, as a postprocessing, bipatterns with low homogeneity reduces the number of patterns to examine, while selecting unexpected patterns, adapting the method from Soldano et al. (2017b), should also be efficient. Finally, in order to present to domain experts a limited number of interesting patterns, we still need some way, such as the Minimum Description Length pattern selection scheme (see for instance Spyropoulou et al. (2014)), to sample among bipatterns associated with similar cores.
Summary
Table 1 summarizes the various two component cores used in the bipattern mining problems we have investigated. The definitions are very close but concern different types of networks. More core definitions are obviously possible provided that the monotonicity of the property pair, as defined in Definition 4, is satisfied. For the sake of simplicity the BHA and StSa cores have been defined using subgraphs induced by vertex subset pairs, according to Definition 3. However, this is not mandatory and could preclude some interesting core definitions. For instance, core definitions designed to constrain some coreperiphery structure should also take into account edges relating nodes within one of the vertex subsets of the pair.
Conclusion
In this article we have extended the core closed pattern methodology in order to address twomode attributed networks. For such networks there was no methodology, to the best of our knowledge, to extract subnetworks according to constraints on both topology and attributes. For that purpose, we have first extended the core notion: a core subgraph is now induced from a pair of vertex subsets. In each vertex subset the nodes have to satisfy an associated topological property. We may then start from any vertex subset pair and reduce this subset pair to its core, according to this new definition. We have then defined a bipattern as a pattern pair each component of which selects a vertex subset. This leads to the definition of core closed bipattern mining, which is a new and natural way to investigate attributed twomode networks: each component of a bipattern selects the nodes associated with a mode. We have also provided efficient algorithms to extract cores and enumerate core closed bipatterns.
Closed bipattern mining as defined here may be applied to single mode networks when considering nodes separately according to two different roles. In directed networks we may then straightforwardly consider the in and out roles of nodes. In undirected networks we may still apply bipattern mining provided that the core definition relies on two different roles, as exemplified when introducing the starsatellite core. In these single mode networks bipattern mining makes it possible to extract information which is not accessible using standard pattern mining: we may rank or select bipatterns with low homogeneity, i.e. whose components select vertex subsets with a limited or null overlap. This makes it possible, for instance, to extract bipatterns representing young lawyers asking older lawyers for advice, or representing a group of coauthors made of senior researchers sharing a large list of keywords together with a set of junior coauthors who share few or no keywords.
It should be emphasized i) that the results and definitions presented in this article may be extended to multiple patterns, i.e. tuples rather than pairs, and therefore to the analysis of multi mode or multi role networks, and ii) that by using appropriate core and multipattern definitions, the methodology may also be extended to multiplex networks, i.e. basically to address general linked data. For instance, the core of a multiplex network may be obtained in the same way as the BHA core of a directed network: as edges have a type we may associate a node degree with each edge type, associate a role with each edge type and require nodes to have a sufficient degree to belong to the corresponding role component in the core. We could then investigate, for instance, gene regulation networks by considering two different types of regulation: a regulator may either increase or decrease the gene expression. Note that in this case edges have both a direction and a type. There is no technical difficulty in defining appropriate cores in such situations, but of course core definitions as well as multipattern definitions should be accurately designed according to the questions we intend to investigate: we may or may not be interested in the direction, depending on the specific biological question we consider.
Appendix 1: Notations, Definitions and Proofs
Table 2 summarizes the main notations regarding bipattern mining on attributed graphs.
Closed bipatterns are ordered in a biconcept lattice whose definition relies, as the concept lattice definition, on the Galois connection between an extensional and an intensional space. We denote both order relations by the set theory inclusion symbols.
Definition 10
Let (L,⊆) and (X,⊆) be two lattices. Let int and ext be two maps defined on X and L by
int: X→L
ext: L→X
and such that:
C1 ∀e,e^{′}∈X, e⊆e^{′} implies int(e)⊇int(e^{′})
C2 ∀c,c^{′}∈L, c⊆c^{′} implies ext(c)⊇ext(c^{′})
C3 ∀c∈L, c⊆int(ext(c)), and ∀e∈X, e⊆ext(int(e))
Then (int,ext) defines a Galois connection on (X,L).
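Conditions C1 to C3 can be checked by brute force on a small context. The Python sketch below does so for the usual support and intersection maps on the three objects of Example 5 in Appendix 2, which are used here only as a convenient toy case. The helper names are hypothetical, and `intent` stands for int because `int` is a Python builtin.

```python
from itertools import combinations

ITEMS = frozenset("abcd")
DESC = {1: frozenset("a"), 2: frozenset("ab"), 3: frozenset("abc")}


def powerset(elements):
    """All subsets of `elements`, as frozensets."""
    elements = sorted(elements)
    return [frozenset(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


def ext(c):
    """Objects whose description contains the pattern c."""
    return frozenset(v for v, d in DESC.items() if c <= d)


def intent(e):
    """Items shared by all objects of e (all items when e is empty)."""
    shared = ITEMS
    for v in e:
        shared &= DESC[v]
    return shared


X = powerset(DESC)   # extensional space: subsets of objects
L = powerset(ITEMS)  # intensional space: subsets of items

c1 = all(intent(e) >= intent(e2) for e in X for e2 in X if e <= e2)
c2 = all(ext(c) >= ext(c2) for c in L for c2 in L if c <= c2)
c3 = (all(c <= intent(ext(c)) for c in L)
      and all(e <= ext(intent(e)) for e in X))
print(c1, c2, c3)  # True True True
```

The same check applies componentwise to pairs, which is the situation addressed for bipatterns.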
Proposition 5 is then straightforward according to the componentwise definition of the orders on pairs X=(X_{1},X_{2}) and L=(L_{1},L_{2}).
Note that in closed pattern mining the Galois Connection definition is not always mentioned as such since results focus on the closure operator on the pattern language. Still, it is a simple way using Propositions 5 and 2 to obtain abstract closed bipatterns as well as their partial ordering.
The proof of Proposition 6 is also straightforward:
Proof
Let (P_{1},P_{2}) be a pair of monotone properties, and (W_{1},W_{2}) be a subset pair of (V_{1},V_{2}). Then there exists a greatest subset pair (S_{1},S_{2})≤(W_{1},W_{2}) such that P_{1}(v_{1},S_{1},S_{2}) holds for all elements v_{1} of S_{1} and P_{2}(v_{2},S_{1},S_{2}) holds for all elements v_{2} of S_{2}.
As we consider the finite case, there are maximal subset pairs such that the required condition (referred to as C) is satisfied. We will assume that there are two maximal pairs (S_{1},S_{2}) and \(\left (S_{1}^{\prime },S_{2}^{\prime }\right) \) that satisfy C. i) This means that for any element v of S_{1} we have that P_{1}(v,S_{1},S_{2}) holds, and as P_{1} is monotone we also have that \(P_{1}\left (v,S_{1}\cup S_{1}^{\prime },S_{2}\cup S_{2}^{\prime }\right)\) holds. In the same way, for any element v of \(S_{1}^{\prime }\) we have that \(P_{1}\left (v,S_{1}\cup S_{1}^{\prime },S_{2}\cup S_{2}^{\prime }\right)\) also holds. This means that for any element v of \(S_{1} \cup S_{1}^{\prime }\) we have that \(P_{1}\left (v,S_{1}\cup S_{1}^{\prime },S_{2}\cup S_{2}^{\prime }\right)\) holds. ii) The same reasoning regarding \(S_{2}, S_{2}^{\prime }\) and P_{2} shows that for any element v of \(S_{2} \cup S_{2}^{\prime }\) we have that \(P_{2}\left (v,S_{1}\cup S_{1}^{\prime },S_{2}\cup S_{2}^{\prime }\right)\) holds. From i) and ii) we conclude that \(\left (S_{1}\cup S_{1}^{\prime }, S_{2} \cup S_{2}^{\prime }\right)\) satisfies condition C, and is therefore greater than or equal to both (S_{1},S_{2}) and \(\left (S_{1}^{\prime },S_{2}^{\prime }\right)\). As both pairs are maximal subset pairs satisfying C, this means that \(S_{1}=S_{1}^{\prime }\) and \(S_{2}=S_{2}^{\prime }\). □
Appendix 2: Examples of abstract closed pattern and bipattern mining
In this section, we exemplify abstract closed pattern mining discussed in “Abstract closed pattern mining and concept lattices” section and abstract closed bipattern mining presented in “Biconcept lattices and abstract closed bipatterns” section. We first note a useful one to one correspondence between interior operators on a lattice and their ranges (see Blyth (2005) for the dual result on closure operators):
Proposition 9
Let X be a complete lattice. A subset A of X is the range of an interior operator on X if and only if A is closed under join. The interior operator f:X→X is then unique and defined as \(f(x)=\bigvee \{a\in A \mid a\leq x\}\).
We further call A an abstraction of X, hence we may define abstract concept lattices through interior operators as well as through abstractions. By “A is closed under join” we mean that the join of any subset {W_{1},…,W_{n}} of A, including the empty subset ∅, belongs to A. In the bipattern case, X is a pair \(\left (2^{V_{1}},2^{V_{2}}\right)\) of powersets and an element W of A is a pair of object subsets.
We give now a simple example of abstract closed pattern mining.
Example 5
We exemplify the closure operator f=int∘ext returning closed patterns in the standard closed itemset mining case. We further write subsets as strings, i.e. 12 stands for {1,2}. Patterns are subsets of I={a,b,c,d}, and objects in V={1,2,3} are described as d[V]={a,ab,abc}, i.e. d(1)=a, d(2)=ab and d(3)=abc. We have then ext(b)=23 and as a consequence, f(b)=d(2)∩d(3)=ab∩abc=ab, f(abc)=d(3)=abc and f(d)=abcd. The latter closure means that d is in the set of patterns with empty support set, whose greatest element is abcd.
Now, to exemplify abstract closed patterns, we consider the operator p on 2^{V} such that p(e)=e except for singletons whose images are the empty set: p(1)=p(2)=p(3)=∅. It is straightforward following Definition 1 that p is an interior operator and as a consequence of Proposition 2, f=int∘p∘ext is a closure operator. As we have p∘ext(ab)=p(23)=23, we obtain that f(ab)=int(23)=ab as in the nonabstract case. However p∘ext(abc)=p(3)=∅ and now f(abc)=abcd is the greatest element with empty abstract support set.
The corresponding abstraction A=p[2^{123}] is generated by union closure of the size 2 subsets {12,23,13}, and it is straightforward that for any e, p(e) is the greatest element of A smaller than or equal to e. For instance, p(12)=12 as 12 belongs to A while p(1)=∅ as no element of A except ∅ is included in subset 1.
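The closures of Example 5 can be recomputed with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and `intent` stands for int because `int` is a Python builtin.

```python
ITEMS = frozenset("abcd")
DESC = {1: frozenset("a"), 2: frozenset("ab"), 3: frozenset("abc")}


def ext(q):
    """Support set of the pattern q."""
    return frozenset(v for v, d in DESC.items() if q <= d)


def intent(e):
    """Items shared by all objects of e (all items when e is empty)."""
    shared = ITEMS
    for v in e:
        shared &= DESC[v]
    return shared


def p(e):
    """Interior operator of the example: singletons are mapped to the empty set."""
    return e if len(e) >= 2 else frozenset()


def f(q):
    """Standard closure int∘ext, with q given as a string of items."""
    return "".join(sorted(intent(ext(frozenset(q)))))


def f_abs(q):
    """Abstract closure int∘p∘ext, with q given as a string of items."""
    return "".join(sorted(intent(p(ext(frozenset(q))))))


print(f("b"), f("abc"), f("d"))   # ab abc abcd
print(f_abs("ab"), f_abs("abc"))  # ab abcd
```

The only pattern whose closure changes between the two operators in these calls is abc, whose support set 3 is a singleton and is therefore emptied by p.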
We provide hereunder an example of closed bipattern mining that makes use of Proposition 9 to represent the interior operator.
Example 6
Let V_{1}={1,2} and V_{2}={3,4} be two object sets and \(\phantom {\dot {i}\!}X_{1}=2^{V_{1}}=\{\emptyset, 1,2,12\}\) while \(\phantom {\dot {i}\!}X_{2}=2^{V_{2}} =\{\emptyset, 3,4,34\}\). Objects of V_{1} are labelled by subsets of I_{1}={a,b,c} while objects of V_{2} are labelled by subsets of I_{2}={w,x}. The descriptions of the objects from V_{1} and V_{2} respectively as subsets of I_{1} and I_{2} are as follows:

d_{1}(1)=ab,d_{1}(2)=b,d_{2}(3)=wx,d_{2}(4)=x
Consider the abstraction {(∅,∅),(1,4),(2,3),(12,34)} and the associated interior operator p. Now, we have that

p(12,34)=(12,34),

p(1,34)=(1,4) and int(1,4)=(ab,x)

p(12,3)=(2,3) and int(2,3)=(b,wx)

p(1,3)=p(∅,3)=(∅,∅) and int(∅,∅)=(abc,wx)
We then obtain the abstract biconcept lattice displayed in Fig. 11. The set of abstract closed bipatterns with extent different from (∅,∅) is then {(b,x),(ab,x),(b,wx)}.
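Example 6 is small enough to be checked exhaustively. The Python sketch below builds the interior operator p from the abstraction as in Proposition 9 (join of the abstraction elements below the argument), recomputes the values of p listed above, and collects the closures of all 32 bipatterns. Helper names are hypothetical, and `intent` stands for int.

```python
from itertools import combinations, product

fs = frozenset
I1, I2 = fs("abc"), fs("wx")
D1 = {1: fs("ab"), 2: fs("b")}
D2 = {3: fs("wx"), 4: fs("x")}
ABSTRACTION = [(fs(), fs()), (fs({1}), fs({4})),
               (fs({2}), fs({3})), (fs({1, 2}), fs({3, 4}))]


def powerset(elements):
    elements = sorted(elements)
    return [fs(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


def ext(q):
    """Componentwise support sets of the bipattern q."""
    return (fs(v for v, d in D1.items() if q[0] <= d),
            fs(v for v, d in D2.items() if q[1] <= d))


def common(vertices, desc, items):
    out = items
    for v in vertices:
        out &= desc[v]
    return out


def intent(e):
    """Componentwise intersection of descriptions."""
    return common(e[0], D1, I1), common(e[1], D2, I2)


def p(e):
    """Interior of e: join of all abstraction elements below e."""
    s1, s2 = fs(), fs()
    for a1, a2 in ABSTRACTION:
        if a1 <= e[0] and a2 <= e[1]:
            s1, s2 = s1 | a1, s2 | a2
    return s1, s2


def show(pair):
    """Write a pair of sets in the string notation of the examples."""
    return tuple("".join(sorted(map(str, part))) for part in pair)


print(show(p((fs({1}), fs({3, 4})))))  # ('1', '4')
print(show(p((fs({1, 2}), fs({3})))))  # ('2', '3')
print(show(intent((fs(), fs()))))      # ('abc', 'wx')

closed = set()
for q1, q2 in product(powerset(I1), powerset(I2)):
    core = p(ext((q1, q2)))
    if core[0] or core[1]:             # keep non-empty abstract extents only
        closed.add(show(intent(core)))
print(sorted(closed))  # [('ab', 'x'), ('b', 'wx'), ('b', 'x')]
```

Exhaustive enumeration is only viable for such toy data, as the number of bipatterns is the product of the sizes of the two pattern spaces.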
Appendix 3: Supplementary details on experimental results
Notes
 1.
 2. Galois connections are defined in Appendix 1
 3. Available at: https://www.stats.ox.ac.uk/~snijders/siena/Lazega_lawyers_data.htm
 4. 673 s on a 4core 2.2 GHz Intel Core i7 computer
References
Atzmueller, M, Doerfel S, Mitzlaff F (2016) DescriptionOriented Community Detection using Exhaustive Subgroup Discovery. Inf Sci 329:965–984.
Baroni, A, Conte A, Patrignani M, Ruggieri S (2017) Efficiently clustering very large attributed graphs. In: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM ’17, 369–376. ACM, New York.
Bary, S (2018) Scientific representations of biodiversity in the deepsea : an epistemologic and scientific approach. PhD thesis, Ecole Doctorale numéro 474, Sorbonne Paris Cité. Defended October 10th 2018.
Batagelj, V, Zaversnik M (2011) Fast algorithms for determining (generalized) core groups in social networks. Adv Data Anal Classif 5(2):129–145.
Blyth, TS (2005) Lattices and Ordered Algebraic Structures. Universitext. Springer.
Boley, M, Horváth T, Poigné A, Wrobel S (2010) Listing closed sets of strongly accessible set systems with applications to data mining. Theor Comput Sci 411(3):691–700.
Borgatti, SP, Everett MG (1997) Network analysis of 2mode data. Soc Netw 19(3):243–269.
Borgatti, SP, Everett MG (2000) Models of core/periphery structures. Soc Netw 21(4):375–395.
Cerinsek, M, Batagelj V (2015) Generalized twomode cores. Soc Netw 42:80–87.
Combe, D, Largeron C, Géry M, EgyedZsigmond E (2015) ILouvain: An attributed graph clustering method. In: Fromont E, De Bie T, van Leeuwen M (eds) Advances in Intelligent Data Analysis XIV. IDA 2015. Lecture Notes in Computer Science, vol. 9385, 181–192. Springer, Cham.
Galbrun, E, Gionis A, Tatti N (2014) Overlapping community detection in labeled graphs. Data Min Knowl Discov 28(56):1586–1610.
Gao, H, Huang H (2018) Deep attributed network embedding. In: Proceedings of the TwentySeventh International Joint Conference on Artificial Intelligence, IJCAI18, 3364–3370. International Joint Conferences on Artificial Intelligence Organization.
Giatsidis, C, Thilikos DM, Vazirgiannis M (2013) Dcores: measuring collaboration of directed graphs based on degeneracy. Knowl Inf Syst 35(2):311–343.
Kleinberg, JM (1999) Authoritative sources in a hyperlinked environment. J ACM (JACM) 46(5):604–632.
Kobourov, SG (2013) Forcedirected drawing algorithms. In: Tamassia R (ed) Handbook on Graph Drawing and Visualization, 383–408. CRC Press.
Lazega, E (2001) The Collegial Phenomenon: The Social Mechanisms of Cooperation Among Peers in a Corporate Law Partnership. Oxford University Press.
Mougel, PN, Rigotti C, Gandrillon O (2012) Finding collections of kclique percolated components in attributed graphs. In: Tan PN, Chawla S, Ho CK, Bailey J (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol. 7302, 181–192. Springer, Berlin.
Negrevergne, B, Termier A, Rousset MC, Méhaut JF (2013) Paraminer: a generic pattern mining algorithm for multicore architectures. Data Min Knowl Discov 28:593–633.
Pernelle, N, Rousset MC, Soldano H, Ventos V (2002) Zoom: a nested Galois latticesbased system for conceptual clustering. J Exp Theor Artif Intell 2/3(14):157–187.
Sánchez, PI, Müller E, Korn UL, Böhm K, Kappes A, Hartmann T, Wagner D (2015) Efficient algorithms for a robust modularitydriven clustering of attributed graphs. In: Venkatasubramanian S, Ye J (eds) Proceedings of the 2015 SIAM International Conference on Data Mining, Vancouver, BC, Canada, April 30 - May 2, 2015, 100–108. SIAM.
Seidman, SB (1983) Network structure and minimum degree. Soc Netw 5:269–287.
Silva, A, Meira Jr W, Zaki MJ (2012) Mining attributestructure correlated patterns in large attributed graphs. Proc VLDB Endow 5(5):466–477.
Soldano, H, Santini G (2014) Graph abstraction for closed pattern mining in attributed networks. In: Schaub T, Friedrich G, O’Sullivan B (eds) European Conference in Artificial Intelligence (ECAI). Frontiers in Artificial Intelligence and Applications, vol. 263, 849–854. IOS Press.
Soldano, H, Ventos V (2011) Abstract Concept Lattices. In: Valtchev P, Jäschke R (eds) Formal Concept Analysis. ICFCA 2011. Lecture Notes in Computer Science, vol. 6628, 235–250. Springer, Heidelberg.
Soldano, H, Santini G, Bouthinon D (2017a) Formal concept analysis of attributed networks. In: Missaoui R, Obiedkov S, Kuznetsov S (eds) Formal Concept Analysis of Social Networks. Lecture Notes in Social Networks, 143–170. Springer, Cham.
Soldano, H, Santini G, Bouthinon D, Lazega E (2017b) Hubauthority cores and attributed directed network mining. In: International Conference on Tools with Artificial Intelligence (ICTAI). IEEE Computer Society, Boston.
Soldano, H, Santini G, Bouthinon D, Bary S, Lazega E (2018) Bipattern mining of two mode and directed networks. In: WWW (Companion Volume), 1287–1294. ACM.
Spyropoulou, E, Bie TD, Boley M (2014) Interesting pattern mining in multirelational data. Data Min Knowl Discov 28(3):808–849.
Xu, Z, Ke Y, Wang Y, Cheng H, Cheng J (2012) A modelbased approach to attributed graph clustering. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, SIGMOD ’12, 505–516. ACM, New York.
Yang, J, McAuley J, Leskovec J (2013) Community detection in networks with node attributes. In: 2013 IEEE 13th International Conference on Data Mining, 1151–1156. IEEE Computer Society.
Acknowledgments
Not Applicable.
Funding
This research has received funding from the Project Chistera Adalab (ANR14CHR2000104).
Author information
Contributions
All authors equally contributed to the whole manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Henry Soldano.
Ethics declarations
Ethics approval and consent to participate
Not Applicable.
Consent for publication
Not Applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Availability of data and materials
The datasets and program sources are available at https://lipn.univparis13.fr/MinerLC/ under a particular “Submission to Applied NetWork Science” section. Datasets are given in the format required by MinerLC.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Closed pattern mining
 Core subgraph
 Attributed network
 Twomode network
 Directed network