The directed chained graphs introduced in this section generalize the notion of undirected chained graphs defined in Concas et al. (2021).
Definition 1
A directed graph \({{\mathcal {G}}}= \{{{\mathcal {V}}},{{\mathcal {E}}}\}\) is said to be directed \(\ell\)-chained, with initial vertex \(v_i\), if the set of vertices can be subdivided into \(\ell\) disjoint non-empty subsets \({{\mathcal {V}}}_1,{{\mathcal {V}}}_2,\ldots ,{{\mathcal {V}}}_\ell\), see (2.2), such that \(v_i\in {{\mathcal {V}}}_1\) and all edges from vertices in the set \({{\mathcal {V}}}_j\) point to vertices in the set \({{\mathcal {V}}}_{j+1}\) for \(j=1,2,\ldots ,\ell -1\), where the chain length \(\ell\) is the largest number of vertex subsets \({{\mathcal {V}}}_j\) with this property. The vertex subset \({{\mathcal {V}}}_{j+1}\) is said to be adjacent to the vertex set \({{\mathcal {V}}}_j\).
The chain length \(\ell\) of a directed \(\ell\)-chained graph may depend on the choice of the initial vertex \(v_i\). After a suitable permutation of the nodes, the adjacency matrix A of a directed \(\ell\)-chained graph \({{\mathcal {G}}}= \{{{\mathcal {V}}},{{\mathcal {E}}}\}\) becomes upper block bidiagonal with zero diagonal blocks,
$$\begin{aligned} A = \begin{bmatrix} O & A_1 \\ & O & A_2 \\ & & O & A_3 \\ & & & \ddots & \ddots \\ & & & & O & A_{\ell -1} \\ & & & & & O \end{bmatrix}, \end{aligned} \tag{3.1}$$
where the submatrix \(A_i\in {{\mathbb {R}}}^{n_i\times n_{i+1}}\) describes the connections from vertices in \({{\mathcal {V}}}_i\) to vertices in \({{\mathcal {V}}}_{i+1}\), for \(i=1,2,\ldots ,\ell -1\).
Example 3.1
Consider the graph of Fig. 1. This is a directed 3-chained graph with the chained node sets \({{\mathcal {V}}}_1=\{v_1,v_2\}\), \({{\mathcal {V}}}_2=\{v_3\}\), and \({{\mathcal {V}}}_3=\{v_4\}\). The initial node can be chosen to be either \(v_1\) or \(v_2\). The adjacency matrix is
$$\begin{aligned} A=\left[ \begin{array}{cccc} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right] , \end{aligned}$$
where we can choose the submatrices
$$\begin{aligned} A_1=\left[ \begin{array}{c} 1 \\ 1 \end{array}\right] \in {{\mathbb {R}}}^{2\times 1},\qquad A_2=\left[ \begin{array}{c} 1 \end{array}\right] \in {{\mathbb {R}}}^{1\times 1}. \end{aligned}$$
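The assembly of the matrix (3.1) from its blocks can be checked numerically. The following Python sketch, which assumes NumPy, builds A from the blocks of Example 3.1. The function name `chained_adjacency` is hypothetical, and zero-based indices are used, so that \(v_1\) corresponds to index 0.

```python
import numpy as np

def chained_adjacency(blocks):
    """Assemble the upper block bidiagonal matrix (3.1) from A_1, ..., A_{l-1}."""
    # Block A_i has shape (n_i, n_{i+1}); recover the subset sizes n_1, ..., n_l.
    sizes = [blocks[0].shape[0]] + [B.shape[1] for B in blocks]
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    A = np.zeros((offsets[-1], offsets[-1]), dtype=int)
    for i, B in enumerate(blocks):
        if B.shape[0] != sizes[i]:
            raise ValueError("consecutive blocks have incompatible shapes")
        # A_i occupies block row i and block column i + 1.
        A[offsets[i]:offsets[i + 1], offsets[i + 1]:offsets[i + 2]] = B
    return A

A1 = np.array([[1], [1]])   # edges from V_1 = {v_1, v_2} to V_2 = {v_3}
A2 = np.array([[1]])        # edge from V_2 = {v_3} to V_3 = {v_4}
A = chained_adjacency([A1, A2])
```

The resulting array agrees entry by entry with the adjacency matrix displayed in Example 3.1.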
Assume that a graph is known to be directed \(\ell\)-chained for some \(\ell \ge 1\), but that the value of \(\ell\) is not known. Moreover, assume that only a permuted version of the matrix (3.1) is available. Thus, the available adjacency matrix is of the form
$$\begin{aligned} {\widetilde{A}} = P A P^T, \end{aligned}$$
where P is a permutation matrix that modifies the vertex ordering. Given the adjacency matrix \({\widetilde{A}}\), we are interested in determining the vertex subsets \({{\mathcal {V}}}_1,{{\mathcal {V}}}_2,\ldots ,{{\mathcal {V}}}_\ell\) in Definition 1, as well as the number of sets \(\ell \ge 1\). Algorithm 1 describes a method for determining whether a directed graph is \(\ell\)-chained and, if so, for partitioning the nodes into subsets. Given an adjacency matrix A of a directed graph, the first node subset \({{\mathcal {V}}}_1\) consists of the nodes with column indices j such that \(A_{ij}=0\) for every row index i, that is, the nodes without incoming edges; see line 1 of the algorithm. The other vertex subsets are then determined one at a time by identifying the blocks in A that describe connections from nodes in the preceding node subset (line 6). If the first vertex set cannot be determined, or if at some step a node turns out to be connected to a vertex in a preceding subset, then the graph is not \(\ell\)-chained. This process gives a constructive proof of the following result.
Proposition 1
Let \({{\mathcal {G}}}=\{{{\mathcal {V}}},{{\mathcal {E}}}\}\) be a directed graph. Then it is possible to detect if it possesses an \(\ell\)-chained structure and determine the number of subsets, \(\ell\), as well as the vertex set partitioning \({{\mathcal {V}}}={{\mathcal {V}}}_1\cup {{\mathcal {V}}}_2\cup \cdots \cup {{\mathcal {V}}}_\ell\).
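The detection procedure can be illustrated by a short Python sketch. It is written from the prose description above and is not a transcription of Algorithm 1. It assumes NumPy, the function name `chained_partition` is hypothetical, and nodes are indexed from zero. Following the description, every node without incoming edges is placed in \({{\mathcal {V}}}_1\).

```python
import numpy as np

def chained_partition(A):
    """Return [V_1, ..., V_l] as lists of node indices, or None if the test fails."""
    A = np.asarray(A) != 0
    level = np.full(A.shape[0], -1)
    # V_1: nodes whose column is zero, i.e. nodes without incoming edges.
    current = np.flatnonzero(~A.any(axis=0))
    if current.size == 0:
        return None
    subsets = []
    while current.size > 0:
        level[current] = len(subsets)
        subsets.append(current.tolist())
        # Candidates for the next subset: all targets of edges leaving the current one.
        nxt = np.flatnonzero(A[current].any(axis=0))
        # An edge into an already assigned node violates the chained structure.
        if np.any(level[nxt] >= 0):
            return None
        current = nxt
    # Nodes that are never reached (for instance, on an isolated cycle) also fail the test.
    if np.any(level < 0):
        return None
    return subsets

# Matrix of Example 3.1, then an arbitrary reordering of its nodes (P A P^T).
A = np.array([[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
perm = [3, 0, 2, 1]                      # illustrative permutation, not from the source
P = np.eye(4, dtype=int)[perm]
A_tilde = P @ A @ P.T
partition = chained_partition(A_tilde)   # [[1, 3], [2], [0]]
```

In the permuted ordering, indices 1 and 3 correspond to \(v_1\) and \(v_2\), index 2 to \(v_3\), and index 0 to \(v_4\), so the sketch recovers the three subsets of Example 3.1 and the chain length \(\ell =3\).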
The definition of directed \(\ell\)-chained graphs is quite restrictive. To be able to discuss properties of a larger set of directed graphs, we relax the requirements of Definition 1 to allow edges from vertices in the vertex subset \({{\mathcal {V}}}_i\) to vertices in the vertex subset \({{\mathcal {V}}}_j\) for some \(j\le i\) with j not much smaller than i.
Definition 2
The directed graph \({{\mathcal {G}}}= \{{{\mathcal {V}}},{{\mathcal {E}}}\}\) is said to be directed \(\{\ell ,k_i\}\)-chained with initial vertex \(v_i\) if it has the chained structure described in Definition 1 with the extension that edges from vertices in the set \({{\mathcal {V}}}_j\) are allowed to point to vertices in the sets \({{\mathcal {V}}}_{\max \{j-k_i,1\}},\ldots ,{{\mathcal {V}}}_j,{{\mathcal {V}}}_{j+1}\) for \(j=1,2,\ldots ,\ell -1\) and some \(k_i\ge 0\). The integer \(k_i\), which we refer to as the lower bandwidth, is the largest integer with this property.
We note that Definition 1 formally corresponds to the value \(k_i=-1\) for all i, which lies just outside the range \(k_i\ge 0\) of Definition 2.
Definition 3
The minimal lower bandwidth, k, of a directed chained graph is defined as
$$\begin{aligned} k=\min _{v_i\in {\overline{{{\mathcal {V}}}}}} k_i, \end{aligned} \tag{3.2}$$
where the minimum is over all initial vertices \(v_i\) in the set \({\overline{{{\mathcal {V}}}}}\subset {{\mathcal {V}}}\) of vertices that give the maximal chain length \(\ell\). When k is the minimal lower bandwidth, the graph is said to be directed \(\{\ell ,k\}\)-chained.
The \(\{\ell ,k\}\)-chained structure is quite general. We conjecture that any weakly connected graph with n nodes is \(\{\ell ,k\}\)-chained for some \(n\ge \ell >k\ge -1\). A small value of k indicates that information in the graph flows in a preferred direction, with little back propagation. This structure can be investigated by means of spanning trees as described in the section “Directed chained graphs and directed spanning trees”.
Example 3.2
Consider the directed graph \({{\mathcal {G}}}\) shown in Fig. 2. It is a directed \(\{5,2\}\)-chained graph with initial vertex \(v_1\). If one removes the edge from vertex \(v_4\) to \(v_2\), the graph becomes a directed \(\{5,1\}\)-chained graph with initial vertex \(v_1\). If one continues by removing the edge from \(v_3\) to \(v_2\), then a directed 5-chained graph with the same initial vertex is obtained.
The adjacency matrix analogous to Eq. (3.1) for a directed \(\{\ell ,k\}\)-chained graph \({{\mathcal {G}}}= \{{{\mathcal {V}}},{{\mathcal {E}}}\}\) can be represented by a lower block Hessenberg matrix
$$\begin{aligned} A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21}& A_{22} & A_{23} \\ \vdots & \vdots & \ddots & \ddots & \\ A_{k+1,1}& \vdots & & \ddots & \ddots \\ & A_{k+2,2} & & & \ddots & \ddots &\\ & & \ddots & & & A_{\ell -1,\ell -1} & A_{\ell -1,\ell } \\ & & & A_{\ell ,\ell -k} & \cdots & A_{\ell ,\ell -1} & A_{\ell ,\ell } \end{bmatrix}, \end{aligned} \tag{3.3}$$
when the nodes are suitably ordered. Here the block \(A_{ij}\) represents edges that point from the vertex subset \({{\mathcal {V}}}_i\) to the vertex subset \({{\mathcal {V}}}_j\). All superdiagonal blocks \(A_{i,i+1}\) are nonvanishing, because if all entries of the block \(A_{i,i+1}\) were zero, then there would be no edges from the vertex subset \({{\mathcal {V}}}_i\) to vertices in the subset \({{\mathcal {V}}}_{i+1}\). But this would contradict the fact that the graph \({{\mathcal {G}}}\) is directed \(\{\ell ,k\}\)-chained.
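When a vertex partition is already known, the lower bandwidth associated with that partition can be read off from the positions of the nonvanishing blocks \(A_{ij}\). The Python sketch below assumes NumPy and a partition supplied by the user. The function name `lower_bandwidth` and the 4-node test graph are hypothetical and are not taken from Fig. 2. The sketch does not minimize over initial vertices as in (3.2).

```python
import numpy as np

def lower_bandwidth(A, subsets):
    """Return the lower bandwidth for a given partition, or None if it is not chained."""
    A = np.asarray(A)
    level = np.empty(A.shape[0], dtype=int)
    for j, nodes in enumerate(subsets):
        level[nodes] = j
    rows, cols = np.nonzero(A)
    # For an edge from V_i to V_j, drop = i - j: -1 on the superdiagonal of (3.3),
    # 0 on the diagonal, and s > 0 for an edge that points s subsets back.
    drop = level[rows] - level[cols]
    if drop.size == 0 or np.any(drop < -1):
        return None  # no edges at all, or an edge from V_i to V_j with j > i + 1
    return int(drop.max())

subsets = [[0], [1], [2], [3]]
A = np.zeros((4, 4), dtype=int)
for u, v in [(0, 1), (1, 2), (2, 3)]:
    A[u, v] = 1
k_forward = lower_bandwidth(A, subsets)   # only superdiagonal blocks: -1
A[3, 1] = 1                               # add an edge that points two subsets back
k_back = lower_bandwidth(A, subsets)      # 2
```

The value \(-1\) returned for the pure path matches the convention that a directed \(\ell\)-chained graph in the sense of Definition 1 has \(k=-1\).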
If the minimal lower bandwidth, defined by Eq. (3.2), is \(k=0\), then there is at least one edge from a node to another node in the same vertex subset. The adjacency matrix corresponding to such a graph is upper block bidiagonal when the nodes are suitably ordered. Similarly, a lower bandwidth \(k=1\) indicates that when the nodes are suitably enumerated, the adjacency matrix can be represented by a block tridiagonal matrix. More generally, a small lower bandwidth (3.2) indicates that there are edges only between vertex subsets \({{\mathcal {V}}}_j\) with close indices.
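To make these two special cases concrete, consider a chain length of \(\ell =4\), a value chosen here only for illustration. The superscripts \((k=0)\) and \((k=1)\) are labels introduced for this display. The matrix (3.3) then takes the forms

$$\begin{aligned} A^{(k=0)} = \begin{bmatrix} A_{11} & A_{12} \\ & A_{22} & A_{23} \\ & & A_{33} & A_{34} \\ & & & A_{44} \end{bmatrix}, \qquad A^{(k=1)} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} & A_{23} \\ & A_{32} & A_{33} & A_{34} \\ & & A_{43} & A_{44} \end{bmatrix}, \end{aligned}$$

where at least one diagonal block \(A_{ii}\) is nonvanishing in the first matrix and at least one subdiagonal block \(A_{i+1,i}\) is nonvanishing in the second one.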
The following result shows that for strongly connected directed \(\{\ell ,k\}\)-chained graphs, directed cycles will be observed if \(k\ge 1\). For semi-connected or weakly connected directed graphs, cycles are not guaranteed to exist.
Proposition 2
Let \({{\mathcal {G}}}=\{{{\mathcal {V}}},{{\mathcal {E}}}\}\) be a strongly connected directed \(\{\ell ,k\}\)-chained graph with vertex partition \({{\mathcal {V}}}={{\mathcal {V}}}_1 \cup \cdots \cup {{\mathcal {V}}}_{\ell }\). Assume there are no edges between vertices belonging to the same vertex set and that \(k\ge 1\). Let \(e_{j,i}\in {{\mathcal {E}}}\) represent a directed edge from vertex \(v_j\) to \(v_i\), where \(v_i\in {{\mathcal {V}}}_i\) and \(v_j\in {{\mathcal {V}}}_{i+s}\) for \(1\le s\le k\). Then there exists at least one directed cycle that starts at \(v_i\), contains the edge \(e_{j,i}\), and ends at \(v_i\). The minimum possible length of such a cycle is \(s+1\).
Proof
Since the graph \({{\mathcal {G}}}\) is strongly connected, there is a directed path from vertex \(v_i\) to \(v_j\). Forward edges only connect a vertex subset to the adjacent one and there are no edges between nodes in the same vertex subset, so such a path has length at least s. A path of this shortest possible length has the form
$$\begin{aligned} v_i\rightarrow v_{i_1} \rightarrow \cdots \rightarrow v_{i_{s-1}} \rightarrow v_j, \end{aligned}$$
where \(v_{i_t}\in {{\mathcal {V}}}_{i+t}\) for \(t=1,2,\ldots ,s-1\). Combining this path with the edge \(e_{j,i}\) determines a directed cycle of length \(s+1\). \(\square\)
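The construction used in the proof can be reproduced with a breadth-first search. The sketch below finds a shortest directed path from \(v_i\) to \(v_j\) and closes it with the edge \(e_{j,i}\). The function name `cycle_through_edge`, the adjacency-list input, and the 4-node test graph with one vertex per subset are hypothetical choices made for illustration.

```python
from collections import deque

def cycle_through_edge(adj, j, i):
    """Return a shortest directed cycle i -> ... -> j -> i that uses the edge j -> i."""
    if i not in adj.get(j, ()):
        raise ValueError("edge j -> i is not in the graph")
    # Breadth-first search from i; parent[] lets us rebuild a shortest path to j.
    parent = {i: None}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if u == j:
            break
        for w in adj.get(u, ()):
            if w not in parent:
                parent[w] = u
                queue.append(w)
    if j not in parent:
        return None  # j is unreachable from i, so the graph is not strongly connected
    path = []
    node = j
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()        # i, ..., j
    return path + [i]     # close the cycle with the edge j -> i

# Hypothetical strongly connected graph with V_1={0}, V_2={1}, V_3={2}, V_4={3}.
adj = {0: [1], 1: [2, 0], 2: [3], 3: [1]}
cycle = cycle_through_edge(adj, j=3, i=1)   # edge from V_4 back to V_2, so s = 2
```

Here the cycle visits the vertices 1, 2, 3, 1 and has \(s+1=3\) edges. In general the length of the returned cycle is the breadth-first distance from \(v_i\) to \(v_j\) plus one, which is at least \(s+1\).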
Identification of the \(\{\ell ,k\}\)-chained structure of a directed graph (if present) sheds considerable light on properties of the graph, including the presence of anti-communities. Anti-communities are vertex subsets \({{\mathcal {W}}}_i\), \(i=1,2,\ldots ,q\), of \({{\mathcal {V}}}\) such that there are far fewer edges from nodes in \({{\mathcal {W}}}_i\) to nodes in \({{\mathcal {W}}}_i\) than from nodes in \({{\mathcal {W}}}_i\) to nodes in \({{\mathcal {W}}}_j\) for \(j\ne i\). For instance, the node subsets \({{\mathcal {V}}}_j\) of an \(\ell\)-chained graph are anti-communities. Recent discussions on anti-community detection for undirected graphs can be found in Concas et al. (2020), Estrada and Knight (2015), and Fasino and Tudisco (2017). Several methods and measures allow one to identify communities or clusters, such as the intra-cluster density which, for undirected graphs, is defined as the ratio of the number of internal edges to the number of all possible internal edges; see Fortunato (2010). An analogous density measure for computing the anti-community score for undirected graphs was introduced in Concas et al. (2021). Here, we extend this measure to directed \(\{\ell ,k\}\)-chained graphs.
Definition 4
The anti-community score \(\rho \in [0,1]\) for a node subset \({{\mathcal {V}}}_i\) of the node set \({{\mathcal {V}}}\) of a directed \(\{\ell ,k\}\)-chained graph is the ratio of the number of directed edges between the vertices in \({{\mathcal {V}}}_i\) and the total possible number of directed edges between them. An anti-community with score \(\rho\) is said to be a \(\rho\)-anti-community.
We remark that the anti-community score aims at identifying an approximate anti-community as a node set for which \(\rho\) takes a small value. A large value of \(\rho\) does not necessarily identify a community, because it does not consider the connections between the nodes in \({{\mathcal {V}}}_i\) and those not contained in \({{\mathcal {V}}}_i\).
Example 3.3
For directed \(\ell\)-chained graphs with node subset partitioning (2.2), the subsets \({{\mathcal {V}}}_i\), for \(i=1,2,\ldots ,\ell\), are 0-anti-communities, because there are no internal edges. For a directed \(\{\ell ,k\}\)-chained graph described in Definition 2, the subset \({{\mathcal {V}}}_i\) has a positive anti-community score \(\rho _i\) when it has internal edges. If \(\rho _i\) is small, then the subset \({{\mathcal {V}}}_i\) may be considered as an approximate anti-community.
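The score of Definition 4 is straightforward to evaluate from the adjacency matrix. The Python sketch below assumes NumPy and makes two assumptions that Definition 4 does not state explicitly: self-loops are not counted, so that a subset with m nodes admits \(m(m-1)\) possible directed internal edges, and a subset with a single node is assigned the score 0. The function name `anti_community_score` is hypothetical.

```python
import numpy as np

def anti_community_score(A, nodes):
    """Internal directed edges of `nodes` divided by the number of possible ones."""
    A = np.asarray(A) != 0
    nodes = np.asarray(nodes)
    m = nodes.size
    if m < 2:
        return 0.0  # convention adopted here: no internal edge is possible
    # Restrict A to the rows and columns of the subset.
    sub = A[np.ix_(nodes, nodes)].astype(int)
    internal = sub.sum() - np.trace(sub)   # self-loops are ignored (assumption)
    return float(internal) / (m * (m - 1))

# Matrix of Example 3.1 with V_1 = {v_1, v_2}, i.e. indices 0 and 1.
A = np.array([[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
rho_chained = anti_community_score(A, [0, 1])   # 0.0: no internal edges
A_mod = A.copy()
A_mod[0, 1] = 1                                 # hypothetical extra edge v_1 -> v_2
rho_mod = anti_community_score(A_mod, [0, 1])   # 0.5: one of two possible edges
```

The first value illustrates that the subsets of a directed \(\ell\)-chained graph are 0-anti-communities, while the added internal edge in the second case gives a positive score.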