
Community structure based on circular flow in a large-scale transaction network

Abstract

The objective of this study is to shed new light on the industrial flow structure embedded in microscopic supplier-buyer relations. We first construct directed networks from actual data from interfirm transaction relations in Japan; as one example, the dataset compiled by the Tokyo Shoko Research, Ltd. in 2016 contains five million links between one million firms. Then, we analyze the industrial flow structure of such a large-scale network with a special emphasis on its hierarchy and circularity. The Helmholtz-Hodge decomposition enables us to break down the flow on a directed network into two flow components: gradient flow and circular flow. The gradient flow between a pair of nodes is given by the difference of their potentials obtained by the Helmholtz-Hodge decomposition. The gradient flow runs from a node with higher potential to a node with lower potential; hence, the potential of a node shows its hierarchical position in a network. On the other hand, the circular flow component illuminates feedback loops built in a network. The potential values averaged over firms classified by the major industrial category describe hierarchical characteristics of sectors. The ordering of sectors according to the potential agrees well with the general idea of the supply chain. We also identify industrially integrated clusters of firms by applying a flow-based community detection method to the extracted circular flow network. We then find that each of the major communities is characterized by its main industry, forming a hierarchical supply chain with feedback loops by complementary industries such as transport and services.

Introduction

In general, interactions between individuals are considered to play an important role in the economy. For instance, firms are connected to each other directly or indirectly through their business transactions. A firm buys materials from suppliers and sells its products to customers. These transactions are so essential to firms that one cannot isolate the dynamics of individual firms from the entire economic system. Firms’ production activities thus give rise to a complex network, and examining economic phenomena from a network perspective can provide a variety of new insights.

Conventionally, the industrial structure and economic ripple effects have been studied on the basis of the input-output tables (Leontief 1986). Furthermore, a network-theoretic point of view was incorporated into the input-output analysis to elucidate complex interindustrial flow structures within or across the sectors (Slater 1977; 1978; Carvalho 2008; McNerney et al. 2013; Contreras and Fagiolo 2014). However, such classification of firms by industry may be too formal for a reliable macroscopic picture of the economy.

Recently, firm-level network analyses based on a comprehensive database of interfirm transaction relations have begun to appear (Atalay et al. 2011; Acemoglu et al. 2012; Cainelli et al. 2012; Luo et al. 2012; Watanabe et al. 2015; Letizia and Lillo 2018; Goto et al. 2017). Economists as well as physicists have recognized the importance of taking an explicit account of interfirm links in order to understand economic issues, such as the origin of business cycles and the possibility of a chain reaction in firm bankruptcies.

Very recently, we have studied (Chakraborty et al. 2018) the structure of a Japanese production network with one million firms and five million supplier-customer links. We first constructed a directed production network from the actual data of interfirm transaction relations and found that they form a tightly knit structure with a giant strongly connected component surrounded by two half-shells constituting incoming-flow and outgoing-flow components for the core. The hierarchical structure of communities was then elucidated by a flow-based multilevel community detection method (Rosvall and Bergstrom 2011), and most of the irreducible communities were found to be on the second level. The composition of some of the major communities, including overexpressions of industrial and regional components, as well as hierarchical connections between the communities, was studied in detail.

The hierarchy of the production network is expected to emerge from self-organization of the supply chain in the industrial system. This is the general view on evolutionary processes in complex systems (Holland 2000; Anderson 1972). Here, we emphasize that we should also pay attention to the inner loops of production, which give rise to a nonlinear feedback mechanism in the system, because they can be engines for economic growth. The priority production system adopted by the Japanese government just after World War II is an illustrative application of this idea (Vestal 1995). It was intended to stimulate recovery of the nation’s badly damaged economy by concentrating public investment in coal mining and steel production. Production of steel needs electricity generated from coal, and mining of coal needs machinery made of steel. Such an industrial loop formed by the three industries, mining, steel production, and machinery manufacturing, led to autonomous growth of the economy.

The objective of this study is to advance the previous empirical analysis (Chakraborty et al. 2018) on the industrial flow structure embedded in microscopic supplier-buyer relations with a special emphasis on its circularity. To delve further into the flow structure of the transaction network with firms as nodes, we take advantage of a mathematical tool called the Helmholtz-Hodge decomposition (Jiang et al. 2011; Bhatia et al. 2013). It allows us to decompose the flow on a directed network into a gradient flow component and a circular flow component.

This paper is organized as follows. First, we give salient features of the dataset to be analyzed for transaction relations between firms in Japan. “Helmholtz-Hodge decomposition” section is devoted to a mathematical formulation of the Helmholtz-Hodge decomposition for the present analyses. In “Bow-tie decomposition” section, we revisit the walnut structure of the Japanese production network that was found in the previous study (Chakraborty et al. 2018), using the bow-tie decomposition and a visualization technique of networks. In “Results and discussion” section, we build a network consisting of only the circular flow components and detect communities in the network to elucidate circular flow structure in the production network. The final section summarizes the results obtained here.

Interfirm transaction data

The present analysis is based on the big data of 4,974,802 transaction relations between 1,066,037 firms in Japan that was collected by the Tokyo Shoko Research, Ltd. (TSR) in 2016.Footnote 1 These data virtually cover the entire amount of industrial activities in Japan. We regard firms as nodes and transaction relations between them as directed links spanning from suppliers to customers to construct the latest production network in Japan. Since information on the volume of each transaction is not available, we assume that all the links have the same weight.

In addition to the information on transactions between firms, various attributes of individual firms are available. For simplicity of analyses, we use two attributes of each firm, namely, industrial sector and geographical location of the head office. Firms are categorized into 20 sectors and 47 prefectures. Readers are referred to the previous paper (Chakraborty et al. 2018) for more detailed information on the dataset.Footnote 2

Helmholtz-Hodge decomposition

In general, one can write the flow \(F_{ij}\) running from node i to node j in a directed network as follows:

$$ F_{ij} = F^{(\mathrm{p})}_{ij} + F^{(\mathrm{c})}_{ij}, $$
(1)

where we assume that the magnitude of \(F_{ij}\) on the network is given by the following:

$$ |F_{ij}|=\left\{\begin{array}{ll} 1 & \text{(singly connected in one way)}\\ 0 & \text{(doubly connected in both ways)}\\ 0 & \text{(not connected)} \end{array}\right. $$
(2)

Since information on the volume of transactions is not available in the TSR dataset, we adopt such a simplified flow structure. The first term \(F^{(\mathrm {p})}_{ij}\) on the right-hand side of Eq. (1) denotes the gradient flow from node i to node j which is given by the following:

$$ F^{(\mathrm{p})}_{ij} = w_{ij}\left(\phi_{i} - \phi_{j}\right), $$
(3)

where \(\phi_{i}\) is the Helmholtz-Hodge potential associated with node i and \(w_{ij}\) is a positive weight for the linkage between nodes i and j. We assume that the weight \(w_{ij}\) takes the following values depending on how the two nodes are connected:

$$ w_{ij}=\left\{\begin{array}{ll} 1 & \text{(singly connected in one way)}\\ 2 & \text{(doubly connected in both ways)}\\ 0 & \text{(not connected)} \end{array}\right. $$
(4)
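As a minimal sketch of Eqs. (2) and (4), the following Python function builds the flow and the weight from a list of directed supplier-customer links. The function name `flow_and_weight`, the dictionary layout, and the sign convention \(F_{ji}=-F_{ij}\) are assumptions made for illustration; the text above specifies only the magnitude of the flow.

```python
def flow_and_weight(links):
    """Build flow F and weight w from directed links (supplier, customer).

    Both results are dicts keyed by ordered node pairs (i, j).
    Pairs that are not connected are simply absent (flow and weight 0).
    """
    # Drop self-loops and duplicate links.
    link_set = {(i, j) for i, j in links if i != j}
    F, w = {}, {}
    for i, j in link_set:
        if (j, i) in link_set:
            # Doubly connected in both ways: zero net flow, weight 2.
            F[(i, j)] = 0
            w[(i, j)] = 2
        else:
            # Singly connected in one way: unit flow from i to j, weight 1.
            F[(i, j)] = 1
            F[(j, i)] = -1  # assumed sign convention: F_ji = -F_ij
            w[(i, j)] = 1
            w[(j, i)] = 1
    return F, w


# Hypothetical toy network: A and B trade in both directions, B supplies C.
F, w = flow_and_weight([("A", "B"), ("B", "A"), ("B", "C")])
```

Storing both orderings of each connected pair keeps later sums over the neighbors of a node simple, at the cost of doubling the memory footprint.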

The Helmholtz-Hodge potential of nodes in a directed network identifies their hierarchical positions in its flow structure. In the network built with only gradient flow, nodes are perfectly ranked; the gradient flow always runs from a node with higher potential to a node with lower potential. On the other hand, the second term \(F^{(\mathrm {c})}_{ij}\) denotes the circular flow component in which incoming flow and outgoing flow are exactly balanced at each node:

$$ \sum_{j}F^{(\mathrm{c})}_{ij}=0, $$
(5)

so that there is no hierarchy among nodes in the circular flow network. The circular flow component illuminates feedback loops embedded in the system.

Additionally, one can determine the potential \(\phi_{i}\) for every node by minimizing the squared difference between the actual flow and the gradient flow:

$$ I = \frac{1}{2}\sideset{}{^{\prime}}\sum_{i< j} w^{-1}_{ij}\left(F_{ij}-F^{(\mathrm{p})}_{ij}\right)^{2}, $$
(6)

where the double summation excludes pairs of nodes that are not connected. This is a variational formulation of the Helmholtz-Hodge decomposition. Subtracting the gradient flow thus determined from the original flow leaves the circular (loop) flow. In addition, to remove the arbitrariness of a constant shift in the potential, we impose the following condition on \(\phi_{i}\):

$$ \sum_{i} \phi_{i} = 0\ . $$
(7)
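For reference, the stationarity condition of Eq. (6) can be written out explicitly. The short derivation below assumes that the flow is antisymmetric, \(F_{ji}=-F_{ij}\), and that the weight is symmetric, \(w_{ji}=w_{ij}\); both are consistent with Eqs. (2) and (4) but are not stated explicitly above. Differentiating I with respect to \(\phi_{i}\) and using Eq. (3) gives

$$ \frac{\partial I}{\partial \phi_{i}} = -\sideset{}{^{\prime}}\sum_{j}\left[F_{ij}-w_{ij}\left(\phi_{i}-\phi_{j}\right)\right]=0 \quad\Longrightarrow\quad \sideset{}{^{\prime}}\sum_{j}w_{ij}\left(\phi_{i}-\phi_{j}\right)=\sideset{}{^{\prime}}\sum_{j}F_{ij}. $$

Under these assumptions the minimization reduces to a set of linear equations for the potentials, and by Eqs. (1) and (3) the stationarity condition is equivalent to the balance condition of Eq. (5) for the remaining flow. The equations fix the potentials only up to a constant shift, which is the arbitrariness that Eq. (7) removes.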

To quantify to what extent the flow of a directed network has hierarchical and circular characteristics, we introduce two measures for the gradient flow and circular flow components associated with each node i as follows:

$$\begin{array}{@{}rcl@{}} \xi^{(\mathrm{p})}_{i}=\frac{\frac{1}{2}\sideset{}{^{\prime}}\sum_{j}w_{ij}^{-1}\left(F^{(\mathrm{p})}_{ij}\right)^{2}}{\sideset{}{^{\prime}}\sum_{j< k}w_{jk}^{-1}\left(F_{jk}\right)^{2}}, \end{array} $$
(8)
$$\begin{array}{@{}rcl@{}} \xi^{(\mathrm{c})}_{i}=\frac{\frac{1}{2}\sideset{}{^{\prime}}\sum_{j}w_{ij}^{-1}\left(F^{(\mathrm{c})}_{ij}\right)^{2}}{\sideset{}{^{\prime}}\sum_{j< k}w_{jk}^{-1}\left(F_{jk}\right)^{2}}. \end{array} $$
(9)

Summation of \(\xi ^{(\mathrm {p})}_{i}\) and \(\xi ^{(\mathrm {c})}_{i}\) over all nodes yields generalized formulas for the gradient and loop ratios (Fujiki and Haruna 2014; Haruna and Fujiki 2016), respectively:

$$\begin{array}{@{}rcl@{}} \gamma&=&\sum_{i}\xi_{i}^{(\mathrm{p})}, \end{array} $$
(10)
$$\begin{array}{@{}rcl@{}} \lambda&=&\sum_{i}\xi_{i}^{(\mathrm{c})}. \end{array} $$
(11)

Because of the orthogonality between the gradient flow and the circular flow vectors, the sum of the two ratios amounts to unity:

$$ \gamma + \lambda = 1. $$
(12)

If a network is completely hierarchical (circular), γ=1 (0) and λ=0 (1). One can thus use either of the two ratios to characterize the overall flow structure of a directed network. This can be considered as ranking the nodes according to the hierarchical structure of the network.
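The measures of Eqs. (8)-(12) can be checked by hand on a three-node network. The sketch below uses the mixed triangle discussed later in this section, with the gradient and circular components implied by the potentials 2/3, 0, −2/3 and a loop of magnitude 1/3. The link directions (A to B, B to C, A to C), the helper names, and the convention of storing both orderings of each pair are assumptions made for illustration.

```python
from fractions import Fraction


def antisymmetric(values):
    """Expand {(i, j): v} into a dict that also holds (j, i): -v."""
    out = {}
    for (i, j), v in values.items():
        out[(i, j)] = Fraction(v)
        out[(j, i)] = -Fraction(v)
    return out


def flow_ratios(F, Fp, Fc, w):
    """Return xi_p, xi_c (Eqs. 8-9) and gamma, lambda (Eqs. 10-11)."""
    # Denominator: every connected pair counted once, so halve the
    # sum over ordered pairs.
    total = sum(F[p] ** 2 / w[p] for p in w) / 2
    xi_p, xi_c = {}, {}
    for (i, j), wij in w.items():
        xi_p[i] = xi_p.get(i, 0) + Fp[(i, j)] ** 2 / wij / 2 / total
        xi_c[i] = xi_c.get(i, 0) + Fc[(i, j)] ** 2 / wij / 2 / total
    return xi_p, xi_c, sum(xi_p.values()), sum(xi_c.values())


pairs = [("A", "B"), ("B", "C"), ("A", "C")]  # assumed link directions
w = {}
for i, j in pairs:
    w[(i, j)] = w[(j, i)] = 1
F = antisymmetric({p: 1 for p in pairs})
Fp = antisymmetric({("A", "B"): Fraction(2, 3),
                    ("B", "C"): Fraction(2, 3),
                    ("A", "C"): Fraction(4, 3)})
Fc = antisymmetric({("A", "B"): Fraction(1, 3),
                    ("B", "C"): Fraction(1, 3),
                    ("A", "C"): Fraction(-1, 3)})
xi_p, xi_c, gamma, lam = flow_ratios(F, Fp, Fc, w)
print(gamma, lam)
```

For this triangle the sketch gives γ = 8/9 and λ = 1/9, so the two ratios add up to unity as stated in Eq. (12).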

There are a number of works on hierarchy and ranking in complex directed networks (De Bacco et al. 2018; Tatti 2015; Letizia et al. 2018; Johnson et al. 2014). In such problems, the presence of cyclic parts of the network causes ranking conflicts, and these studies address how to minimize the inconsistency encountered when ranking nodes in actual networks. Tatti (2015) and Letizia et al. (2018) accomplish this by minimizing a penalty function called agony, and De Bacco et al. (2018) by minimizing the energy of a physical model. Determining ϕ to minimize Eq. (6) corresponds to the Helmholtz-Hodge decomposition, and the energy minimization of the physical model is one of its variants. Optimizing a penalty function such as agony is generally computationally expensive, whereas the Helmholtz-Hodge decomposition and the physical model are comparatively cheap because they only require solving a set of linear equations. The major difference between the Helmholtz-Hodge decomposition and the other methods is that it treats hierarchies and cycles on an equal footing: it provides a unified representation of the flow structure of a directed network in terms of both hierarchy and circularity. In this paper, we take advantage of this property and focus on circularity as well as hierarchy.

Finally, we illustrate the Helmholtz-Hodge decomposition with examples of triangular transaction networks in Fig. 1. The first example shown in Fig. 1a is a completely hierarchical network with \(\phi_{A}=1\), \(\phi_{B}=0\), and \(\phi_{C}=-1\), while the second one in Fig. 1b is a completely circular network. The third example in Fig. 1c is a mixed network with both hierarchical and circular characteristics. Its gradient flow component, Fig. 2a, is determined by \(\phi_{A}=2/3\), \(\phi_{B}=0\), and \(\phi_{C}=-2/3\). The circular flow component is a loop of flow with magnitude 1/3, as shown in Fig. 2b.

Fig. 1
figure 1

Examples of triangular transaction networks. A completely hierarchical network (a), a completely circular network (b) and a mixed network with both hierarchical and circular characteristics (c)

Fig. 2
figure 2

Gradient flow component (a) and circular flow component (b) of the triangular network as shown in Fig. 1c, according to the Helmholtz-Hodge decomposition
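The triangular examples can be reproduced with a few lines of Python. The sketch below solves the linear equations that follow from minimizing Eq. (6), together with the condition of Eq. (7), in exact rational arithmetic. It assumes a connected network, an antisymmetric flow, and the link directions A to B, B to C, A to C for the mixed triangle and A to B, B to C, C to A for the circular one; these directions are not given in the text and were chosen because they reproduce the quoted potentials. All function names are hypothetical.

```python
from fractions import Fraction


def solve_potentials(nodes, F, w):
    """Solve sum_j w_ij (phi_i - phi_j) = sum_j F_ij with sum_i phi_i = 0."""
    n = len(nodes)
    idx = {v: k for k, v in enumerate(nodes)}
    # Augmented matrix [L | b] of the weighted graph Laplacian system.
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for (i, j), wij in w.items():
        A[idx[i]][idx[i]] += wij
        A[idx[i]][idx[j]] -= wij
        A[idx[i]][n] += F.get((i, j), 0)
    # The Laplacian is singular; replace the last row by sum(phi) = 0.
    A[n - 1] = [Fraction(1)] * n + [Fraction(0)]
    # Gauss-Jordan elimination (fine for tiny examples only).
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return {v: A[idx[v]][n] for v in nodes}


def decompose(nodes, F, w):
    """Return potentials, gradient flow and circular flow."""
    phi = solve_potentials(nodes, F, w)
    Fp = {(i, j): w[(i, j)] * (phi[i] - phi[j]) for (i, j) in w}
    Fc = {(i, j): F.get((i, j), 0) - Fp[(i, j)] for (i, j) in w}
    return phi, Fp, Fc


def unit_flow(links):
    """Antisymmetric unit flow and unit weights for one-way links."""
    F, w = {}, {}
    for i, j in links:
        F[(i, j)], F[(j, i)] = 1, -1
        w[(i, j)] = w[(j, i)] = 1
    return F, w


nodes = ["A", "B", "C"]
# Mixed triangle (assumed directions for Fig. 1c).
F, w = unit_flow([("A", "B"), ("B", "C"), ("A", "C")])
phi, Fp, Fc = decompose(nodes, F, w)
# Completely circular triangle (assumed directions for Fig. 1b).
F_loop, w_loop = unit_flow([("A", "B"), ("B", "C"), ("C", "A")])
phi_loop, Fp_loop, Fc_loop = decompose(nodes, F_loop, w_loop)
print(phi, Fc[("A", "B")], Fc[("B", "C")], Fc[("C", "A")])
```

Dense elimination is used here only to keep the example self-contained; for a network with a million nodes one would presumably rely on a sparse iterative solver instead.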

Bow-tie decomposition

To elucidate flow structure in the TSR transaction network, we begin with the bow-tie decomposition of the network, which has been widely used to understand the flow structure of various complex networks including the World Wide Web and metabolic networks. The decomposition classifies nodes in a directed network according to the way in which they are mutually connected: IN component, GSCC (giant strongly connected component), OUT component, and others. The GSCC is the largest group of nodes in which any pair of nodes is connected in both directions by directed paths. The IN component is a collection of nodes that have a path to the GSCC, but no reverse path to come back from the GSCC. The OUT component is defined the other way around, that is, as a collection of nodes that are reachable from the GSCC but have no path back to it. By definition, this classification of nodes provides an overall view of the hierarchical structure of the network. In the previous paper (Chakraborty et al. 2018), however, we named such a structure of the TSR network the “walnut” structure instead of the bow-tie structure after its shape: because the IN and OUT components are not as separated as the two wings of a bow-tie, they are more similar to two halves of a walnut shell surrounding the central GSCC core.
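The classification described above can be sketched in Python with two breadth-first searches per candidate node. This is an illustration on a hypothetical toy network, not the procedure used for the TSR data; the function names are made up, the input is assumed to be non-empty, and the search for the largest strongly connected component is deliberately naive (a linear-time algorithm such as Tarjan's would be needed for a million nodes).

```python
from collections import defaultdict, deque


def reachable(start, adj):
    """Nodes reachable from start (inclusive) by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def bow_tie(links):
    """Split the nodes of a directed network into IN, GSCC, OUT, OTHERS."""
    succ, pred = defaultdict(set), defaultdict(set)
    nodes = set()
    for i, j in links:
        succ[i].add(j)
        pred[j].add(i)
        nodes.update((i, j))
    # A strongly connected component is the intersection of the forward
    # and backward reach of any of its members; keep the largest one.
    gscc, unassigned = set(), set(nodes)
    while unassigned:
        seed = next(iter(unassigned))
        scc = reachable(seed, succ) & reachable(seed, pred)
        unassigned -= scc
        if len(scc) > len(gscc):
            gscc = scc
    seed = next(iter(gscc))
    downstream = reachable(seed, succ)
    upstream = reachable(seed, pred)
    return {
        "IN": upstream - gscc,
        "GSCC": gscc,
        "OUT": downstream - gscc,
        "OTHERS": nodes - upstream - downstream,
    }


# Hypothetical toy network: a three-firm loop with one upstream supplier,
# one downstream customer, and a few firms outside the bow-tie.
links = [("s", "a"), ("a", "b"), ("b", "c"), ("c", "a"),
         ("c", "t"), ("s", "u"), ("x", "y")]
parts = bow_tie(links)
```

In the toy network, firm `u` is supplied by the IN component but never reaches the loop, so it ends up among the other components together with the isolated pair `x`, `y`.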

Table 1 lists the numbers of firms belonging to the IN, GSCC, OUT and other components of the TSR network. The results are compared with the corresponding numbers of firms averaged over 1000 random networks with the same degree distribution as that of the original network. We observe no significant difference in the bow-tie parameters between the original and randomized networks. However, Table 1 also shows that complete randomization of the network destroys the bow-tie structure; virtually all nodes constitute the GSCC.
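One common way to generate random networks that keep the degree of every node is repeated link swapping. The sketch below is an assumption about how such a null model could be built; the text does not specify the randomization procedure actually used, and the function names, the number of swaps, and the toy network are hypothetical.

```python
import random
from collections import Counter


def degree_sequences(links):
    """Out-degree and in-degree of every node as two Counters."""
    out_deg = Counter(i for i, _ in links)
    in_deg = Counter(j for _, j in links)
    return out_deg, in_deg


def rewire(links, n_swaps, seed=0):
    """Degree-preserving randomization by swapping link targets.

    Two links a->b and c->d are replaced by a->d and c->b, which leaves
    the out-degree of a and c and the in-degree of b and d unchanged.
    """
    rng = random.Random(seed)
    edges = list(dict.fromkeys(links))  # ordered, duplicates removed
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        k = rng.randrange(len(edges))
        l = rng.randrange(len(edges))
        (a, b), (c, d) = edges[k], edges[l]
        # Reject swaps that would create a self-loop or a duplicate link.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[k], edges[l] = (a, d), (c, b)
        done += 1
    return edges


# Hypothetical toy network with five firms.
original = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3), (4, 0), (3, 4)]
randomized = rewire(original, n_swaps=20, seed=1)
```

This scheme preserves the in-degree and out-degree of each node but not the number of reciprocated pairs, which is one of the design choices a real null model would have to settle.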

Table 1 The numbers of firms belonging to the IN, GSCC, OUT and other components of the TSR transaction network

In contrast, the distributions of the Helmholtz-Hodge potential shown by the histograms in Fig. 3 exhibit a significant difference between the original network and the randomized network with the same degree sequence; the flow structure of the network is thus influenced by the randomization process.

Fig. 3
figure 3

Distributions of the Helmholtz-Hodge potential for firms in the IN component (red), GSCC (green), and OUT component (blue) of the TSR transaction network. The left and right panels show the results for the original network and one sample of the randomized networks with the same degree sequence, respectively

The potential distributions of IN, GSCC, and OUT in the original network are well-overlapped compared with the randomized network with the same degree sequence. In particular, the distributions of IN and OUT of the randomized network are quite separated, but the corresponding distributions of the original network are substantially overlapped. For a quantitative argument, we define the following overlap integral of two distribution functions:

$$ J=\frac{\int f(x)\cdot g(x)dx}{\sqrt{\int f^{2}(x)dx}\sqrt{\int g^{2}(x)dx}}. $$
(13)

The overlap integral J takes a value within the range of 0≤J≤1; J equals unity when f(x) and g(x) coincide up to a constant factor. In the TSR network, the overlap integral of the potential distributions for the IN and OUT components is \(J^{(\text {TSR})}=0.125\). On the other hand, randomized networks with the same degree sequence take much smaller values of J, for instance, \(J^{(\text {rand})}_{0.01} = 0.00021\) and \(J^{(\text {rand})}_{0.05} = 0.00020\), where \(J^{(\text {rand})}_{0.01}\) and \(J^{(\text {rand})}_{0.05}\) are the values of J at the 1% and 5% significance levels for 1,000 samples, respectively. These numerical results establish the conceptual difference between the bow-tie structure and the walnut structure, as shown in Fig. 4, in a statistically meaningful way. We emphasize that the walnut structure is determined not by the degree distribution alone but by more detailed properties of the linkage of the network.
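For histograms on a common set of equal-width bins, Eq. (13) reduces to a normalized inner product, because the bin width and the normalization of each histogram cancel between numerator and denominator. The following sketch shows this discrete version; the function name and the toy histograms are hypothetical.

```python
import math


def overlap_integral(f, g):
    """Discrete version of Eq. (13) for two histograms on the same bins."""
    if len(f) != len(g):
        raise ValueError("histograms must share the same bins")
    num = sum(a * b for a, b in zip(f, g))
    den = math.sqrt(sum(a * a for a in f)) * math.sqrt(sum(b * b for b in g))
    # Convention assumed here: an empty histogram has zero overlap.
    return num / den if den > 0 else 0.0


# Hypothetical bin counts of the potential in two components.
same_shape = overlap_integral([1, 2, 3], [2, 4, 6])
disjoint = overlap_integral([5, 1, 0, 0], [0, 0, 2, 7])
partial = overlap_integral([1, 1, 0], [0, 1, 1])
```

Two histograms with the same shape give an overlap of one regardless of their total counts, and histograms without common support give zero.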

Fig. 4
figure 4

Illustration of the general ideas on the walnut structure (left) and the bow-tie structure (right)

Figure 5 shows the distribution of firms in such bow-tie components of the TSR network across sectors. The sectors such as Construction, Information & Communications, and Scientific Research, Professional & Technical Services are important constituents in the IN component. Mining, Manufacturing, Transport & Postal, and Wholesale sectors are key players in the GSCC. The important sectors in the OUT component include Retail Trade, Finance & Insurance, Accommodations, Eating/Drinking Services, Living-related/Personal & Amusement Services, and Education, Learning Support. We thus see that each component of the bow-tie structure in the production network has its own industrial characteristics. The main industries in the GSCC form an integrated core of economic activities in Japan.

Fig. 5
figure 5

Distribution of firms in the IN (red), GSCC (green), OUT (blue) and other components (gray) of the TSR transaction network across sectors

Results and discussion

We first obtained an optimized layout of the network in three-dimensional space by incorporating information of the Helmholtz-Hodge potential for individual nodes. The result is displayed in Fig. 6. Nodes are aligned in the z direction according to their values of the Helmholtz-Hodge potential; basically, transaction flows are from top to bottom. On the other hand, the x and y coordinates of nodes are determined by minimizing the potential energy in a spring-electric model in which nodes with direct transaction relations are connected to each other by a spring and all nodes have an identical electric charge to maintain distance from disconnected nodes. In Fig. 6, nodes belonging to the different walnut components are distinguished with different colors. Figure 7 shows half-cut cross-sections of the 3D images of the network, as shown in Fig. 6. The walnut structure is also clearly visible in this visualization. The GSCC is certainly sandwiched between the IN component on the upstream side and the OUT component on the downstream side. However, the potential values in the three components are distributed so widely that even the potential distributions of the peripheral components are not well separated. These results agree with our naming convention of the flow structure of the transaction network as a walnut structure.

Fig. 6
figure 6

The IN component (red), GSCC (green), and OUT component (blue) of the TSR transaction network whose layout has been optimized in three-dimensional space. Nodes are aligned in the z direction according to their values of the Helmholtz-Hodge potential; basically, transaction flows are from top to bottom. On the other hand, the x and y coordinates of nodes were determined by the energy minimum principle with a spring-electric model

Fig. 7
figure 7

Half-cut cross-sections of the 3D images of the TSR network as shown in Fig. 6

Figure 8 resolves the three potential distributions in Fig. 3 into those within individual sectors. The averaged value of the potential for firms in each sector is given in Table 2. The results, listed in their descending order, describe hierarchical characteristics of the sectors in the transaction network. For instance, the manufacturing sector is located at the upstream side compared with the wholesale and retail trade sectors. The hierarchical ordering of sectors is in harmony with the general idea of the supply chain. However, the potential values are widely distributed from upstream to downstream even within the same sectors, except for Finance & Insurance, Medical, Health Care & Welfare, and Government. This fact indicates that major sectors such as Manufacturing, Construction, Wholesale and Retail trades have appreciable hierarchical structure in and of themselves.

Fig. 8
figure 8

Resolution of the three potential distributions in the left panel of Fig. 3 into those within sectors

Table 2 The averaged values of the Helmholtz-Hodge potential for firms in individual sectors, which are listed in their descending order corresponding to the direction of upstream to downstream in the TSR transaction network

We turn our attention to the gradient ratio γ and loop ratio λ for the whole network and the GSCC of the transaction network.Footnote 3 The results are shown in Table 3 together with the corresponding results for the two kinds of random networks in parallel with Table 1. The hierarchy is significantly developed in the original network compared with the randomized networks. This is understandable because hierarchical structure is in general a manifestation of self-organization in complex systems (Holland 2000; Anderson 1972); it is a formation of supply chains in the economic system. Although randomizing the network with a preserved degree distribution does not change the walnut structure considerably, the randomization procedure has an appreciable influence on the balance between the hierarchy and circularity of the network. We have similar results for the GSCC of the original network. The hierarchy is slightly stronger than the circularity even in the GSCC consisting only of nodes that are mutually connected in both ways. In contrast, the circularity dominates the flow structure of the corresponding network that has been completely randomized.

Table 3 Gradient ratio γ and loop ratio λ for the whole network and the GSCC of the TSR transaction network

The hierarchical flow is dominant in the IN component, which has mainly one-way flow to the GSCC because of its definition. This is also true for the OUT component. On the other hand, the GSCC has a more complicated flow structure; both hierarchical and circular flow components coexist in it. This is because any pairs of nodes in the GSCC are connected bidirectionally by at least two directed paths. We thus expect that firms in the GSCC constitute the core of the production activities, while firms in the IN and OUT parts, forming a thin layer for the GSCC, are just peripherals.

For the purpose of this study, therefore, we hereafter concentrate on the flow structure of the GSCC, especially its circularity. To identify important loops in the circular flow network on the GSCC, we adopt the map equation method (Rosvall and Bergstrom 2008; 2011) for community detection. It is an information-theoretic method based on an idea that random walkers should stay in looping communities for a long time. Figure 9 demonstrates that the communities so detected have a size distribution of the long-tail form. The total number of communities is 18,660, and the largest community has approximately 5000 firms. Figure 10 depicts the adjacency matrix of the circular flow network in which nodes are ordered according to the community assignment. It shows the community detection works well because links are sparse between the communities and are considerably dense within the communities. The 10 largest communities are illuminated in Fig. 11 with the same node configuration as in Figs. 6 and 7.

Fig. 9
figure 9

Size distribution of communities in the circular flow network on the GSCC of the transaction network

Fig. 10
figure 10

The adjacency matrix of the circular flow network, with nodes grouped by community and communities sorted in descending order of size. Only the 100 largest communities are shown

Fig. 11
figure 11

The 10 largest communities in the circular flow network on the GSCC of the transaction network, visualized in three-dimensional space with three different points of view. The same configuration of firms is used as in Fig. 6.

Figure 12 shows the histogram of the Helmholtz-Hodge potential difference Δϕ of links for the 1st-6th communities, where \(\Delta\phi=\phi_{i}-\phi_{j}\) is the potential difference between nodes i and j at the two ends of a link with \(F_{ij}>0\). A positive value of Δϕ indicates a link directed from the upstream side to the downstream side, while a negative value of Δϕ indicates a link in the reversed direction. The distribution of Δϕ is significantly shifted to the positive side except for the 5th community. It means that the main flow from the upstream to the downstream dominates over the feedback flow in those communities. On the other hand, the 5th community shows a distribution of Δϕ that is quite symmetrical around Δϕ=0. This indicates that the exceptional community has well-developed circular flow structure, which will be addressed later.

Fig. 12
figure 12

Histogram of the potential difference \(\Delta\phi=\phi_{i}-\phi_{j}\) associated with link \(F_{ij}\) in the community. A positive value of Δϕ indicates a link directed from the upstream side to the downstream side, while a negative value of Δϕ indicates a link in the reversed direction

One can characterize the major communities by industrial and regional affiliations of their constituent firms. Figure 13 shows the industrial characterization of the 10 largest communities. They are divided into two contrasting groups. The first, second, fourth and fifth largest communities are mainly characterized by the manufacturing and wholesale sectors; the medical, health care & welfare sector is additionally important for the fourth community. On the other hand, the remaining 6 communities are characterized by the construction sector. Figure 14 shows the regional characterization of the 10 largest communities. All of the major communities have prominent regional characteristics. The manufacturing- and wholesale-dominant communities are basically metropolitan communities except for the second largest community, in which Hokkaido and some provincial prefectures play a key role. In contrast, the regional affiliations in the construction-dominant communities are well localized at the prefecture level. All prefectures except Nara have at least one construction community of their own with more than 100 firms.

Fig. 13
figure 13

Industrial characterization of the 10 largest communities obtained for the circular flow network on the GSCC by share of sectors to which their constituent firms belong. The size of each square is proportional to share of the corresponding industry in the specified community. Additionally, its share value (in percentage) is represented by the color coding

Fig. 14
figure 14

Regional characterization of the 10 largest communities obtained for the circular flow network on the GSCC by share of prefectures in which their constituent firms are located. The size of each square is proportional to share of the corresponding prefecture in the specified community. Additionally, its share value (in percentage) is represented by the color coding

We will look into the 6 largest communities in more detail. Tables 4 and 5 list the number of firms in each industry type that belong to the 6 largest communities. They are grouped by the middle classification of the TSR industry classification scheme (99 industries). Communities 1, 2, 4, and 5 are groups of firms in which manufacturing and wholesale dominate. Community 1 includes many firms in the manufacture and wholesale of textile and apparel industries. Community 2 is mainly composed of firms in the fisheries cooperative, wholesale and retail trade of seafood, and the manufacture of food from seafood. Community 4 is characterized by medical and health services, and the manufacture and wholesale of pharmaceutical products with representative firms. Most of the medical and health services are general hospitals and clinics. Community 5 contains many firms in the manufacture and wholesale of metal products and construction. On the other hand, most of the firms in communities 3 and 6 are those in the construction industry. Although the industrial distributions of the construction communities resemble each other closely, they are clearly distinguished by their regional characteristics. In fact, communities 3 and 6 are dominated by firms in Okinawa and Kagoshima, respectively. We note that Okinawa is considerably isolated from the mainland in the whole industry.

Table 4 Number of firms for each industry type that belongs to the 6 largest communities
Table 5 Continuation of Table 4

In this way, one can characterize the manufacture and wholesale communities by their own products. Although the communities thus extracted represent dense parts of the loop flow network, they mainly include a series of firms in the production line from manufacturing to retail trade through wholesales, the so-called supply chain, themselves forming a hierarchical flow structure.

We are now in a position to identify firms that strongly contribute to the feedback structure in the major communities. For this purpose, we use the ratio \(\xi _{i}^{(\mathrm {c})}/\xi _{i}^{(\mathrm {p})}\) of the circular and gradient flow components for each node i. A firm characterized by a large value of this ratio proves to be important for the circular flow structure of the communities. Details of this method are given with an illustrative example in Appendix A. Tables 6 and 7 list the number \(n_{c}\) of firms for each industry in the community, the number \(n_{c}^{\prime }\) of firms with the top 10% of the \(\xi _{i}^{(\mathrm {c})}/\xi _{i}^{(\mathrm {p})}\) values within the community, and their ratio \(n_{c}^{\prime }/n_{c}\). In community 1, the \(n_{c}\) values of the manufacture and wholesale of textile and apparel are large, but their \(n_{c}^{\prime }\) values are low, so that the corresponding \(n_{c}^{\prime }/n_{c}\) values are lower than those of other industries. This indicates that firms in the manufacture and wholesale of textile and apparel have strong hierarchy. On the other hand, the \(n_{c}^{\prime }/n_{c}\) values of the industries that are not related to textiles or apparel are high. This shows that a relatively small number of firms outside the industry that characterizes this community occupy an important position in its circular structure. In community 2, the manufacture of food and wholesale trade (food and beverages) exhibit high hierarchy, and the miscellaneous retail trade and construction work, which is general work including public and private construction work, have high circularity. In communities 3 and 6, firms in the construction industry form the mainstream flow. On the other hand, firms in the manufacture of ceramic, stone and clay products (community 3) or fabricated metal products (community 6), miscellaneous wholesale trade and road freight transport make large contributions to the feedback structure.
Community 4 is dominated by medical and health services, but only 7 of the 956 firms in the top 10% of the \(\xi _{i}^{(\mathrm {c})}/\xi _{i}^{(\mathrm {p})}\) value belong to medical and health services. Additionally, the manufacture and wholesale of pharmaceutical products exhibit high hierarchy, whereas technical services and advertising show high circularity. In community 5, construction work and wholesale trade (building materials, minerals and metals, etc.) show high hierarchy, while miscellaneous wholesale trade, equipment installation work and road freight transport largely contribute to the circularity. In fact, the miscellaneous wholesale trade particularly includes iron and steel primary product wholesale, steel crude product wholesale and iron scrap wholesale trade, indicating that recycling steps of iron and steel are incorporated into the steel industry. This is why the flow structure of community 5 is so highly circular, as has been demonstrated.
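The counting behind Tables 6 and 7 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `top_share_by_industry`, the input layout (a list of (industry, ratio) pairs for one community), the toy numbers, and the handling of the 10% cutoff (integer division, ties broken by sort order) are assumptions made here, not the procedure used to produce the tables.

```python
from collections import Counter


def top_share_by_industry(firms, top_percent=10):
    """Count firms per industry and how many fall in the top slice of the ratio.

    firms: list of (industry, ratio) pairs for one community, where ratio
           stands for xi_c / xi_p of the firm (hypothetical input layout).
    Returns {industry: (n_c, n_c_prime, n_c_prime / n_c)}.
    """
    # Size of the top slice; integer arithmetic avoids float rounding surprises.
    k = max(1, len(firms) * top_percent // 100)
    ranked = sorted(firms, key=lambda firm: firm[1], reverse=True)
    n_c = Counter(industry for industry, _ in firms)
    n_c_prime = Counter(industry for industry, _ in ranked[:k])
    return {
        industry: (n_c[industry], n_c_prime[industry], n_c_prime[industry] / n_c[industry])
        for industry in n_c
    }


# Toy community: a large main industry with low ratios and a small
# complementary industry that holds the firms with the highest ratios.
toy_firms = [("apparel", 0.1 + 0.01 * i) for i in range(14)]
toy_firms += [("transport", r) for r in (0.2, 0.3, 0.25, 0.15, 5.0, 7.0)]

for industry, counts in top_share_by_industry(toy_firms).items():
    print(industry, counts)
```

In this toy input the main industry has a large \(n_{c}\) but no firm in the top slice, while the smaller complementary industry has a high \(n_{c}^{\prime }/n_{c}\), which is the pattern described above for community 1.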

Table 6 Number of firms \(n_{c}\) for each industry type in the community, and number of firms \(n_{c}^{\prime }\) that have the top 10% of the \(\xi ^{(\mathrm {c})}/\xi ^{(\mathrm {p})}\) value within the community
Table 7 Continuation of Table 6

As we have seen, the production network is certainly quite hierarchical, but the GSCC contains the circular flow component to an appreciable extent. In general, it is hard to reveal the circular flow structure embedded in the production network because it is hidden by its strong hierarchy. To overcome this difficulty, we exclusively examined the circular flow component and then applied the community detection to the network thus constructed. Each of the communities that we have detected consists of a hierarchical supply chain of the main industry and feedback loops formed by firms in industries that complement the main industry. Specifically, we found that the transport industry plays an important role in forming the feedback structure for many of the major communities in the production network. In the previous study (Chakraborty et al. 2018), on the other hand, the original flow network was decomposed into communities. Consequently, that analysis did not detect industrially integrated clusters of firms such as those reported here.

Conclusions

The comprehensive dataset of interfirm transaction relations in Japan enabled us to study the industrial flow structure of the nation’s production network with a sound microscopic foundation. In particular, we emphasized its hierarchy and circularity. The network was first decomposed into the walnut components according to their flow properties: IN, GSCC, OUT, and others. The flow structure of the walnut components except the GSCC is mainly hierarchical. By adopting the Helmholtz-Hodge decomposition, we separated the flow structure of the GSCC of the network into two components: gradient flow and circular flow. The gradient flow between a pair of firms is given by the difference of their potentials, and hence, the potential of a firm identifies its hierarchical position in the transaction network. On the other hand, the circular flow component illuminates feedback loops built in the network. The potential values averaged over firms classified by the major industrial category describe hierarchical characteristics of sectors. The order of sectors determined by the potential calculation agrees well with the general idea of the supply chain. We also identified dominant clusters of firms forming feedback loops by applying the map equation method to the extracted circular flow network. We found that both hierarchical and loop structures coexist within the major sectors, such as construction, manufacturing, and wholesale trade. We measured the magnitude of the contribution to the circular structure from each firm in the major communities. The measurement indicates that the main industry that characterizes the community exhibits high hierarchy and low circularity. On the other hand, most of the firms that contribute to the circular flow structure belong to industries complementary to the main industry, such as the transportation industry.
These results suggest limitations of the conventional industrial classification scheme in analyzing economic activities, which may be replaced by a new classification scheme for firms based on the actual interfirm transactions.

Appendix A: Identification of nodes significantly contributing to the feedback structure.

The major communities in the production network have many links from the upstream side to the downstream side and fewer feedback links from the downstream side to the upstream side. Here, we seek a method to identify nodes that contribute significant feedback from the downstream side to the upstream side in such partial networks. As an illustrative example, we adopt the model network given in Fig. 15, composed of m layers between the most upstream node and the most downstream node with n parallel flows passing through them and one reversed flow forming the feedback structure. This network can be decomposed into the gradient flow and the circular flow components by the Helmholtz-Hodge decomposition, as illustrated in Fig. 15. For this gradient flow component, the potential difference Δϕ between the adjacent layers takes the constant value of (n−1)/(n+1) throughout the intermediate layers, and the potential difference between the most upstream and the most downstream nodes is thereby (m+1)Δϕ. All of the gradient flows are thus directed from top to bottom with flux of Δϕ. In the circular flow component, each of the n parallel flows is also directed downward with flux of c=2/(n+1). These parallel flows join at the bottom node and go back to the top node to form a feedback loop with flux of nc=2n/(n+1). On the feedback path, the direction of the loop flow is the same as that of the original flow, while that of the gradient flow is reversed from the original direction (\(F_{ij}\) and \(F^{(\mathrm {c})}_{ij}\) have the same sign, while \(F^{(\mathrm {p})}_{ij}\) has the opposite sign). Quantitatively, the circular flow component plays a relatively more important role for a node on the feedback path than for a node on the main stream lines. The calculations below confirm this.

Fig. 15 A representative model network with feedback structure for the major communities in the transaction network and its Helmholtz-Hodge decomposition
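The values of Δϕ and c quoted for the model network can be checked by hand. The short derivation below rests on two assumptions that are read off from the quoted values and from Eq. (14) rather than stated explicitly: every link of the original network carries unit flow, and the feedback path contains the same number of links, m+1, as each main stream line, so that the potential changes by Δϕ across each of its links. Requiring that the gradient and circular components add up to the original flow on a main stream link and on a feedback link then gives:

$$ \Delta\phi + c = 1, \qquad nc - \Delta\phi = 1 \quad\Longrightarrow\quad c = \frac{2}{n+1}, \qquad \Delta\phi = \frac{n-1}{n+1}. $$

Under these assumptions, a node on the feedback path carries a circular flux n times larger than a node on a main stream line, while the magnitude of the gradient flux is the same for both.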

Thus, the relative magnitude of \(\xi _{i}^{(c)}\) in reference to \(\xi _{i}^{(p)}\) for every node i is a good measure to identify the nodes that are important in the feedback structure.

For a node i on the feedback path, \(\xi _{i}^{(p)}\) and \(\xi _{i}^{(c)}\) are calculated as follows:

$$ \xi_{i}^{(p)} = \frac{|\Delta\phi|^{2}}{\sum_{i< j}|F_{ij}|^{2}} \propto \left(\frac{n-1}{n+1}\right)^{2}, $$
(14)
$$ \xi_{i}^{(c)} = \frac{(nc)^{2}}{\sum_{i< j}|F_{ij}|^{2}} \propto \frac{4n^{2}}{(n+1)^{2}}, $$
(15)

hence:

$$ \frac{\xi_{i}^{(c)}}{\xi_{i}^{(p)}} = \frac{4n^{2}}{(n-1)^{2}}. $$
(16)

For a node j on the main stream lines, \(\xi _{j}^{(c)}/ \xi _{j}^{(p)}\) is likewise given by the following:

$$ \frac{\xi_{j}^{(c)}}{\xi_{j}^{(p)}} = \frac{4}{(n-1)^{2}}. $$
(17)

If \(n \gg 1\), then Eq. (16) takes a much larger value than Eq. (17).
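Equations (16) and (17) can also be checked numerically on the model network. The following is a minimal sketch under stated assumptions, not the implementation used in this study: all links have unit weight, the feedback path passes through m intermediate nodes like each main stream line, the net flow and link weight are taken as \(F = A - A^{\top}\) and \(W = A + A^{\top}\) for adjacency matrix A, the potentials are obtained from the weighted graph Laplacian, and the per-node measures are taken as sums of squared flow components over incident links (the common normalization cancels in the ratio). The function names are hypothetical.

```python
import numpy as np


def model_edges(n, m):
    """Unit-weight directed edges of the layered model network (needs m >= 1).

    Node 0 is the most upstream node and node 1 the most downstream node.
    The n main stream lines and the single feedback path each pass through
    m intermediate nodes (the feedback length is an assumption).
    """
    edges, paths, next_id = [], [], 2
    for line in range(n + 1):              # the last path is the feedback path
        mids = list(range(next_id, next_id + m))
        next_id += m
        chain = [0] + mids + [1]
        if line == n:
            chain = chain[::-1]            # feedback runs from bottom to top
        edges += list(zip(chain[:-1], chain[1:]))
        paths.append(chain)
    return edges, next_id, paths


def helmholtz_hodge(edges, num_nodes):
    """Split the net flow into gradient and circular parts."""
    A = np.zeros((num_nodes, num_nodes))
    for u, v in edges:
        A[u, v] += 1.0
    F = A - A.T                            # antisymmetric net flow
    W = A + A.T                            # symmetric link weight
    L = np.diag(W.sum(axis=1)) - W         # weighted graph Laplacian
    # L is singular (constant shift of phi); lstsq returns one valid solution.
    phi = np.linalg.lstsq(L, F.sum(axis=1), rcond=None)[0]
    F_grad = W * (phi[:, None] - phi[None, :])
    F_circ = F - F_grad                    # divergence-free remainder
    return phi, F_grad, F_circ


def circular_to_gradient_ratio(F_grad, F_circ):
    """Per-node ratio of squared circular flow to squared gradient flow (n > 1)."""
    return (F_circ ** 2).sum(axis=1) / (F_grad ** 2).sum(axis=1)


n, m = 4, 3
edges, num_nodes, paths = model_edges(n, m)
phi, F_grad, F_circ = helmholtz_hodge(edges, num_nodes)
ratio = circular_to_gradient_ratio(F_grad, F_circ)

stream_node = paths[0][1]      # first intermediate node on a main stream line
feedback_node = paths[-1][1]   # first intermediate node on the feedback path
print("ratio on feedback path:", ratio[feedback_node], "vs", 4 * n**2 / (n - 1) ** 2)
print("ratio on main stream:  ", ratio[stream_node], "vs", 4 / (n - 1) ** 2)
```

The two printed pairs compare the numerical ratios with the closed forms of Eqs. (16) and (17); their quotient is \(n^{2}\), which is why the ratio separates feedback nodes from main stream nodes more sharply as n grows.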

Availability of data and materials

The data that support the findings of this study are available from Tokyo Shoko Research, Ltd., but restrictions apply to the availability of these data, which were used under license for the current study, and therefore are not publicly available. However, data are available from the authors upon reasonable request and with permission of Tokyo Shoko Research, Ltd.

Notes

  1. This is the largest connected component in the network obtained from the original data, containing 99.3% of all active firms listed in the data.

  2. In this study, sector I (Wholesale & retail trade) in (Chakraborty et al. 2018) was separated into two sectors, Wholesale and Retail trade.

  3. We reiterated the Helmholtz-Hodge decomposition of the GSCC for its gradient and loop ratios.

Abbreviations

GSCC:

Giant Strongly Connected Component

TSR:

Tokyo Shoko Research

References

  • Acemoglu, D, Carvalho VM, Ozdaglar A, Tahbaz-Salehi A (2012) The network origins of aggregate fluctuations. Econometrica 80(5):1977–2016.

  • Anderson, PW (1972) More is different. Science 177(4047):393–396.

  • Atalay, E, Hortacsu A, Roberts J, Syverson C (2011) Network structure of production. Proc Natl Acad Sci 108(13):5199–5202.

  • Bhatia, H, Norgard G, Pascucci V, Bremer P-T (2013) The Helmholtz-Hodge decomposition—a survey. IEEE Trans Vis Comput Graph 19(8):1386–1404.

  • Cainelli, G, Montresor S, Vittucci Marzetti G (2012) Production and financial linkages in inter-firm networks: structural variety, risk-sharing and resilience. J Evol Econ 22(4):711–734.

  • Carvalho, VM (2008) Aggregate Fluctuations and the Network Structure of Intersectoral Trade. The University of Chicago, Chicago.

  • Chakraborty, A, Kichikawa Y, Iino T, Iyetomi H, Inoue H, Fujiwara Y, Aoyama H (2018) Hierarchical communities in the walnut structure of the Japanese production network. PloS ONE 13(8):0202739.

  • Contreras, MGA, Fagiolo G (2014) Propagation of economic shocks in input-output networks: A cross-country analysis. Phys Rev E 90(6):062812.

  • De Bacco, C, Larremore DB, Moore C (2018) A physical model for efficient ranking in networks. Sci Adv 4(7):8260.

  • Fujiki, Y, Haruna T (2014) Hodge decomposition of information flow on complex networks. In: Proceedings of the 8th International Conference on Bioinspired Information and Communications Technologies, 103–112. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Gent.

  • Goto, H, Takayasu H, Takayasu M (2017) Estimating risk propagation between interacting firms on inter-firm complex network. PloS ONE 12(10):0185712.

  • Haruna, T, Fujiki Y (2016) Hodge decomposition of information flow on small-world networks. Frontiers Neural Circ 10:77.

  • Holland, JH (2000) Emergence: From Chaos to Order. OUP Oxford, Oxford.

  • Jiang, X, Lim L-H, Yao Y, Ye Y (2011) Statistical ranking and combinatorial Hodge theory. Math Program 127(1):203–244.

  • Johnson, S, Domínguez-García V, Donetti L, Muñoz MA (2014) Trophic coherence determines food-web stability. Proc Natl Acad Sci 111(50):17923–17928.

  • Leontief, W (1986) Input-output Economics. Oxford University Press, Oxford.

  • Letizia, E, Barucca P, Lillo F (2018) Resolution of ranking hierarchies in directed networks. PloS ONE 13(2):0191604.

  • Letizia, E, Lillo F (2018) Corporate payments networks and credit risk rating. Available at SSRN 3075019.

  • Luo, J, Baldwin CY, Whitney DE, Magee CL (2012) The architecture of transaction networks: a comparative analysis of hierarchy in two sectors. Ind Corp Chang 21(6):1307–1335.

  • McNerney, J, Fath BD, Silverberg G (2013) Network structure of inter-industry flows. Phys A Stat Mech Appl 392(24):6427–6441.

  • Rosvall, M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123.

  • Rosvall, M, Bergstrom CT (2011) Multilevel compression of random walks on networks reveals hierarchical organization in large integrated systems. PloS ONE 6(4):18209.

  • Slater, P (1977) The determination of groups of functionally integrated industries in the United States using a 1967 interindustry flow table. Empir Econ 2(1):1–9.

  • Slater, P (1978) The network structure of the United States input-output table. Empir Econ 3(1):49–70.

  • Tatti, N (2015) Hierarchies in directed networks. In: 2015 IEEE International Conference on Data Mining, 991–996. IEEE, New York.

  • Vestal, JE (1995) Planning for Change: Industrial Policy and Japanese Economic Development 1945-1990. Clarendon Press, Oxford.

  • Watanabe, T, Uesugi I, Ono A (2015) The Economics of Interfirm Networks, vol. 4. Springer, New York.

Acknowledgments

This study has been conducted as a part of the project “Large-scale Simulation and Analysis of Economic Network for Macro Prudential Policy” undertaken at the Research Institute of Economy, Trade and Industry (RIETI). This research used computational resources of the K computer provided by the RIKEN Center for Computational Science through the HPCI System Research project (Project ID: hp170242, hp180177).

Funding

This research was also supported by MEXT as Exploratory Challenges on Post-K computer (Studies of Multilevel Spatiotemporal Simulation of Socioeconomic Phenomena) and JSPS KAKENHI Grant Numbers JP15KT0052, JP17KT0034, and JP18K03451.

Author information

Contributions

The paper was written collaboratively. The coauthors read and approved the final manuscript.

Corresponding author

Correspondence to Yuichi Kichikawa.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Kichikawa, Y., Iyetomi, H., Iino, T. et al. Community structure based on circular flow in a large-scale transaction network. Appl Netw Sci 4, 92 (2019). https://doi.org/10.1007/s41109-019-0202-8

  • DOI: https://doi.org/10.1007/s41109-019-0202-8

Keywords