# Dynamic correlation network analysis of financial asset returns with network clustering

- Takashi Isogai

**Received:** 16 February 2017

**Accepted:** 2 May 2017

**Published:** 23 May 2017

## Abstract

In this study, we propose a novel approach to analyzing a dynamic correlation network of highly volatile financial asset returns, using network clustering algorithms to deal with high dimensionality. As an empirical study of correlation dynamics at the market level, we apply the proposed method to the dynamic correlation network of selected Japanese stock returns. Two types of network clustering are employed for dimensionality reduction. Firstly, to overcome the high dimensionality caused by the large number of stocks, several stock groups are generated by hierarchical recursive network clustering of filtered stock returns and used in place of the existing business sector classification. The stock returns are filtered in advance to control for volatility fluctuations that can distort the correlation between stocks. Thus, the correlation network of individual stock returns is transformed into a correlation network of group-based portfolio returns. Secondly, the reduced-size correlation network is extended to a dynamic one by using a model-based correlation estimation method. From the estimation results, a daily time series of adjacency matrices is created as a dynamic correlation network. Then, the dynamic correlation network is summarized into only three representative correlation networks by clustering along the time axis. Intertemporal comparisons of the dynamic correlation network are conducted by examining the differences between the three sub-period networks. Our dynamic correlation network analysis framework is not limited to stock returns; it can be applied to many other volatile financial and non-financial time series.

### Keywords

Financial asset returns; Correlation network; Dynamic correlation; Network clustering; Dimensionality reduction

## Introduction

A large number of financial assets, including stocks and foreign exchange rates, are traded in financial markets. The correlation of individual asset returns (price changes) is key to understanding the financial market structure. A deeper understanding of the market-wide correlation structure helps improve financial portfolio management, leading to more efficient risk management for investors as well as for the authorities responsible for macro-financial stability. There is a substantial body of work on correlation networks that analyzes the complicated interactions and market structure of financial assets. Mantegna (1999) developed a correlation network of US stock returns by calculating the cross-correlation of returns to discover the hierarchical structure of the network. Similar network-based analyses followed, including Tumminello et al. (2010), who studied how to obtain hierarchical networks from a correlation matrix of financial asset returns, including stock prices; a correlation-based clustering procedure was implemented to explore the hierarchical tree structure of the system. Chi K et al. (2010) also studied a large correlation network that included all US stock returns to examine the interdependence structure of returns. From their correlation network analysis, they identified a small number of stocks that have a very strong influence on the returns of other stocks. Onnela et al. (2003) focused on the dynamics of market correlations. Their study of a time-dependent correlation network of US stock return data showed that the topological structure of the network is robust with respect to time, while strong market correlation is identified during crisis periods. Preis et al. (2012) also quantified state-dependent correlations in stock markets to examine whether correlations vary over time rather than remain constant. Their empirical study of major US stocks showed that a higher level of average correlation was observed during periods of market stress. More recently, Kenett et al. (2015) applied partial correlation analysis to stock prices, using a dependency network to uncover dependency and influence relationships between individual stocks. Their empirical study revealed that a stock can be influenced by sectors outside its primary sector classification. They also found that developed markets such as the US, UK, and Japan exhibit a higher degree of market stability.

When analyzing the correlation structure at the market level, the number of financial assets can become a serious technical constraint on any empirical analysis. If the number of individual assets is very large, as in the case of the stock market, the number of asset pairs may become too large to observe the individual relationships between them. Such a dimensionality issue, together with the highly volatile price movements of financial asset returns, should be appropriately controlled when conducting a correlation network analysis of financial asset returns. Our previous research (Isogai 2014, 2016) proposed approaches to overcome such difficulties by applying network theory combined with advanced econometric models. Such a network-based analysis framework is useful for coping with the complex correlation structure between individual asset returns, provided that the interaction of highly volatile asset returns is appropriately considered in network building. Specifically, Isogai (2014) proposed a novel approach to the clustering of a large correlation network by using recursive hierarchical network division. This method finds a grouping of highly correlated asset returns that depends only on the adjacency matrix converted from the correlation matrix of filtered returns. Later work extended the correlation network analysis framework to a dynamic one, in which conditional correlations are estimated to express a dynamically changing network of asset returns (Isogai 2016). This method has proven useful, especially for change point detection in the correlation structure.

In this study, we combine the two aforementioned methods to provide summary information on the possible dynamic changes in the correlation structure of a large number of asset returns. Our proposal comprises two types of dimensionality reductions: the first one provides summary information on the group structure of asset returns, while the second provides summary information on the intertemporal differences in the correlation structure. Such a twofold dimensionality reduction is useful for investors and regulatory authorities to find out more about the dynamic changes of a large and complicated financial market. We then apply the proposed method to a large Japanese stock dataset as a typical empirical case study of correlation network analysis.

The remainder of this paper is organized as follows. “Correlation network of financial asset returns” describes the filtering process used to control for the volatility fluctuations of stock returns as well as the network clustering algorithm for filtered returns used to build a reduced-size correlation network at the market level. “Dynamic changes in correlation networks” describes the dynamic conditional correlation (DCC) model used to estimate the time-dependent correlation matrices of stock returns; then, a dynamic correlation network of stock returns is built according to the estimation results of the DCC model. “Comparative analysis of the dynamic correlation network” describes the dimensionality reduction of a time series of adjacency matrices by using low rank tensor decomposition and a subspace clustering algorithm along the time axis. The dynamic correlation network is categorized into three sub-period networks and a comparative analysis is conducted. “Discussion and conclusion” discusses the further enhancement of our methods and possible extension of our dynamic correlation network analysis to other financial and non-financial time series data.

## Correlation network of financial asset returns

A correlation network is a network whose adjacency matrix is built on the basis of pairwise correlations between variables. The interaction between individual stock returns can be regarded as a correlation network whose adjacency matrix is constructed from the correlation matrix of those returns. The nodes of the correlation network are stocks that have edges weighted by the degree of the pairwise correlation of returns. Following the literature, we focus on the contemporaneous co-movement of returns, since this plays a key role in the risk–return relationship of an asset portfolio. The network is, therefore, an undirected and weighted network.

This study aims to establish an efficient way in which to observe possible changes in a large-scale correlation structure of financial asset returns. In financial markets, the correlation between asset prices can change dynamically in response to the trading activities of market participants. The correlation network therefore needs to be extended from a static one to a dynamic one, in which the adjacency matrix changes dynamically depending on time.

However, establishing such a method is beset by difficulties. The first and most important point when building a correlation network is how to estimate the pairwise correlations (edge weights) precisely. The correlation network is based on an observable measure of correlations; many financial time series tend to exhibit volatile features that make carrying out the statistical procedures used to estimate the correlations difficult. If the estimated correlation is distorted, any analysis of the correlation network would be misleading. We thus apply statistical filtering to the return data when calculating the correlation of our returns.

The second problem is how to handle the high dimensionality of financial data. In this study, we use data on Japanese stock returns listed on the Tokyo Stock Exchange (TSE). More than 1,700 stocks are listed on the First Section of the TSE; the total number of all listed stocks is more than 3,000 in Japan. The first dimensionality issue arising from the number of assets is that it is hard to handle such a large correlation matrix.

Another dimensionality problem is related to the observation frequency of price data, which determines the temporal resolution of dynamic change detection. More frequent price data enable a more precise analysis of dynamic changes; however, this increases the dimensionality alongside the time axis. To deal with these two high dimensionality issues, dimensionality reduction by clustering the correlation network in terms of size and time is implemented, respectively. We begin by describing the filtering method used to calculate the correlations of volatile financial asset returns in the following section.

### Filtering volatile asset returns

Many financial asset returns have fat-tailed distributions, in that large-scale volatility shocks are observed frequently, as discussed in many previous studies including Mandelbrot (1963) and Cont (2007). A market-wide shock such as a financial crisis, as well as smaller shocks that affect part of the market, can distort the correlation of asset returns calculated from sample data, as discussed in Isogai (2014, 2016). Such synchronized volatility shocks across multiple asset returns can cause an overestimation of the correlation. It is therefore crucial to control for the volatility fluctuations of asset returns when calculating the correlation matrix, which is later converted into an adjacency matrix.

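The asset returns are described by a GARCH-type model of the form

\[
\mathbf{r}_{t} = \boldsymbol{\mu}_{t} + \boldsymbol{\varepsilon}_{t}, \qquad \boldsymbol{\varepsilon}_{t} = \mathbf{H}_{t}^{1/2}\,\mathbf{z}_{t}, \tag{1}
\]

where \(\mathbf{r}_{t}\) is a vector of the asset returns, \(\boldsymbol{\mu}_{t}\) is a vector of the conditional means, \(\boldsymbol{\varepsilon}_{t}\) is a vector of the unpredictable residuals, \(\mathbf{H}_{t}\) is an \(N \times N\) (\(N\): the number of returns) symmetric positive-definite matrix, which is the conditional variance–covariance matrix of \(\mathbf{r}_{t}\), and \(\mathbf{z}_{t}\) is a vector of the i.i.d. standardized residuals, the mean and variance of which are \(\mathbf{0}\) and \(\mathbf{I}_{N}\) (an identity matrix of order \(N\)), respectively.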

In the conditional mean equation (2), the coefficient matrices, including \(\mathbf{B}_{j}\), are diagonal matrices. In the volatility equation (3), ⊙ denotes the Hadamard operator (the entry-wise product), \(\mathbf{h}_{t}\) is the diagonalized matrix of \(\mathbf{H}_{t}\), and both \(\mathbf{S}_{i}\) and \(\mathbf{T}_{j}\) are diagonal matrices. Volatility is modeled without interaction between the assets to simplify the model.

What is important in (1) is that the volatility fluctuation can be separated from the returns \(\mathbf{r}_{t}\) as the elements of \(\mathbf{h}_{t}\), such as \(\left(\sqrt{h_{11\cdot t}},\ \ldots,\ \sqrt{h_{NN\cdot t}}\right)\). Thus, we can safely estimate the linear correlation of the returns \(\mathbf{r}_{t}\) as the correlation matrix \(\mathbf{R}\) by calculating the sample pairwise correlation of the filtered residuals \(\mathbf{z}_{t}\), since \(\mathbf{h}_{t}\) as well as \(\boldsymbol{\mu}_{t}\) have no effect on the Pearson linear correlation coefficient of \(\mathbf{r}_{t}\), defined as \(\rho_{r_{i},r_{j}} = \frac{\text{Cov}\left(r_{i},r_{j}\right)}{\sqrt{\text{Var}\left(r_{i}\right)\cdot \text{Var}\left(r_{j}\right)}}\). Note that, at this stage, we use a static correlation matrix that is assumed to be constant during the observation period.

When fitting the model to our dataset, we assume that the distribution of the individual residual \(z_{i}\) is a normal, Student *t*, or skew *t* distribution, allowing for some fat-tailedness even after the volatility filtering. The model parameters are estimated as maximum likelihood estimators (MLEs) for each asset independently, since the model does not include any interaction between the assets, as shown in (2) and (3). This estimation process works efficiently, especially with a large number of assets. We have now established a way to overcome the distortion of the linear correlation caused by volatility fluctuations when estimating the correlation of volatile asset returns. For more technical details on the estimation procedures, see Isogai (2014).
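The filtering step can be illustrated with a minimal sketch. It assumes a constant mean and a GARCH(1,1) variance recursion with parameters that have already been estimated, which is a simplification of the ARMA–GARCH model in (2) and (3); the function names and all numerical values are hypothetical and serve only as an illustration.

```python
import math

def garch11_filter(returns, mu, omega, alpha, beta):
    """Return standardized residuals z_t = (r_t - mu) / sqrt(h_t).

    Assumption: constant mean and GARCH(1,1) variance with given parameters.
    """
    eps = [r - mu for r in returns]
    # Start the recursion from the sample variance of the residuals.
    h = sum(e * e for e in eps) / len(eps)
    z = []
    for e in eps:
        z.append(e / math.sqrt(h))
        # Conditional variance for the next period.
        h = omega + alpha * e * e + beta * h
    return z

def pearson(x, y):
    """Sample Pearson correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Toy daily returns of two assets (illustrative numbers only).
r1 = [0.010, -0.020, 0.015, -0.030, 0.025, 0.005]
r2 = [0.008, -0.015, 0.020, -0.040, 0.030, -0.002]
z1 = garch11_filter(r1, mu=0.0, omega=1e-5, alpha=0.1, beta=0.85)
z2 = garch11_filter(r2, mu=0.0, omega=1e-5, alpha=0.1, beta=0.85)
rho_filtered = pearson(z1, z2)  # one entry of the static matrix R
```

Because each asset is filtered on its own, the loop over assets is trivially parallel, which is in line with the remark that the estimation scales well with the number of assets.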

In the adjacency conversion (4), \(A_{ij}\) is the \((i,j)\)th entry of the weighted adjacency matrix \(\mathbf{A}\); \(\mathit{cor}(x_{i}, x_{j})\) is the \((i,j)\)th entry of the correlation matrix \(\mathbf{R}\); and \(\mathit{cor}_{\mathit{thres}(i,j)}\) is the cutoff point of the edge weight. All the diagonal elements of \(\mathbf{A}\) are 0, since no self-edge is considered in the correlation network. An undirected weighted network with only positively weighted edges is built by (4). The cutoff point \(\mathit{cor}_{\mathit{thres}(i,j)}\) is set at the higher of the two 20th percentiles of the empirical edge weight distributions of stocks \(i\) and \(j\). In other words, \(A_{ij}\) is set at 0 when the correlation \(\mathit{cor}(x_{i}, x_{j})\) is lower than the 20th percentile of \(\mathit{cor}(x_{i}, \cdot)\) or that of \(\mathit{cor}(\cdot, x_{j})\). The threshold level is sufficiently high to exclude correlation values that are not statistically meaningful from our dataset. We confirmed that the clustering result is not much affected by thresholding at lower levels.
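The thresholding rule described above can be sketched as follows. This is only an illustration under stated assumptions: a retained edge keeps the correlation itself as its weight, non-positive correlations are dropped, and the percentile is computed by linear interpolation over each stock's correlations with all other stocks. The helper names `percentile` and `adjacency_from_correlation` are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a list, with q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def adjacency_from_correlation(R, q=20.0):
    """Convert a correlation matrix R (list of lists) into a weighted adjacency matrix."""
    n = len(R)
    # q-th percentile of each stock's correlations with all other stocks.
    cut = [percentile([R[i][k] for k in range(n) if k != i], q) for i in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-edges: the diagonal stays 0
            thres = max(cut[i], cut[j])  # the higher of the two cutoffs
            # Keep only positive weights at or above the pairwise cutoff.
            if R[i][j] > 0.0 and R[i][j] >= thres:
                A[i][j] = R[i][j]
    return A
```

Taking the higher of the two cutoffs keeps the result symmetric, so the network remains undirected.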

### Dataset for the empirical analysis

As mentioned in “Introduction”, we focus on Japanese stock return data as an empirical case study. The dataset covers stocks listed on the First Section of the TSE: 1,324 stocks in 33 business sectors. Note that stocks with low liquidity are excluded from the dataset. The observation period runs from January 2008 to May 2016 (2,058 trading days). The study period includes two major crises: the Lehman collapse (2008) and the Great East Japan Earthquake (2011). Stock returns are calculated as log returns from daily closing prices.

We fit the GARCH model, expressed by (1), (2), and (3), to those individual stock return data to calculate correlation matrix R from the filtered returns. Then, the static correlation network of individual stock returns is created by the adjacency conversion, as shown in (4).

### Grouping by recursive network division

The network built in the last section is too large to carry out correlation analysis; more than 1,300 nodes are densely connected with many other nodes. Hence, we need to conduct the first-round dimensionality reduction of the correlation network as mentioned earlier. The whole stock market is regarded as a market portfolio in which every stock is included. This portfolio can be separated into several sub-portfolios; then, the correlation structure of the whole market is approximated by the correlations among sub-portfolios. What is important here is how to organize a grouping of stock returns. The most frequently used approach for grouping stock returns is to adopt a predefined industrial sector classification. The business industry classification is adopted in the TSE; however, the sector classification is not necessarily consistent with the observed correlation structure of stock returns. Furthermore, such a sector classification tends to be significantly unbalanced in size, as discussed by Isogai (2014).

The modularity \(Q\), proposed by Girvan and Newman (2002) and Newman (2006), of \(\mathbf{A}\) is defined as

\[
Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{w_{i}w_{j}}{2m}\right]\delta\left(C_{i}, C_{j}\right),
\]

where \(m = \frac{1}{2}\sum_{i,j}A_{ij}\) is the total edge weight of the network; \(w_{i}\) and \(w_{j}\) are the sums of the edge weights of stocks \(i\) and \(j\); \(\delta(\cdot)\) takes 1 if both stocks are in the same class (\(C_{i} = C_{j}\)) and 0 otherwise; and \(Q\) takes a value between -1 and 1, with positive values indicating the possible presence of some group structure. We employ the simplest definition from among the many variants of modularity. Once the first level of the division is completed by the modularity optimization, the same algorithm is applied to the generated groups of stock returns to make further divisions recursively. As for the stopping rule of the recursive division, the standard deviations of the edge weights in individual groups are monitored to determine the number of groups; the group division process is controlled so as to avoid any significant heterogeneity in group size that could cause a heavy concentration of stocks in specific groups. The recursive network division algorithm is explained in more detail in Isogai (2014).
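As a small illustration of how the modularity of a given grouping is evaluated, the sketch below computes the standard weighted Newman modularity from a weighted adjacency matrix and a vector of class labels. The function name `modularity` is hypothetical, and the sketch only scores a partition; it does not perform the optimization or the recursive division itself.

```python
def modularity(A, labels):
    """Weighted Newman modularity of the partition given by labels."""
    n = len(A)
    w = [sum(row) for row in A]   # sum of edge weights of each node
    two_m = sum(w)                # total weight, each edge counted twice
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:  # delta(C_i, C_j) = 1
                q += A[i][j] - w[i] * w[j] / two_m
    return q / two_m

# Two disconnected pairs of nodes: the natural split scores 0.5,
# while putting every node into one class scores 0.
A_toy = [[0, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 0]]
q_split = modularity(A_toy, [0, 0, 1, 1])
q_single = modularity(A_toy, [0, 0, 0, 0])
```

A recursive division would call such a scoring function inside an optimizer at each level and then repeat the procedure within every generated group.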

Finally, the hierarchical group structure is identified as having 14 unit groups (marked as squares with labels P1, P2, …, and P14), as shown in Fig. 1. The circles in Fig. 1 indicate the whole market and intermediate groups created in the layers in-between. Two major categories, termed Cyclical and Defensive, are created at the first division of the entire network. These two categories are used as the broadest categorization of the Japanese stock market for the following comparative analysis. The groups labeled P1, P2, …, and P14 are the unit group portfolios used to approximate the correlation structure of the whole stock market. Each group includes stocks of different sizes, which are categorized as either Cyclical or Defensive.

Clustering result with group features

| Group id | Number of stocks | Share (%) | Beta (TOPIX) | Beta (exchange rate) | Company size index | Overseas sales ratio |
|---|---|---|---|---|---|---|
| Cyclical | 728 | 55.0 | 1.01 | 1.08 | 62.9 | 50.3 |
| P1 | 141 | 10.6 | 0.93 | 0.93 | 35.3 | 47.1 |
| P2 | 181 | 13.7 | 0.87 | 0.92 | 35.4 | 36.6 |
| P3 | 62 | 4.7 | 1.01 | 1.16 | 63.5 | 34.6 |
| P4 | 132 | 10.0 | 0.90 | 0.95 | 43.0 | 42.9 |
| P5 | 54 | 4.1 | 1.08 | 1.19 | 72.9 | 61.6 |
| P6 | 62 | 4.7 | 1.12 | 1.15 | 81.3 | 66.5 |
| P7 | 52 | 3.9 | 1.05 | 1.10 | 79.1 | 50.8 |
| P8 | 44 | 3.3 | 1.11 | 1.21 | 92.6 | 61.9 |
| Defensive | 596 | 45.0 | 0.74 | 0.75 | 59.1 | 29.1 |
| P9 | 164 | 12.4 | 0.90 | 0.99 | 64.6 | 38.5 |
| P10 | 75 | 5.7 | 0.61 | 0.60 | 27.3 | 30.0 |
| P11 | 92 | 6.9 | 0.70 | 0.72 | 40.8 | 27.1 |
| P12 | 118 | 8.9 | 0.83 | 0.89 | 81.5 | 31.5 |
| P13 | 62 | 4.7 | 0.73 | 0.71 | 75.6 | 25.5 |
| P14 | 85 | 6.4 | 0.69 | 0.62 | 64.8 | 21.9 |

Business sector breakdown of identified groups (top three sectors)

| Group id | Sector (a) | Share (%) | Sector (b) | Share (%) | Sector (c) | Share (%) | (a+b+c) (%) |
|---|---|---|---|---|---|---|---|
| Cyclical | | | | | | | |
| P1 | Electric appliances | 17 | Services | 12 | Machinery | 10 | 39 |
| P2 | Construction | 18 | Machinery | 13 | Wholesale trade | 10 | 41 |
| P3 | Securities | 21 | Other financial business | 13 | Real estate | 11 | 45 |
| P4 | Electric appliances | 20 | Chemicals | 18 | Wholesale trade | 16 | 54 |
| P5 | Transportation equipment | 39 | Electric appliances | 20 | Machinery | 11 | 70 |
| P6 | Electric appliances | 47 | Machinery | 23 | Chemicals | 10 | 80 |
| P7 | Chemicals | 19 | Iron and steel | 17 | Nonferrous metals | 13 | 49 |
| P8 | Electric appliances | 30 | Transportation equipment | 18 | Chemicals | 9 | 57 |
| Defensive | | | | | | | |
| P9 | Banks | 26 | Construction | 11 | Chemicals | 9 | 46 |
| P10 | Retail trade | 19 | Information and communication | 17 | Wholesale trade | 17 | 53 |
| P11 | Retail trade | 17 | Wholesale trade | 16 | Foods | 8 | 41 |
| P12 | Information and communication | 17 | Foods | 15 | Retail trade | 14 | 46 |
| P13 | Electric power and gas | 26 | Pharmaceutical | 16 | Foods | 15 | 57 |
| P14 | Retail trade | 53 | Services | 19 | Information and communication | 8 | 80 |

Next, the sub-portfolios based on the classification of those Cyclical and Defensive groups are created and averaged price index returns are calculated for each group. More specifically, the stock price of a sub-portfolio is first indexed as 100 at the beginning of the observation period; then, the mean value of the sub-portfolio is calculated with an equal weight placed on each stock. The portfolio return is defined as the log return of the mean value of the portfolio as in the case of individual stock returns. The market portfolio including more than 1,300 stocks is now summarized into only 14 sub-portfolios by using the group definition provided by network clustering.
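The construction of the group-based portfolio returns can be sketched as follows: each stock price is indexed to 100 at the start of the period, the indexed prices are averaged with equal weights, and the log return of the resulting portfolio level is taken. The function name `portfolio_log_returns` and the toy prices are hypothetical.

```python
import math

def portfolio_log_returns(prices):
    """Equal-weighted sub-portfolio log returns.

    prices: list of per-stock price series, all of the same length.
    """
    # Index every stock to 100 at the beginning of the observation period.
    indexed = [[100.0 * p / series[0] for p in series] for series in prices]
    n_days = len(indexed[0])
    # Equal-weighted mean of the indexed prices on each day.
    level = [sum(s[t] for s in indexed) / len(indexed) for t in range(n_days)]
    # Log return of the portfolio level.
    return [math.log(level[t] / level[t - 1]) for t in range(1, n_days)]

# Two toy stocks over three days (illustrative numbers only).
toy_prices = [[50.0, 55.0, 55.0],
              [200.0, 200.0, 220.0]]
toy_returns = portfolio_log_returns(toy_prices)
```

Indexing before averaging prevents stocks with high nominal prices from dominating the portfolio level.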

Thus, the first dimensionality reduction in terms of the number of stocks is completed. Stocks connected with thicker edges (higher correlations) are grouped by network clustering with the modularity optimization; then, the log returns of those group-based sub-portfolios are calculated. Now, we can proceed onto the next stage. In this stage, we extend the static correlation model to a dynamic one followed by the second-round dimensionality reduction of the correlation structure between the group portfolios along the time axis in order to carry out the intertemporal comparative analysis.

## Dynamic changes in correlation networks

In the previous section, we assumed that there exists a static correlation network in which the edges between nodes (stocks) do not change during the observation period. Such an assumption is introduced partly because of technical reasons regarding the estimation of the correlation matrix of stock returns. It is rather an empirical issue whether the correlation structure is stable or dynamically changing over time. The linkages between the nodes may change because of changes in the correlation of stock returns, although it is not technically easy to detect such dynamic changes from observed price data. Normally, a sample pairwise correlation of returns is calculated based on the observed filtered or unfiltered return data; therefore, only one sample correlation, the static one, is available for one data period. However, it would be meaningful to know how the correlation network changes over time from the viewpoint of investment decision making as well as portfolio risk control if we could establish a method of estimating dynamic correlations.

In this context, Isogai (2016) proposed a novel approach to build a dynamic correlation network of returns. We adopt a model-based correlation instead of using a filtered sample correlation to calculate the correlations and adjacency matrices for network clustering. This model-based correlation matrix can be estimated for each trading day during the observation period and a dynamic correlation network built accordingly. A series of adjacency matrices are calculated by adjacency conversion from the estimated correlation matrices. The dimensionality issue due to the large number of adjacency matrices needs to be addressed to allow an intertemporal comparison of the network. We overcome this issue by using the second-round dimensionality reduction of the adjacency matrices.

### Model-based dynamic correlation

The most frequently used approach to measuring the dynamic correlations of asset returns is to calculate the correlations of returns over moving windows. A series of correlation matrices can be built by using a rolling observation time window during the study period. This method, however, has some drawbacks, as discussed in Isogai (2016). Hence, a statistical model-based correlation can be used to describe the dynamic correlation network. The DCC model originally proposed by Engle (2002) was developed in the context of multivariate volatility models in financial econometrics. When building the static correlation network, the static unconditional correlation matrix \(\mathbf{R}\) of the return \(\mathbf{r}_{t}\) is calculated as the correlation matrix of the filtered return \(\mathbf{z}_{t}\) defined in (1). When applying the DCC model, a time-dependent correlation matrix \(\mathbf{R}_{t}\) of the return \(\mathbf{r}_{t}\) is estimated instead of \(\mathbf{R}\). This means that the correlation of returns can change dynamically during the observation period; therefore, dynamic changes in the structure of a correlation network are represented by a dynamic conditional adjacency matrix \(\mathbf{A}_{t}\). There are as many adjacency matrices as trading days, each converted from the corresponding \(\mathbf{R}_{t}\) during the period.

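An \(N \times N\) positive-definite dynamic correlation matrix \(\mathbf{R}_{t}\) is introduced to model the dependency structure of \(\mathbf{r}_{t}\). The time dependency of \(\mathbf{R}_{t}\) is described by using a proxy variable \(\mathbf{Q}_{t}\), which is introduced to ensure the positive-definiteness of \(\mathbf{R}_{t}\), as

\[
\mathbf{Q}_{t} = \Bigl(1 - \sum_{i=1}^{m} a_{i} - \sum_{j=1}^{n} b_{j}\Bigr)\boldsymbol{\bar{Q}}_{t} + \sum_{i=1}^{m} a_{i}\,\mathbf{z}_{t-i}\mathbf{z}_{t-i}^{\top} + \sum_{j=1}^{n} b_{j}\,\mathbf{Q}_{t-j}, \tag{6}
\]

where \(a_{i}\) and \(b_{j}\) are non-negative scalars and \(\boldsymbol{\bar{Q}}_{t}\) is the unconditional mean of \(\mathbf{Q}_{t}\). The DCC model with time lags in the conditional correlation is denoted as DCC (\(m, n\)). The parameter \(a_{i}\) shows the sensitivity of \(\mathbf{Q}_{t}\) to previous shocks, while the parameter \(b_{j}\) represents the persistence of the correlation in previous periods. The correlation matrix \(\mathbf{R}_{t}\) is calculated by rescaling \(\mathbf{Q}_{t}\) as

\[
\mathbf{R}_{t} = \operatorname{diag}\left(\mathbf{Q}_{t}\right)^{-1/2}\,\mathbf{Q}_{t}\,\operatorname{diag}\left(\mathbf{Q}_{t}\right)^{-1/2}. \tag{7}
\]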

For more details on the DCC–GARCH model, see Engle and Sheppard (2001) and Engle (2002).
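To make the recursion concrete, the following sketch iterates a DCC (\(m, n\)) model in its standard form (Engle 2002): the proxy matrix is a weighted sum of its unconditional mean, lagged outer products of the filtered residuals, and its own lags, and each proxy matrix is then rescaled to a correlation matrix. The function name `dcc_correlations` is hypothetical, and replacing pre-sample terms with the unconditional mean is an initialization assumption, not something taken from the text.

```python
import math

def dcc_correlations(z, a, b, q_bar):
    """Return the list of correlation matrices R_t of a DCC(m, n) recursion.

    z: list of length-N residual vectors; a, b: lists of DCC coefficients;
    q_bar: N x N unconditional mean of the proxy matrix Q_t.
    """
    n = len(q_bar)
    scale = 1.0 - sum(a) - sum(b)  # weight on the unconditional mean
    q_hist, r_list = [], []
    for t in range(len(z)):
        q = [[scale * q_bar[i][j] for j in range(n)] for i in range(n)]
        for lag, a_i in enumerate(a, start=1):
            for i in range(n):
                for j in range(n):
                    # Pre-sample shocks are replaced by q_bar (assumption).
                    shock = z[t - lag][i] * z[t - lag][j] if t - lag >= 0 else q_bar[i][j]
                    q[i][j] += a_i * shock
        for lag, b_j in enumerate(b, start=1):
            past = q_hist[t - lag] if t - lag >= 0 else q_bar
            for i in range(n):
                for j in range(n):
                    q[i][j] += b_j * past[i][j]
        q_hist.append(q)
        # Rescale Q_t so that its diagonal becomes one.
        d = [math.sqrt(q[i][i]) for i in range(n)]
        r_list.append([[q[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)])
    return r_list
```

In practice the coefficients would come from the maximum likelihood estimation rather than being supplied by hand, and the weight on the unconditional mean stays positive only when the coefficients sum to less than one.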

### Model fitting and adjacency conversion

The parameters of the DCC model are estimated by maximum likelihood with the Japanese stock return data. We employ a two-stage fitting of the DCC model: the first stage of the GARCH model fitting is followed by the second stage of the DCC parameter estimation. The GARCH model is fitted to the sub-portfolio returns just as it is fitted to individual stock returns in “Correlation network of financial asset returns”. Once the first stage of the model estimation is completed, the filtered residuals \(\mathbf{z}_{t}\) of the 14 sub-portfolios are calculated by using the estimated parameters of the GARCH model. In the second stage, the DCC model parameters used to calculate the dynamic correlation \(\mathbf{R}_{t}\) of the sub-portfolio returns are estimated.

For the likelihood-based estimation, the joint distribution of the returns \(\mathbf{r}_{t}\) should be explicitly defined in advance. The distribution of \(\mathbf{z}_{t}\) is again assumed to be one of the normal, Student *t*, or skew *t* distributions. The joint density function \(f(\mathbf{r}_{t})\) is then defined by using a copula density function that determines the dependency between the sub-portfolio returns. The copula function plays a key role in addressing the dependency between the heterogeneous distributions of \(\mathbf{z}_{t}\) as mentioned above. In general, the joint density function \(f(x)\) of a vector of variables \(\mathbf{X} = (X_{1}, \ldots, X_{N})\) can be described using a copula function as follows:

\[
f(x) = \frac{\partial^{N} F(x)}{\partial x_{1} \cdots \partial x_{N}} = c\bigl(F_{1}(x_{1}), \ldots, F_{N}(x_{N})\bigr)\prod_{i=1}^{N} f_{i}(x_{i}),
\]

where \(f_{i}(x_{i})\) is the marginal density of \(x_{i}\) with distribution function \(F_{i}\), \(c(\cdot)\) is the density function of the copula, and \(F(\cdot)\) is the joint distribution function of \(\mathbf{X}\). We choose the Student *t*-copula, which can handle tail dependency and takes two parameters: the conditional correlation \(\mathbf{R}_{t}\) and a constant shape parameter. For technical details on the copula and the Student *t*-copula, see Sklar (1959) and Demarta and McNeil (2005). Thus, the joint density function of \(\mathbf{r}_{t}\) is defined in (10) as a combination of the copula density and the density of the i.i.d. residual \(\mathbf{z}_{t}\).

In (10), \(u_{i\cdot t} = F_{i}(r_{i\cdot t} \mid \mu_{i\cdot t}, h_{i\cdot t}, \theta_{i})\); \(\theta_{i}\) is a parameter set including the ARMA–GARCH parameters in (2) and (3) and the distributional parameters of \(z_{i}\); \(c^{S_{t}}(\cdot)\) is the Student *t*-copula density function; and \(\eta\) is the shape parameter of the Student *t*-copula. The conditional correlation \(\mathbf{R}_{t}\) is defined as one parameter of the copula function, the time-dependent structure of which is described in (6) and (7) in the DCC model setting. The estimation of \(\mathbf{R}_{t}\) therefore reduces to the estimation of the non-negative scalars (a, b) defined in (6).

We need to determine the distribution of \(\mathbf{z}_{t}\) as well as the DCC order (\(m, n\)): the model selection is made by comparing a goodness-of-fit measure, namely the Akaike information criterion (AIC), across multiple combinations of the model settings. The log-likelihood function \(LL(\boldsymbol{\theta} \mid \mathbf{r}_{t})\) built by using (10) comprises two parts: the copula part with the DCC parameters (a, b) and the marginal distribution part \(f_{i\cdot t}(z_{i\cdot t} \mid \theta_{i})\). The two parts of the log-likelihood function can be maximized independently: first, the individual distributional parameter set \(\theta_{i}\) is estimated, followed by the DCC parameters (a, b). More technical details about the model fitting procedures are described in Isogai (2016), Patton (2006), and Joe (2005).

The selected DCC order (\(m, n\)) is (1, 2); \(a_{1}\), the sensitivity of the correlation to previous shocks, takes a small value. The larger value of \(b_{1} + b_{2}\) means that the dynamic correlation matrix \(\mathbf{R}_{t}\) is more dependent on its past values than on previous shocks, since the parameter \(b_{j}\) represents the degree of persistence of the correlation. The model parameter restriction shown in (8) is confirmed to be satisfied. The shape parameter of the Student *t*-copula is not so low, meaning that the degree of tail dependency seems to be limited after volatility filtering. The other details of the estimation results, including the ARMA–GARCH model of individual returns, are omitted because of space limitations.

DCC parameter estimation results

| | m, n | a1 | b1 | b2 | [b1+b2] | Shape (η) |
|---|---|---|---|---|---|---|
| Estimate | 1, 2 | 0.0259 | 0.5549 | 0.3855 | [0.9404] | 13.0 |
| (*p*-value) | | (0.0000) | (0.0000) | (0.0000) | | (0.0000) |

The selected DCC order (\(m, n\)) is (1, 2) for all sub-portfolios in Cyclical, while it is (1, 2) or (1, 1) in Defensive. The combination of lower \(a_{1}\) and higher \(b_{1} + b_{2}\) values appears to be common to every portfolio, while the relative share of the two types of parameters varies over the sub-portfolios. Again, the shape parameter of the Student *t*-copula is higher in every sub-portfolio. This result means that the dynamic correlation properties are similar, although small differences exist between the sub-portfolios. Further, we can extend our dynamic correlation network analysis to those sub-portfolios when required.

DCC parameter estimation results by sub-portfolio

| | | m, n | a1 | b1 | b2 | [b1+b2] | Shape |
|---|---|---|---|---|---|---|---|
| Cyclical | P1 | 1, 2 | 0.0080 | 0.5537 | 0.3523 | [0.9060] | 30.5 |
| | P2 | 1, 2 | 0.0074 | 0.5567 | 0.3792 | [0.9358] | 19.7 |
| | P3 | 1, 2 | 0.0093 | 0.3275 | 0.3897 | [0.7172] | 29.9 |
| | P4 | 1, 2 | 0.0079 | 0.5476 | 0.3219 | [0.8695] | 31.6 |
| | P5 | 1, 2 | 0.0064 | 0.5787 | 0.3240 | [0.9027] | 27.2 |
| | P6 | 1, 2 | 0.0086 | 0.5820 | 0.3219 | [0.9038] | 22.3 |
| | P7 | 1, 2 | 0.0079 | 0.5432 | 0.3713 | [0.9145] | 25.6 |
| | P8 | 1, 2 | 0.0069 | 0.5542 | 0.3925 | [0.9467] | 20.9 |
| Defensive | P9 | 1, 2 | 0.0078 | 0.4651 | 0.3815 | [0.8467] | 27.9 |
| | P10 | 1, 2 | 0.0103 | 0.2498 | 0.3980 | [0.6478] | 30.9 |
| | P11 | 1, 2 | 0.0070 | 0.4963 | 0.3890 | [0.8853] | 30.9 |
| | P12 | 1, 1 | 0.0048 | 0.9400 | - | [0.9400] | 22.9 |
| | P13 | 1, 2 | 0.0094 | 0.2627 | 0.5084 | [0.7711] | 27.1 |
| | P14 | 1, 1 | 0.0042 | 0.8945 | - | [0.8945] | 38.5 |
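To make the role of the estimated parameters concrete, the following is a minimal sketch of how conditional correlation matrices could be generated from a DCC(*m*, *n*) recursion. It assumes the textbook form in Engle (2002) and Engle and Sheppard (2001), which is not necessarily the exact specification in (6); the copula layer and the ARMA–GARCH filtering are left out. The function name `dcc_correlations`, the pre-sample initialisation, and the synthetic input are illustrative assumptions.

```python
import numpy as np

def dcc_correlations(z, a, b):
    """Conditional correlation matrices R_t from a textbook DCC(m, n) recursion.

    z : (T, N) array of standardized residuals (assumed already filtered).
    a : list of m shock coefficients, e.g. [a1].
    b : list of n persistence coefficients, e.g. [b1, b2].
    """
    z = np.asarray(z, dtype=float)
    n_obs, n_assets = z.shape
    # Usual DCC condition: non-negative coefficients whose sum stays below one.
    if min(a) < 0 or min(b) < 0 or sum(a) + sum(b) >= 1.0:
        raise ValueError("need non-negative coefficients with sum(a) + sum(b) < 1")

    q_bar = z.T @ z / n_obs                      # unconditional second moment
    intercept = (1.0 - sum(a) - sum(b)) * q_bar
    q = np.empty((n_obs, n_assets, n_assets))
    r = np.empty_like(q)

    for t in range(n_obs):
        q_t = intercept.copy()
        for i, a_i in enumerate(a, start=1):
            # Pre-sample terms are replaced by q_bar (an initialisation choice).
            shock = np.outer(z[t - i], z[t - i]) if t - i >= 0 else q_bar
            q_t += a_i * shock
        for j, b_j in enumerate(b, start=1):
            q_t += b_j * (q[t - j] if t - j >= 0 else q_bar)
        q[t] = q_t
        scale = 1.0 / np.sqrt(np.diag(q_t))
        r[t] = q_t * np.outer(scale, scale)      # rescale Q_t to unit diagonal
    return r

# Synthetic stand-in for filtered returns of 14 sub-portfolios.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 14))
r = dcc_correlations(z, a=[0.0259], b=[0.5549, 0.3855])
print(r.shape)  # (500, 14, 14)
```

With *a*_{1} small and *b*_{1}+*b*_{2} close to one, as in the tables above, each new shock moves the correlation only slightly and most of the previous level is carried forward.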

_{t} can be easily calculated from (6). Then, the estimated model-based conditional correlation matrix R_{t} is converted into the conditional adjacency matrix of the dynamic correlation network. Here, we use the same unsigned nondecreasing adjacency conversion as the one in (4) used when building the static correlation network for clustering stock returns in “Correlation network of financial asset returns”. The adjacency conversion is extended to a time-dependent conditional one as follows:

where A_{ij,t} is the (*i, j*)th entry of the conditional weighted adjacency matrix A_{t} and *cor*(*x*_{i}, *x*_{j})_{t} is the (*i, j*)th entry of the dynamic correlation matrix R_{t}. The diagonal elements of A_{t} are 0. The threshold value \({cor}_{{thres}(i, j)_{t}}\) is determined in the same way as in (4) at every point in time. Thus, the dynamic correlation network is created with the adjacency matrices A_{t} available for each trading day. One technical issue is that thresholding of the adjacency matrix entries can affect the result of our intertemporal analysis described below. The thresholding method is time-dependent; therefore, the threshold value of the same matrix entry can change dynamically. This may cause discontinuous changes in the adjacency entries, especially when the threshold level is high. These aspects of our dynamic thresholding method make it difficult to forecast its effect on the intertemporal analysis. We confirmed that the analytical results are stable under a few different thresholding settings; however, this point remains an important caveat of this study.
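The conversion described above can be sketched as follows. This is a minimal illustration that assumes a hard threshold on the absolute correlation (one possible unsigned nondecreasing conversion); the rule that determines the threshold in (4) is not reproduced here, so `threshold` is a hypothetical input supplied by the caller, and the function name is illustrative.

```python
import numpy as np

def conditional_adjacency(r_t, threshold):
    """Unsigned, thresholded adjacency matrices from conditional correlations.

    r_t       : (T, N, N) array of conditional correlation matrices.
    threshold : scalar or array broadcastable to r_t; a hypothetical input,
                since the rule that sets it in (4) is not reproduced here.
    """
    strength = np.abs(np.asarray(r_t, dtype=float))       # drop the sign
    a_t = np.where(strength >= threshold, strength, 0.0)  # cut weak links
    idx = np.arange(a_t.shape[-1])
    a_t[..., idx, idx] = 0.0                              # no self-loops
    return a_t

# One trading day with three nodes.
r_t = np.array([[[1.0, 0.6, -0.2],
                 [0.6, 1.0, 0.1],
                 [-0.2, 0.1, 1.0]]])
a_t = conditional_adjacency(r_t, threshold=0.15)
print(a_t[0])
```

Because the threshold may differ from day to day, an entry that sits near the cut-off can switch between zero and a positive weight on consecutive days, which is the discontinuity noted in the text.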

## Comparative analysis of the dynamic correlation network

In the previous section, a dynamic correlation network of stock returns was successfully built and conditional adjacency matrices A_{t} were identified for every trading day. We discuss how to implement the second-round dimensionality reduction of A_{t} in this section. The dynamic correlation network represents the time-varying pairwise correlations between the index returns of 14 sub-portfolios. The nodes of the network are those sub-portfolios generated by the network clustering of the overall static correlation network that includes every stock as a node. The index return of each sub-portfolio can be regarded as a factor that jointly determines the whole market movement. Hence, the relationships (edges) between factors (nodes) describe the time-varying relationships between individual stock returns that belong to different sub-portfolios in a reduced dimension: from 1,324 stocks to 14 return indices. The dynamic correlation network carries summary information of the correlation structure of stock returns in the form of conditional adjacency matrix A_{t}.

In addition to the above-mentioned first-round dimensionality reduction of the number of nodes, we need to reduce the number of adjacency matrices, since it is difficult to compare the adjacency matrices of 2,058 trading days directly. In this context, the second-round dimensionality reduction is introduced by clustering the conditional adjacency matrices. Specifically, we apply subspace clustering of matrices based on a low-rank tensor approximation method. Once the adjacency matrices are sorted into a small number of groups, we build reduced-size sub-period adjacency matrices to summarize information on the inter-period changes of the dynamic network.

### Clustering the dynamic adjacency matrices

When clustering adjacency matrices, each adjacency matrix first needs to be converted into a feature vector before any clustering algorithm can be applied. Some approximation of the original features is often adopted for data compression as well as noise reduction. For this purpose, tensor decomposition is useful in combination with a clustering algorithm such as *k*-means. A tensor is represented as a multidimensional array relative to a choice of the basis of the particular space on which it is defined (for details, see Kroonenberg (2008)). Intuitively, a tensor is a higher-order generalization of vectors and matrices. The time series of conditional adjacency matrices can thus be regarded as a tensor of order three; a single adjacency matrix can be similarly regarded as an order two tensor. Principal component analysis (PCA) is often combined with a clustering algorithm when the target data are arranged in a vector form. Eigenvalue decomposition or singular value decomposition (SVD) is then used to decompose a stacked data matrix into several factors. The idea of such decomposition can be extended to tensor-based factor decomposition.

Suppose we have an order three tensor X that is equivalent to a time series of conditional adjacency matrix A_{t}. In the tensor-based decomposition, the tensor is represented as the product of some components in the same way as in PCA or SVD. Several tensor decomposition methods have been proposed including canonical polyadic decomposition (CP) as described in Carroll and Chang (1970) and Bro (1997), Tucker decomposition proposed by Tucker (1966), and higher-order SVD (HOSVD) by Lathauwer et al. (2000). We use the Tucker decomposition method to group the adjacency matrices, which offers a flexible choice of lower rank decomposition. For more detailed information about low rank tensor approximation, see Kolda and Bader (2009) and Grasedyck et al. (2013).

*k*=1, 2, and 3) and core tensor \(\boldsymbol {\mathcal {G}}\in \boldsymbol {\mathcal {R}}^{r_{1}\times r_{2}\times r_{3}}\) as

where *n*_{1} and *n*_{2} correspond to the size of conditional adjacency matrix A_{t}; *n*_{3} is the length of the observation period (the number of trading days); ×_{1}, ×_{2}, and ×_{3} are tensor products in the corresponding mode; and *r*_{1}, *r*_{2}, and *r*_{3} are lower ranks for the approximation in each direction. In our dataset, *n*_{1} (= *n*_{2}) is 14 and *n*_{3} is 2,058.

The rank order selection of (*r*_{1}, *r*_{2}, *r*_{3}) is flexible in the Tucker decomposition. We set *r*_{1}=*r*_{2}=3 for the further dimension reduction of the conditional adjacency matrix, while we set *r*_{3}=*n*_{3}=2,058 to preserve information about changes along the time axis as much as possible for time series clustering. Thus, the time series of adjacency matrices is decomposed into three orthogonal unit factors and one core tensor.

_{3}, which is the orthogonal basis for the time horizon, since we are mainly interested in the differences between trading days. The projection of X onto the subspace is defined as

Y is then transformed into a vector used as the feature vector for clustering by *k*-means. To choose *k*, we calculate the gap statistic (Tibshirani et al. 2001) for different numbers of clusters and look for the best value. The gap statistic analysis indicates that the best *k* is around 4; however, we select *k*=3 to simplify the intertemporal comparison.
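The feature extraction step can be illustrated with a short sketch. In the standard notation of Kolda and Bader (2009), a Tucker approximation reads \(\mathcal{X} \approx \mathcal{G} \times_{1} U_{1} \times_{2} U_{2} \times_{3} U_{3}\); the factor symbols \(U_{k}\) are an assumption here, since the corresponding formula is not reproduced above. The code below obtains the two node-mode bases by a truncated higher-order SVD, which is one simple way of computing a Tucker approximation and may differ from the algorithm actually used, and it assumes that the projection leaves the time mode untouched, \(\mathcal{Y} = \mathcal{X} \times_{1} U_{1}^{\top} \times_{2} U_{2}^{\top}\). All function names and the synthetic data are hypothetical.

```python
import numpy as np

def unfold(x, mode):
    """Mode-k unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def leading_basis(x, mode, rank):
    """Leading `rank` left singular vectors of the mode-k unfolding."""
    u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
    return u[:, :rank]

def daily_features(x, r1=3, r2=3):
    """Project an (n, n, T) tensor onto low-rank node subspaces, one row per day."""
    u1 = leading_basis(x, 0, r1)
    u2 = leading_basis(x, 1, r2)
    # Y = X x_1 U1^T x_2 U2^T has shape (r1, r2, T); the time mode is untouched.
    y = np.einsum("ijt,ia,jb->abt", x, u1, u2)
    return y.reshape(r1 * r2, -1).T, u1, u2

# Synthetic stand-in: 50 "days" of 14-by-14 matrices mixing three fixed patterns.
rng = np.random.default_rng(1)
patterns = rng.standard_normal((3, 14))
weights = rng.random((50, 3))
x = np.einsum("tk,ki,kj->ijt", weights, patterns, patterns)

features, u1, u2 = daily_features(x)
# Map the features back to node space to check how much information was kept.
x_hat = np.einsum("abt,ia,jb->ijt", features.T.reshape(3, 3, -1), u1, u2)
print(features.shape)         # (50, 9)
print(np.allclose(x, x_hat))  # True
```

In this toy case the tensor has exactly three underlying patterns, so nine numbers per day reproduce each 14-by-14 matrix; with real data the rank-3 projection would be an approximation that discards minor variation.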

The time series of conditional adjacency matrices A_{t} is categorized into three groups in terms of trading date by *k*-means clustering with low-rank tensor decomposition. The clustering result means that the dynamically changing correlation network is classified into three types. In other words, the whole observation period can be divided into three sub-periods, in each of which the network has a different correlation structure. The time series of adjacency matrices is projected onto only three representative adjacency matrices in such a way that minor differences are discarded and the major differences are highlighted. The second-round dimensionality reduction is thus achieved, allowing us to make an intertemporal comparison of the correlation structure.
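The grouping of trading days can be sketched with a plain implementation of Lloyd's *k*-means algorithm. This is an illustrative version with a deterministic farthest-point initialisation, not the specific routine used in the study; the gap statistic is not implemented, and the two-dimensional synthetic features stand in for the per-day feature vectors.

```python
import numpy as np

def kmeans(features, k, n_iter=100):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    x = np.asarray(features, dtype=float)
    centers = [x[0]]
    for _ in range(1, k):
        # Next centre: the point farthest from all centres chosen so far.
        dist = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(dist)])
    centers = np.array(centers)

    for _ in range(n_iter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)             # assign each day to nearest centre
        new_centers = np.array([
            x[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):    # stop once centres settle
            break
        centers = new_centers
    return labels, centers

# Synthetic features: 120 "days" drawn from three well-separated regimes.
rng = np.random.default_rng(2)
means = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
true_regime = np.repeat([0, 1, 2], 40)
features = means[true_regime] + 0.1 * rng.standard_normal((120, 2))
labels, centers = kmeans(features, k=3)
print(np.bincount(labels))
```

Each label then plays the role of a sub-period type; days with the same label need not be contiguous in time.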

### Classification of dynamic correlation networks

Largest eigenvalue index by sub-period

| | Mean: Index | Mean: (Index-100) | Max: Index | Max: (Index-100) | Min: Index | Min: (Index-100) |
|---|---|---|---|---|---|---|
| T1 | 98.56 | (-1.44) | 96.73 | (-3.27) | 101.16 | (+1.16) |
| T2 | 98.83 | (-1.17) | 97.14 | (-2.86) | 100.00 | (0.00) |
| T3 | 102.14 | (+2.14) | 100.00 | (0.00) | 104.18 | (+4.18) |
| Whole period | 100 | | 100 | | 100 | |
| [value] | [8.90] | | [9.64] | | [8.19] | |

Topological features of the correlation network by sub-period

Mean values:

| | Density | Centralization | Heterogeneity |
|---|---|---|---|
| T1 | 0.625 | 0.149 | 0.273 |
| T2 | 0.614 | 0.158 | 0.313 |
| T3 | 0.638 | 0.147 | 0.304 |
| Whole period | 0.627 | 0.151 | 0.282 |

Indices (whole period = 100):

| | Density: Index | Density: (Index-100) | Centralization: Index | Centralization: (Index-100) | Heterogeneity: Index | Heterogeneity: (Index-100) |
|---|---|---|---|---|---|---|
| T1 | 99.66 | (-0.34) | 99.09 | (-0.91) | 96.65 | (-3.35) |
| T2 | 97.96 | (-2.04) | 104.76 | (+4.76) | 110.78 | (+10.78) |
| T3 | 101.75 | (+1.75) | 97.47 | (-2.53) | 107.52 | (+7.52) |
| Whole period | 100 | | 100 | | 100 | |

Intuitively, the largest eigenvalue of an adjacency matrix represents the strength of the correlation in the network. Higher levels of the largest eigenvalues are observed during the crisis periods including the Lehman collapse (2008) and the Great East Japan Earthquake (2011), which means a stronger linkage between nodes. The levels of the largest eigenvalues seem to be somewhat related to the sub-period type.

Table 5 shows the relative changes in the largest eigenvalue, indexed so that the whole period equals 100. The mean values of the sub-periods show that T3 has a higher level of the largest eigenvalue than T1 and T2. The maximum value over the whole period occurs in T3, as shown by the value of 100 in the Max column, while the minimum value occurs in T2, as shown by the value of 100 in the Min column. The comparison of the largest eigenvalue confirms that T3 is a stress period, while T1 and T2 are normal periods.
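The indexing used in Table 5 can be sketched as follows: compute the largest eigenvalue of each daily adjacency matrix, take the mean, maximum, and minimum within each sub-period, and divide by the same statistic over the whole period. The function name, the synthetic matrices, and the labels below are hypothetical; only the indexing logic follows the description above.

```python
import numpy as np

def largest_eigenvalue_index(a_t, labels):
    """Sub-period mean/max/min of the largest eigenvalue, whole period = 100."""
    # eigvalsh returns ascending eigenvalues of symmetric matrices; keep the last.
    lam = np.linalg.eigvalsh(a_t)[:, -1]
    stats = {"mean": np.mean, "max": np.max, "min": np.min}
    whole = {name: f(lam) for name, f in stats.items()}
    table = {}
    for g in np.unique(labels):
        sub = lam[labels == g]
        table[int(g)] = {name: 100.0 * f(sub) / whole[name]
                         for name, f in stats.items()}
    return table, whole

# Synthetic symmetric adjacency matrices with zero diagonal for 60 "days".
rng = np.random.default_rng(3)
raw = rng.random((60, 14, 14))
a_t = (raw + raw.transpose(0, 2, 1)) / 2.0
a_t[:, np.arange(14), np.arange(14)] = 0.0
labels = np.arange(60) % 3                     # hypothetical sub-period labels
table, whole = largest_eigenvalue_index(a_t, labels)
for g, row in table.items():
    print(g, {name: round(float(v), 2) for name, v in row.items()})
```

By construction, exactly the sub-period that contains the whole-period maximum shows 100 in the Max column and the one that contains the whole-period minimum shows 100 in the Min column, which is how the table identifies where the extremes occur.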

_{t} are density (*D*), centralization (*C*), and heterogeneity (*H*). The three measures are defined as

\[
D_{t} = \frac{\mathrm{mean}(\mathbf{k}_{t})}{n-1}, \qquad
C_{t} = \frac{n}{n-2}\left(\frac{\mathrm{max}(\mathbf{k}_{t})}{n-1} - D_{t}\right), \qquad
H_{t} = \frac{\sqrt{\mathrm{var}(\mathbf{k}_{t})}}{\mathrm{mean}(\mathbf{k}_{t})},
\]

where *n* is the number of nodes; k_{t} is a vector of the node degree (connectivity) defined as the sum of the row or column of an adjacency matrix; and *max*(·), *mean*(·), and *var*(·) are the maximum, mean, and variance function, respectively. For more details on these topological measures, see Horvath (2011). Density *D*, defined as the mean of the off-diagonal elements of A_{t}, measures the overall connection (correlation) among nodes: a density close to 1 indicates that all nodes are strongly correlated with each other. Centralization *C* is 1 when one node has fully connected edges with all other nodes that are not connected with each other; it is 0 when each node has the same connectivity. Heterogeneity *H*, defined as the coefficient of variation of the connectivity distribution, measures the variation in connectivity across nodes. These three topological measures are calculated for conditional adjacency matrix A_{t} for every trading day during the observation period and then summarized as mean values and indices in Table 6.
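The three measures can be computed in a few lines. The sketch below assumes the standard weighted-network definitions given in Horvath (2011) and uses the population variance for heterogeneity, which is an assumption; the function name is illustrative. The two toy networks check the limiting cases described above.

```python
import numpy as np

def topology(a):
    """Density, centralization and heterogeneity of one weighted adjacency matrix."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    k = a.sum(axis=0)                      # connectivity; diagonal is assumed zero
    density = k.mean() / (n - 1)           # mean of the off-diagonal entries
    centralization = n / (n - 2) * (k.max() / (n - 1) - density)
    heterogeneity = k.std() / k.mean()     # population std (ddof=0), an assumption
    return density, centralization, heterogeneity

# Star network: node 0 is linked to all others, which are not linked to each other.
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0

# Complete network: every pair of nodes is linked with weight 1.
complete = np.ones((5, 5)) - np.eye(5)

print(topology(star))
print(topology(complete))
```

The star network gives a centralization of 1 and the complete network gives a centralization and heterogeneity of 0, matching the limiting cases in the text.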

The comparison of the network topological measure indices between sub-periods in Table 6 provides further information to understand the characteristics of the three sub-periods. Stressed period T3 has a higher level of density and heterogeneity, but lower centralization. The two normal periods, T1 and T2, have different combinations of topological features. In T1, all three measures are at a lower level with no significant change from the whole period average. T2 has a lower level of density, but higher centralization and heterogeneity than the average. However, it is difficult to learn any more from such a network topology comparison. T1 and T2 differ from each other, and both differ from T3, in the topological features of the dynamic correlation network. This reconfirms that the three-period classification is sufficiently meaningful for further comparative analysis, although we need more data to delve into the detail.

### Intertemporal changes of the correlation networks

The division into three sub-periods described in the last section enables us to reduce the large dimension of conditional adjacency matrix A_{t} from 2,058 (trading days) to only three. This dimensionality reduction simplifies the comparison of the dynamic correlation network on the time axis: the differences among a large number of daily networks are summarized as differences among three networks, each representing one sub-period. Specifically, the adjacency matrices for each sub-period are simply averaged for each entry of the matrix to create an adjacency matrix that represents the corresponding sub-period. Finally, the dynamic correlation structure of the entire Japanese stock market is summarized by using only the three 14-by-14 adjacency matrices. We can therefore make pairwise comparisons of these three networks.
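The averaging step is simple enough to show directly. The sketch below builds one entry-wise mean matrix per sub-period label and then takes entry-wise differences for each pair of sub-periods; the function names and the tiny 2-by-2 example are illustrative assumptions.

```python
import numpy as np

def representative_networks(a_t, labels):
    """Entry-wise mean adjacency matrix for each sub-period label."""
    labels = np.asarray(labels)
    return {int(g): a_t[labels == g].mean(axis=0) for g in np.unique(labels)}

def pairwise_changes(networks):
    """Entry-wise differences for every pair of sub-periods (later minus earlier key)."""
    keys = sorted(networks)
    return {(p, q): networks[q] - networks[p]
            for i, p in enumerate(keys) for q in keys[i + 1:]}

# Six "days" of constant 2-by-2 matrices, two days per sub-period.
a_t = np.stack([np.full((2, 2), v) for v in (1.0, 3.0, 10.0, 20.0, 5.0, 7.0)])
labels = np.array([0, 0, 1, 1, 2, 2])
nets = representative_networks(a_t, labels)
changes = pairwise_changes(nets)
print(nets[0][0, 0], nets[1][0, 0], nets[2][0, 0])  # 2.0 15.0 6.0
print(changes[(0, 1)][0, 0])                        # 13.0
```

Positive entries in a difference matrix indicate pairs of groups whose link is stronger in the later-keyed sub-period, and negative entries indicate pairs whose link is weaker.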

The changes between T3 and T1 or T2 represent a transition from (to) a normal period to (from) a stressed period. A higher level of correlation is observed during stressed periods as mentioned earlier. The changes shown by Figs. 14, 15, 16 and 17 indicate that there are positive and negative contributions of pairwise correlations between individual groups to the overall intensified correlation during a stressed period. It is also apparent that the pattern of changes to (from) a stressed period differs greatly between T1 and T2, reflecting the significant difference in correlation structure between the two normal periods shown by Figs. 12 and 13.

Thus, the correlation network of Japanese stock returns is summarized into a correlation network of 14 sub-portfolio returns by using network clustering; then, the dynamic changes in the network are also summarized into only three static networks to facilitate an intertemporal comparison. Here, we summarize the findings from the comparative analysis of the three sub-period correlation networks. The correlation network appears to be largely stable over time, while an elevated level of overall correlation is observed during stressed periods (T3) compared with normal periods (T1 and T2). The pairwise comparisons between the three sub-period correlation networks reveal that changes in correlation are observed more clearly within the Defensive groups and between the Cyclical and Defensive groups, whereas changes in correlation within the Cyclical groups are rather limited. The result suggests that there is some fundamental difference between the two major categories in how the network structure changes.

## Discussion and conclusion

The dynamically changing correlation network of individual financial returns has been recognized as an important topic. Many studies have addressed it; however, many of them adopted static correlation measures calculated over a sample period. Intertemporal analysis was often based on a time series of such static correlations calculated over rolling sample periods, which may lead to a biased estimate of correlation as discussed in Isogai (2016). The large number of financial assets causes another technical difficulty when dealing with the correlation network, while the existing sector classification is not reliable for grouping stocks. Thus, an efficient and reliable method for dimensionality reduction is required for extended research on correlation networks. Dimensionality reduction of a time series of conditional correlation networks was another difficult issue for intertemporal comparative analysis. The proposed analytical framework of a dynamic correlation network with non-sector-based grouping of stock returns can handle these issues in a systematically organized way.

In this study, we proposed a new approach that enables an intertemporal analysis of a large-scale correlation network of financial asset returns in a compact way by combining our previous research. The main contribution of this study is the provision of two types of dimensionality reduction methods: (i) the reduction of a large correlation network into a smaller factor correlation network and (ii) the reduction of a time series of correlation networks into a small number of representative correlation networks. Such twofold dimensionality reduction works well to extract important information from complicated correlation networks.

The proposed method, however, is still at an early experimental stage and several issues must be addressed to enhance its efficiency and stability. Firstly, our method heavily depends on econometric time series models including GARCH and DCC models, which are highly complex; many simplifying assumptions have been introduced in model building. We also need to select a model from many alternatives before undergoing the time-consuming parameter estimation process. Specifically, the DCC model involves many technically difficult issues regarding parameter estimation. As for the network-building process, our simple adjacency conversion formula can be improved to enhance the signal–noise separation performance. In addition, many alternative options exist for the selection of the clustering algorithm. It should be noted that the empirical results of the dynamic correlation network of Japanese stock returns depend on those simplifying assumptions and may be affected by changing any of them. Secondly, the empirical results reveal the need for supplementary analysis to clarify what causes intertemporal changes in the correlation network. The dynamic network analysis only provides an initial clue to identify when and how the network changes; we need more information to enhance our understanding of the meaning of such changes.

With regard to possible practical applications, our proposed method can be easily applied to portfolio optimization and risk control in financial investment decisions. In standard financial model settings, a static correlation of returns is normally assumed as one of the key inputs. Even if it is difficult to change the modeling framework fundamentally, our proposed method can provide important information about the dynamics of the correlation structure, which helps in making appropriate adjustments when the model is applied. For example, the degree of correlation between the Cyclical and Defensive group portfolios can change significantly between sub-periods as mentioned in the previous section. Such information is greatly helpful for making decisions on investment allocation as well as risk control, since ordinary models do not consider such facts.

Lastly, our method can be extended to cover other financial and non-financial time series data with large-scale volatility fluctuations. Financial time series other than stock prices also tend to have significant volatility fluctuations; therefore, our model is likely to be applicable to them. We may find a different dynamic correlation structure if our method is applied to those time series. Our method is also applicable to non-financial volatile time series, although careful examination of such extended use of the method is required in advance.

In future research, we will aim to apply our method to non-Japanese stock returns to understand whether a similar result is obtained. Stock markets are closely linked globally; we are greatly interested in the dynamic correlation network analysis of stock returns across multiple countries. Further, the dynamic correlation network analysis between different classes of financial assets is another interesting topic for future analysis.

## Declarations

### Acknowledgements

The views expressed here are solely those of the author and do not necessarily reflect those of the Bank of Japan. This work was supported by KAKENHI (16H07102).

### Availability of data and materials

We used stock price data for companies listed on the TSE. The stock price data can be found at the stock data search website of TSE (http://quote.jpx.co.jp/jpx/template/quote.cgi?F=tmp/e_stock_search), although time series data for the whole period are not necessarily available there. The full set of time series data is available through a wide range of commercial database services, including Bloomberg (http://www.bloomberg.com). The author cannot provide the stock price data due to contractual limitations.

### Competing interests

The author declares no competing interests.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- Bollerslev, T (1986) Generalized autoregressive conditional heteroskedasticity. J Econ 31(3): 307–327.
- Bro, R (1997) Parafac. tutorial and applications. Chemometr Intell Lab Syst 38(2): 149–171.
- Carroll, JD, Chang J-J (1970) Analysis of individual differences in multidimensional scaling via an n-way generalization of “eckart-young” decomposition. Psychometrika 35(3): 283–319.
- Chi K, T, Liu J, Lau FC (2010) A network perspective of the stock market. J Empir Financ 17(4): 659–667.
- Cont, R (2007) Volatility clustering in financial markets: Empirical facts and agent-based models In: Long Memory in Economics, 289–309. Springer, Berlin, Heidelberg.
- Demarta, S, McNeil AJ (2005) The t copula and related copulas. Int Stat Rev 73(1): 111–129.
- Engle, R (2002) Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models. J Bus Econ Stat 20(3): 339–350.
- Engle, R, Sheppard K (2001) Theoretical and empirical properties of dynamic conditional correlation multivariate GARCH. Nat Bur Econ Res w8554: 1–46. http://www.nber.org/papers/w8554.
- Ghalanos, A (2014) rmgarch: Multivariate GARCH Models. R package version 1.3-0. http://cran.r-project.org/web/packages/rmgarch/index.html. Accessed 09 Mar 2016.
- Girvan, M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12): 7821–6.
- Grasedyck, L, Kressner D, Tobler C (2013) A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen 36(1): 53–78.
- Horvath, S (2011) Weighted Network Analysis: Applications in Genomics and Systems Biology. Springer, New York.
- Isogai, T (2014) Clustering of Japanese stock returns by recursive modularity optimization for efficient portfolio diversification. J Complex Netw 2(4): 557–584.
- Isogai, T (2016) Building a dynamic correlation network for fat-tailed financial asset returns. Appl Netw Sci 1(1): 1–24.
- Joe, H (2005) Asymptotic efficiency of the two-stage estimation method for copula-based models. J Multivar Anal 94(2): 401–419.
- Kenett, DY, Huang X, Vodenska I, Havlin S, Stanley HE (2015) Partial correlation analysis: Applications for financial markets. Quant Finan 15(4): 569–578.
- Kolda, TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3): 455–500.
- Kroonenberg, PM (2008) Applied Multiway Data Analysis. John Wiley & Sons, New Jersey.
- Lathauwer, LD, Moor BD, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21(4): 1253–1278.
- Mandelbrot, BB (1963) The variation of certain speculative prices. J Bus 36(4): 394–419.
- Mantegna, RN (1999) Hierarchical structure in financial markets. Eur Phys J B-Condens Matter Compl Syst 11(1): 193–197.
- Newman, MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci USA 103(23): 8577–82.
- Onnela, J-P, Chakraborti A, Kaski K, Kertesz J, Kanto A (2003) Asset trees and asset graphs in financial markets. Phys Scr T106: 48–54.
- Patton, AJ (2006) Modelling asymmetric exchange rate dependence. Int Econ Rev 47(2): 527–556.
- Preis, T, Kenett DY, Stanley HE, Helbing D, Ben-Jacob E (2012) Quantifying the behavior of stock correlations under market stress. Sci Rep 2(id.752): 1–5.
- Sklar, M (1959) Fonctions de répartition à n dimensions et leurs marges In: Publ. Inst. Stat. 8, 229–231. Université Paris, Paris.
- Tibshirani, R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc Ser B Stat Methodol 63(2): 411–423.
- Tucker, LR (1966) Some mathematical notes on three-mode factor analysis. Psychometrika 31(3): 279–311.
- Tumminello, M, Lillo F, Mantegna RN (2010) Correlation, hierarchies, and networks in financial markets. J Econ Behav Organ 75(1): 40–58.