
Portfolio diversification, differentiation and the robustness of holdings networks


Networks of portfolio holdings exemplify how interdependence among agents and their assets can be a source of systemic vulnerability. We study a real-world holdings network and compare it with various alternative scenarios obtained by randomizing and rebalancing the original investments. Scenario generation relies on algorithms that satisfy the global constraints imposed by the numbers of outstanding shares in the market. We consider fixed-diversification models as well as diversification-maximizing replicas. We extensively analyze the interplay between portfolio diversification and differentiation, and how the outreach of exogenous shocks depends on these factors as well as on the type of shock and the size of the network with respect to the market. We find that real portfolios are poorly diversified but highly similar, that portfolio similarity correlates with systemic fragility and that rebalancing can come with an increased similarity depending on the initial network configuration. We show that a large diversification gain is achieved through rebalancing but, notably, that this makes the network vulnerable to unselective shocks. Also, while the network is riskier in the presence of targeted shocks, it is safer than its random counterparts when it is stressed by widespread price downturns.


When we consider large institutional investors like mutual funds, interconnections arise because of managerial sharing (Augustiani et al. 2015), herding behavior (especially during crises), or simply similarity of investment strategies. Asset-overlap at the global scale exists because of the activity of global and international funds that invest in foreign assets around the world. The effects on systemic riskiness of such interdependence have been emphasized by the Global Financial Crisis 2007–2008. A very active area of research deals with the problem of making quantitative statements about the fragility of financial systems with respect to the propagation of distress. The latter can be negative downturns in asset values or insolvency of financial institutions depending on cases. Network science has provided insights into this topic and, in particular, the role of network topology characterizing mutual ties and exposures has been investigated (Hurd and Rohwedder 2010; Battiston et al. 2012b; 2012a; Huang et al. 2013; Delpini et al. 2013; Galbiati et al. 2013; Caccioli et al. 2014; Acemoglu et al. 2015; Elliott et al. 2014).

A cornerstone of portfolio management is diversification (Statman 1987; Agnew et al. 2003; Domian et al. 2007; Hu et al. 2014), a well-understood strategy to cut down portfolio risk from idiosyncratic shocks to asset prices. A reduction of individual risk can be achieved by reducing portfolio concentration (Domian et al. 2003; 2007; Statman 2004) but the evidence collected from the crisis suggests that the efficacy of diversification strategies can depend strongly on market conditions. More generally, the systemic implications of diversification and its relationship with the notion of differentiation between the holdings of different market players are uncertain. In Delpini et al. (2019) the network of mutual fund portfolios in the United States was studied across the crisis. It was found that diversification increased and, less predictably, investment similarities decreased. Simulations show that the observed similarities between portfolios are more likely than one would expect from chance and finite-size effects, even when controlling for strongly connected assets, and that there exist groups of highly similar portfolios. This makes the network riskier than its random counterparts at equal portfolio diversification. Accordingly, the analysis in Fricke (2019) suggests that, while diversification may reduce risk for an individual portfolio, the structure of the similarities across mutual funds could be key in determining systemic risk.

In this paper we study bipartite networks of portfolio holdings (Delpini et al. 2019; Lin and Guo 2019; Braverman and Minca 2018; Guo et al. 2016). We model a holdings network as a subsystem of the embedding financial market wherein the numbers of outstanding shares of each stock are treated as constants and provide global market constraints. We aim at analyzing how the outreach of financial shocks in the holdings network depends on network topology, portfolio diversification and differentiation, as well as on the type of shock applied and the relative size of the network. In particular, we want to test whether more diversified portfolios can actually reduce risk at the systemic level as one might expect. In order to generate random scenarios for comparisons, we take advantage of algorithms that are required to satisfy the global constraints at any time. Such algorithms mimic the actual process of increasing, reducing or replacing asset positions through buy/sell orders to the market. In particular, we explicitly consider scenarios where holders reallocate their wealth to either the original or randomly chosen assets so as to achieve the most diversified portfolio compatible with the constraints in force. We develop the analysis in a computational setting, exploiting a snapshot of the US mutual funds holdings network as a test case. We extensively compare the original network with the simulated scenarios and try to better characterize the interplay between portfolio diversification and differentiation. We then simulate the process of propagation of exogenous shocks to asset prices, based on a dominating flow-induced trading dynamics wherein negative fund performances are followed by asset fire-sales (Greenwood et al. 2015; Fricke 2019). We do that for both targeted and unselective shocks and compare the systemic damages registered for each scenario.

In “Model of bipartite network of holdings” section we formalize the notion of a holdings network within the market and its representation as a bipartite graph. In “Market model of flow-induced trading” section we describe the dynamics of shock propagation that will be used in the simulations. “Algorithms” section describes the basic concepts underlying the algorithms for generating rebalanced portfolios or performing holdings reshuffle. The statistical properties of the generated scenarios and their comparison in terms of diversification, differentiation and riskiness under different kinds of shocks are presented in “Results” section. We draw conclusions afterwards.

Materials and methods


As a case study for our analysis, we employ a snapshot of the bipartite network of US mutual fund holdings corresponding to the third quarter of 2012. The network was inferred from data in the Survivor-Bias-Free US Mutual Funds database provided by the Center for Research in Security Prices, The University of Chicago Booth School of Business. We only considered equity funds with a reported Total Net Assets (TNA) greater than or equal to one million US dollars (USD) and, in addition, we discarded funds that are classified as International and Global. We parsed the data for just the holdings that correspond to stocks with a valid ticker. Assets with coupon or maturity information and derivatives were not considered. This filtered network has 3497 different funds investing in 9015 different assets. The number of holdings is 550,554 for a total asset value of 2.61 trillion USD. In the following, we will use the generic term “holders” to refer, in particular, to US mutual funds and their asset managers, and “holdings” to indicate the different stocks in the funds’ investment portfolios.

Model of bipartite network of holdings

We model a network of holdings with m portfolios and an investment universe of n different stocks as a bipartite graph (Caldarelli 2007; Newman 2010). We first introduce the network’s incidence matrix B=(bij), where bij=1 if portfolio i invests in stock j and bij=0 otherwise. The number of holdings is the number of non-zero elements of B and we will indicate it as nh=|E|, where E={(i,j)} stands for the set of the holdings and |E| for its cardinality. Let nij be the number of shares of stock j in portfolio i and sj the stock price. The share matrix N=(nij) can be regarded as the weighted incidence matrix.

The TNA of portfolio i is \(p_{i}=\sum _{j=1}^{n} n_{ij}\,s_{j} = (\mathbf {N}\,\mathbf {s})_{i}\) where \(\mathbf {s}=(s_{1}, s_{2}, \dots, s_{n})\) is the vector of stock prices. By introducing the value matrix V=(vij) whose elements vij=nij sj are the holding values, the TNA can be written equivalently as pi=(V 1n)i where 1n is the vector with all n components being one. In the following we will also refer to the portfolio weight wij=vij/pi of each position, representing the relative weight of an individual position in a stock with respect to the portfolio’s TNA.
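As a minimal sketch of these definitions, the snippet below builds the value matrix, the TNAs and the portfolio weights for a toy network. The numbers and the function names (`value_matrix`, `tna`, `weights`) are illustrative assumptions and do not come from the dataset.

```python
# Toy network: 2 portfolios, 3 stocks (made-up numbers, for illustration only).
shares = [[10.0, 0.0, 5.0],   # n_ij: shares of stock j held by portfolio i
          [0.0, 20.0, 5.0]]
prices = [2.0, 1.0, 4.0]      # s_j

def value_matrix(shares, prices):
    """Holding values v_ij = n_ij * s_j."""
    return [[n * s for n, s in zip(row, prices)] for row in shares]

def tna(values):
    """Total Net Assets p_i = sum_j v_ij."""
    return [sum(row) for row in values]

def weights(values):
    """Portfolio weights w_ij = v_ij / p_i; each row sums to one."""
    return [[v / sum(row) for v in row] for row in values]

V = value_matrix(shares, prices)   # [[20.0, 0.0, 20.0], [0.0, 20.0, 20.0]]
p = tna(V)                         # [40.0, 40.0]
W = weights(V)                     # [[0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]
```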

Such a network is a subsystem of the whole “market”. The number Nj of outstanding shares of each stock \(j=1,\dots, n\) is assumed to be constant in the market. We do not take into account share splitting or new issues. This implies that holders’ portfolios cannot hold more shares at any time t than those outstanding:

$$ \sum_{i} n_{ij}(t) \le N_{j}\;,\quad j=1,\dots, n\,. \tag{1} $$

The above inequalities represent global constraints. These need to be satisfied during all stages of the execution of the randomization algorithms and at any time during the propagation of exogenous shocks through the portfolio network.

The exact values of Nj cannot be retrieved easily. However, if we think of the holdings network as a small-scale representation of the market, we can introduce a scaling parameter c and imagine that the total value of each stock in holders’ portfolios initially is a fraction 1/c of its value in the whole market. In this spirit, and for the purposes of simulations, we make the following choice for the outstanding shares:

$$ N_{j} = c\,\sum_{i} n_{ij}(0)\;. \tag{2} $$

Parameter c captures the size of the network, in total assets terms, relative to the market: the larger c the more the holdings network is to be considered small with respect to its embedding market. Considering the total assets value of our test network (see “Dataset” section), and that the New York Stock Exchange had a market capitalization of 30.1 trillion USD as of February 2018, c=10 can be considered a sensible choice. We will also compare the results for this value to those for c=2 and c=100, to also account for scenarios where the holdings network represents a large or small fraction of the market respectively.
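The scaling choice for the outstanding shares and the check of the global constraints can be sketched as follows. This is only an illustration on toy numbers; the helper names `outstanding_shares` and `satisfies_constraints`, and the numerical tolerance, are our own assumptions.

```python
def outstanding_shares(shares0, c):
    """N_j = c * sum_i n_ij(0): scaling choice for the outstanding shares."""
    n_stocks = len(shares0[0])
    return [c * sum(row[j] for row in shares0) for j in range(n_stocks)]

def satisfies_constraints(shares, N, tol=1e-9):
    """Global constraints: sum_i n_ij <= N_j for every stock j."""
    return all(sum(row[j] for row in shares) <= N[j] + tol
               for j in range(len(N)))

shares0 = [[10.0, 0.0, 5.0],
           [0.0, 20.0, 5.0]]
N = outstanding_shares(shares0, c=2)          # [20.0, 40.0, 20.0]
ok = satisfies_constraints(shares0, N)        # True: the network holds 1/c of each stock
too_many = [[25.0, 0.0, 5.0], [0.0, 20.0, 5.0]]
bad = satisfies_constraints(too_many, N)      # False: 25 > N_0 = 20
```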

Market model of flow-induced trading

We are interested in studying how the aggregate fragility of the network depends on the overlap of portfolios and in testing the interaction between portfolio diversification and overlap. We consider a case where market-wide negative events can determine mass investor outflows from funds. In such events, a fund will be forced to liquidate part of its asset positions in order to repay leaving investors. This trading activity will have a (negative) impact on stock prices and this will cause funds that were not hit by the original event to experience losses afterwards.

Usually funds do not retain cash: a flow of investors into a fund is followed by an expansion of asset positions while an outflow is associated with a shrinkage of positions. Outflow corresponds to fund investors asking for redemption of their shares and to fund managers liquidating some assets to meet requests. Such trading activity by funds is referred to as flow-induced trading. We are interested in studying the network’s reaction to idiosyncratic shocks on a small time scale, when portfolio managers fire-sell assets with the effect of amplifying the shocks and triggering new ones. In particular we aim at measuring how fast the total value of the network is eroded and to what extent network fragility depends on asset commonality between portfolios. Accordingly, we consider a model where portfolio managers can only liquidate assets and we make the working assumption that the market outside the network is large and liquid enough to absorb such supply.

Consider the dynamics of a portfolio along a generic trading period. Let p(t−1) be the end-of-period TNA for period t−1 and assume that this value includes the fund’s flow along the period. If the portfolio’s composition was not altered, its return r(t) along period t would depend only on market price variations and the amount p(t−1) would grow at rate r(t) into the amount [1+r(t)] p(t−1). We define the fund flow along period t as the difference between the closing TNA and the accrued value:

$$f(t) = p(t) - [1+r(t)] p(t-1)\;. $$

Because flow-induced trading is needed to repay leaving investors, we assume that it comes with no modifications to the portfolios’ investment strategies, by which we mean that the stocks in the portfolios will remain the same. This implies that portfolio managers liquidate every asset in the same proportion (proportional selling). Let ηi(t) be the fraction of TNA that is liquidated along period t, so that

$$\sum_{j} n_{ij}(t)\,s_{j}(t-1) = p_{i}(t-1) \left[ 1 + \eta_{i}(t) \right]\;. $$

Then proportional selling corresponds to solution nij(t)=[1+ηi(t)]nij(t−1) to the previous equation, with ηi(t)<0.

Flow-induced trading has an impact on the prices of the securities being traded. The price impact Ij(x) of trading x shares of a stock is defined as the corresponding relative price variation. In the literature it is often assumed that a linear relationship exists (Kyle 1985) of the form Ij(x)=x/λj. The stock-specific parameter λj is called the stock’s market depth: the larger λj the more liquid the stock and the smaller the price impact of trading it. We assume that the market depth of a stock can be approximated by the total number of shares that have been issued and purchased by all participants in the market (outstanding shares). We take λj=Nj and such parameter accounts for heterogeneity in the liquidity characteristics of stocks. The price impact on stock j along period t will be

$$ I_{j}(t) = \frac{\Delta s_{j}(t)}{s_{j}(t-1)} = \frac{\sum_{i} \Delta n_{ij}(t)}{\lambda_{j}} = \frac{\sum_{i}\eta_{i}(t) n_{ij}(t-1)}{N_{j}} <0\;. \tag{3} $$

Note that λj=Nj and the scaling relationship (2) imply that the larger is c the smaller is the price impact of asset liquidations.
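The dependence of the price impact on c can be checked on a toy case. The following sketch assumes the linear impact with λj=Nj described above; the function name `price_impact` and all numbers are illustrative.

```python
def price_impact(shares_prev, eta, N):
    """I_j = sum_i eta_i * n_ij(t-1) / N_j, with market depth lambda_j = N_j."""
    m = len(shares_prev)
    return [sum(eta[i] * shares_prev[i][j] for i in range(m)) / N[j]
            for j in range(len(N))]

shares_prev = [[10.0, 0.0, 5.0],
               [0.0, 20.0, 5.0]]
eta = [-0.10, 0.0]        # portfolio 0 liquidates 10% of every position

impact = {}
for c in (2, 10):
    # Outstanding shares from the scaling choice N_j = c * sum_i n_ij(0).
    N = [c * sum(row[j] for row in shares_prev) for j in range(3)]
    impact[c] = price_impact(shares_prev, eta, N)
# impact[2]  is approximately [-0.05, 0.0, -0.025]
# impact[10] is approximately [-0.01, 0.0, -0.005]
```

Going from c=2 to c=10 divides every impact by five, as expected from the linear scaling of Nj with c.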

Taking into account both the price variations and asset liquidations, the portfolio TNA at end-of-period will be

$$p_{i}(t) = [1+\eta_{i}(t)] \sum_{j} \left[1+I_{j}(t) \right]\,n_{ij}(t-1)\,s_{j}(t-1)\;. $$

We can motivate a simple choice for the amount of assets liquidated. Consider an individual investor soon after a negative return ri(t−1) has occurred to her fund’s portfolio. The larger |ri| is, the higher the chance that the investor will ask for redemption of one of her fund shares. Because 0≤|ri|≤1 we can interpret the absolute value of the (negative) portfolio return as a proxy for such probability. If we consider homogeneous investors and a uniform redemption probability across the fund’s shares, the average fraction of redemptions will be |ri(t−1)|. Thus, as a basic approximation and discarding fluctuations, we take ηi(t)=ri(t−1).

To summarize, we consider a dynamics of shock propagation that proceeds as follows:

  1. A negative shock to stock prices \(\boldsymbol {\delta }(t) = (\delta _{1}(t),\dots,\delta _{n}(t))\) hits the market, with δj=Δsj/sj ∈ [−1,0];

  2. This determines negative portfolio returns ri(t) for those portfolios that invest in the stocks hit;

  3. Negative returns determine an outflow from funds and a corresponding reallocation of portfolio positions nij(t)→nij(t+1)=[1+ηi(t+1)] nij(t), where ηi(t+1)=ri(t);

  4. Flow-induced asset selling has a negative price impact δ(t+1)=I(t+1) as given by Eq. (3) and then the process starts again from step 1:

$$\boldsymbol{\delta}(t)\longrightarrow \mathbf{r}(t) \longrightarrow \Delta n_{ij}(t+1) \longrightarrow \mathbf{I}(t+1) = \boldsymbol{\delta}(t+1)\;. $$

For our purposes we assume that the first shock δ(1) is exogenous: no outflow occurs during the first trading period and the first variation of the TNAs is completely determined by the negative returns.
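The loop summarized above can be sketched in a few lines. This is a simplified illustration rather than the code used for the simulations: the function name `propagate` and the toy inputs are assumptions, and the portfolio return is computed as the price-only return on the positions held during the period, which is our reading of the definition of r(t).

```python
def propagate(shares, prices, N, delta1, T):
    """Return the total network value at the end of each of T trading periods."""
    m, n = len(shares), len(prices)
    shares = [row[:] for row in shares]
    prices = prices[:]
    delta = delta1[:]        # relative price variation applied in the current period
    eta = [0.0] * m          # no outflow in the first period (exogenous shock)
    totals = []
    for _ in range(T):
        # Proportional selling driven by the previous period's returns.
        held = [[(1.0 + eta[i]) * shares[i][j] for j in range(n)] for i in range(m)]
        new_prices = [prices[j] * (1.0 + delta[j]) for j in range(n)]
        # Price-only return of each portfolio over the period.
        ret = []
        for i in range(m):
            before = sum(held[i][j] * prices[j] for j in range(n))
            after = sum(held[i][j] * new_prices[j] for j in range(n))
            ret.append(after / before - 1.0 if before > 0 else 0.0)
        shares, prices = held, new_prices
        totals.append(sum(shares[i][j] * prices[j]
                          for i in range(m) for j in range(n)))
        # Next period: eta(t+1) = r(t), delta(t+1) = I(t+1) with lambda_j = N_j.
        eta = ret
        delta = [sum(eta[i] * shares[i][j] for i in range(m)) / N[j]
                 for j in range(n)]
    return totals

shares = [[10.0, 0.0, 5.0], [0.0, 20.0, 5.0]]
prices = [2.0, 1.0, 4.0]
N = [20.0, 40.0, 20.0]       # toy network with c = 2
totals = propagate(shares, prices, N, delta1=[-0.10, 0.0, 0.0], T=3)
```

In this toy run the initial total value of 80 drops to 78 after the exogenous shock and to about 75.19 after the first round of flow-induced selling. The second portfolio does not hold the shocked stock, yet it loses value through the stock it shares with the first one.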

Systemic damage

The dynamics described above implies that the value of every portfolio reduces by the end of the trading period, or it stays the same if the portfolio is not investing in the stocks that went through a price downturn. In the same spirit as Delpini et al. (2019), we consider the relative reduction in the total value of the portfolios \(D(t) = |{\sum _{i} p_{i}(t) / \sum _{i} p_{i}(t-1) -1}|\) as an indicator of network fragility. Comparison of the real network and its counterparts from random allocation scenarios will indicate which configuration is more robust in the light of global market constraints, and will provide insight into diversification–riskiness and differentiation–riskiness correlations.

A comprehensive measure of the fragility of the holdings network would require measuring D with respect to all possible combinations of one or multiple shocks δj(1). This is impractical due to computational time constraints and sensible choices are needed to select appropriate values for δ(1). A shock to a single random stock, albeit large, is unlikely to produce large-scale effects in a few trading periods. On the other hand, a combination of very large shocks to many assets will produce an unrealistically fast degradation of portfolio values. A possible strategy is to consider worst-case scenarios where a subset of targets is selected according to some criterion of asset network “centrality”. This was done in Delpini et al. (2019) where stress tests are performed applying a uniform shock to the most popular assets. There, the degree centrality k of a stock in the bipartite network was chosen to select the targets. Let us indicate this choice as the “k-targets” scheme. We will compare it with the following alternative choices.

Indeed it may not be obvious which measure of centrality is most appropriate. In principle, there could be assets that are very popular but account for a small fraction of the network’s value or, on the contrary, assets that are owned by few large holders with large portfolio weights. In order to take into account the actual monetary weight that stocks have in portfolios as well as their popularity, we propose to alternatively select targets by maximizing the following indicator:

$$h_{j} = \left[ \sum_{i=1}^{m} \left(\frac{v_{ij}}{s_{j}\,N_{j}} \right)^{2} \right]^{-1} = \left[ \sum_{i=1}^{m} \left(\frac{n_{ij}}{N_{j}} \right)^{2} \right]^{-1}\;. $$

If holders owned all outstanding shares in the market, hj would account for the number of funds that own the largest portions of the stock’s market value, or simply the number of leading holders of stock j. Taking into account relationship (2), such a number actually gets multiplied by a uniform factor c². Nevertheless we continue to think of it as a number of effective holders and, following this interpretation, we will say that the stocks with the highest hj are the “most owned” in the network. In this sense, hj is a rescaled version of the Herfindahl–Hirschman index, computed for stock j in terms of the ratios of the numbers of shares in each portfolio to the total number of outstanding shares. Let us indicate a scenario where target stocks are selected maximizing hj as the “h-targets” scheme. In both this scheme and the “k-targets” scheme we apply a uniform random shock δj ∈ [δmin,δmax) to the ntarget stocks with the largest values of kj and hj respectively. We take ntarget=0.1×n and arbitrarily set δmin=1% in order to produce an appreciable systemic damage.

Crises can come with negative returns that are widespread across a lot of stocks. In order to account for that, we also consider an “all-targets” scheme where a uniform shock δj ∈ [0,δmax) is applied to all stocks. This choice also frees us from the need to arbitrarily select a subset of targets. In this case we allow stock prices to possibly remain unchanged, that is δmin=0%.

Trading of a given stock on a stock exchange is suspended if the relative price loss exceeds a stopping threshold. This threshold depends on the exchange regulation and is usually less than 10%. We therefore set δmax=10% for all cases.
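The three targeting schemes can be sketched as follows. The helper names (`degree`, `effective_holders`, `pick_targets`, `draw_shock`) and the toy network are assumptions made for illustration, shocks are returned as negative relative price variations, and rounding 0.1×n to at least one target is our own choice for tiny examples.

```python
import random

def degree(shares, j):
    """k_j: number of portfolios holding stock j."""
    return sum(1 for row in shares if row[j] > 0)

def effective_holders(shares, N, j):
    """h_j = 1 / sum_i (n_ij / N_j)^2; returns 0 for a stock nobody holds."""
    total = sum((row[j] / N[j]) ** 2 for row in shares)
    return 1.0 / total if total > 0 else 0.0

def pick_targets(shares, N, scheme, frac=0.1):
    """Indices of the shocked stocks for the 'k', 'h' or 'all' scheme."""
    n = len(N)
    if scheme == "all":
        return list(range(n))
    if scheme == "k":
        score = [degree(shares, j) for j in range(n)]
    elif scheme == "h":
        score = [effective_holders(shares, N, j) for j in range(n)]
    else:
        raise ValueError("scheme must be 'k', 'h' or 'all'")
    n_target = max(1, round(frac * n))      # assumption: at least one target
    return sorted(range(n), key=lambda j: score[j], reverse=True)[:n_target]

def draw_shock(shares, N, scheme, rng, d_max=0.10):
    """Initial shock delta(1): minus a uniform draw in [d_min, d_max) on targets."""
    d_min = 0.0 if scheme == "all" else 0.01
    delta = [0.0] * len(N)
    for j in pick_targets(shares, N, scheme):
        delta[j] = -(d_min + (d_max - d_min) * rng.random())
    return delta

# Stock 0 is popular but concentrated; stock 1 has two equally large holders.
shares = [[98.0, 50.0, 0.0],
          [1.0, 50.0, 7.0],
          [1.0, 0.0, 0.0]]
N = [100.0, 100.0, 7.0]                      # c = 1 in this toy case
k_targets = pick_targets(shares, N, "k")     # [0]
h_targets = pick_targets(shares, N, "h")     # [1]
delta = draw_shock(shares, N, "h", random.Random(0))
```

The example is built so that the two schemes disagree: degree selects the stock held by three funds, while hj selects the stock whose ownership is evenly split between two large holders.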

For every scheme and each underlying holdings network, we perform nmc=100 Monte Carlo runs of shock propagation over T=10 trading periods. Systemic damage is computed as the average value of D over the runs.

Portfolio diversification and similarity

We measure diversification by the portfolio’s Herfindahl–Hirschman index

$$h_{i} = \left[ \sum_{j=1}^{n} \left(\frac{v_{ij}}{p_{i}} \right)^{2} \right]^{-1} = \left[ \sum_{j=1}^{n} w_{ij}^{2} \right]^{-1}, $$

which is interpreted as the number of leading holdings in a portfolio.

Similarity between two portfolios can be quantified by their cosine similarity:

$$s_{i i^{\prime}} = \frac{\sum_{j} v_{i j}\,v_{i^{\prime} j}}{\lVert{\vec{v}_{i}}\rVert\, \lVert{\vec{v}_{i^{\prime}}}\rVert} = \frac{\sum_{j} w_{i j}\,w_{i^{\prime} j}}{\lVert{\vec{w}_{i}}\rVert\, \lVert{\vec{w}_{i^{\prime}}}\rVert}\;, $$

where \(\vec {v}_{i}=(v_{i\,1},v_{i\,2},\dots)\) is the vector of holdings of portfolio i and \(\vec {w}_{i}=(w_{i\,1},w_{i\,2},\dots)\) is the corresponding vector of portfolio weights. It is worth noticing that both indicators hi and \(s_{i i'}\) are independent of the portfolio’s TNA and can be equivalently expressed in terms of either the holding values or the portfolio weights. Throughout the paper, we refer to the subsidiary notion of differentiation between two portfolios. In symbols, we define it as the complementary quantity of similarity, or \(1-s_{i i'}\), and both quantities take values in [0,1]. The less similar two portfolios are, the more differentiated they are.
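Both indicators are straightforward to compute. The sketch below uses made-up holding values and hypothetical function names (`diversification`, `cosine_similarity`) to illustrate the definitions and their independence of the TNA.

```python
import math

def diversification(values_row):
    """h_i = 1 / sum_j (v_ij / p_i)^2 for one portfolio."""
    p = sum(values_row)
    return 1.0 / sum((v / p) ** 2 for v in values_row)

def cosine_similarity(a, b):
    """s_ii' = <v_i, v_i'> / (|v_i| |v_i'|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = [20.0, 0.0, 20.0]    # balanced over two stocks
v2 = [0.0, 20.0, 20.0]    # balanced over two stocks, one in common with v1
v3 = [90.0, 0.0, 10.0]    # same stocks as v1 but concentrated

h1 = diversification(v1)               # 2.0, equal to the degree
h3 = diversification(v3)               # about 1.22
s12 = cosine_similarity(v1, v2)        # 0.5
differentiation_12 = 1.0 - s12         # 0.5
# Rescaling a portfolio (changing its TNA) leaves both indicators unchanged.
s12_scaled = cosine_similarity(v1, [3.0 * x for x in v2])
```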

Consider two portfolios of degree ki and \(k_{i'}\) that invest in a perfectly balanced way, with uniform positions vij=pi/ki and \(v_{i' j}=p_{i'}/k_{i'}\) if stock j is in the portfolio, and 0 otherwise. Their similarity will be \(s_{i i'} = k_{i i'}/ \sqrt {k_{i}\,k_{i'}}\) where \(k_{i i'}\) is the number of common assets. For a given degree sequence \(k_{1}, k_{2}, \dots, k_{m}\) the average similarity across the network is

$$\bar{s} = \frac{2}{m\,(m-1)} \sum_{i=1}^{m}\,\sum_{i^{\prime}=i+1}^{m} s_{i i^{\prime}} = \frac{2}{m\,(m-1)} \sum_{i=1}^{m}\,\sum_{i^{\prime}=i+1}^{m} \frac{k_{i i^{\prime}}}{ \sqrt{k_{i}\,k_{i^{\prime}}}}\;. $$

Consider also a network where every portfolio invests by selecting uniformly at random the same number of stocks it owns in the real network. Since stocks are treated indifferently, it is natural to expect that such an idealized investor would invest the same amount of wealth in every stock. We will refer to this limiting scenario as an “unconstrained random holdings” network (URH). Because of the balanced positions, in this network investors maximize diversification conditionally on ki, achieving \(h_{i}=1/\sum _{j} (v_{ij}/p_{i})^{2}=1/\sum _{j'} (1/k_{i})^{2}=k_{i}\), where the second summation is limited to the stocks j′ that are actually present in the portfolio.

If we consider many realizations of a network like that, the expected value of the number of common assets will be \(\langle {k_{i i'}}\rangle \approx \frac {k_{i}\,k_{i'}}{n}\), where the approximation holds as long as the probability of choosing the same stock multiple times can be neglected. Under this approximation, the expected value of the average network similarity reads:

$$ s_{0} = \langle{\bar{s}}\rangle \approx \frac{2}{n\,m\,(m-1)} \sum_{i=1}^{m} \sqrt{k_{i}} \left(\sum_{i^{\prime}=i+1}^{m} \sqrt{k_{i^{\prime}}} \right) \;, $$

and this can be considered a benchmark value with respect to which we evaluate deviations from randomness. As for the diversification, since in the URH case the degree sequence is constrained and each portfolio achieves the maximum diversification hi=ki, we see that the average network diversification is identically equal to the average degree \(h_{0} = \bar {h} = \sum _{i} h_{i} /m\). In the following it will be useful to compare different scenarios in terms of the network’s average diversification and similarity relative to the benchmarks h0 and s0.
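A short sketch of the two benchmarks follows. It starts from the expected overlap ⟨kii′⟩≈ki ki′/n stated above, so that each pair of balanced portfolios contributes √(ki ki′)/n to the average similarity. Function names and the degree sequence are illustrative assumptions.

```python
import math

def benchmark_similarity(degrees, n):
    """s_0: expected average similarity of balanced random portfolios.

    Each pair (i, i') contributes sqrt(k_i * k_i') / n, which follows from
    s_ii' = k_ii' / sqrt(k_i k_i') with <k_ii'> ~ k_i k_i' / n.
    """
    m = len(degrees)
    roots = [math.sqrt(k) for k in degrees]
    # Sum over pairs i < i' of sqrt(k_i) sqrt(k_i'), via (sum r)^2 - sum r^2.
    pair_sum = (sum(roots) ** 2 - sum(degrees)) / 2.0
    return 2.0 * pair_sum / (n * m * (m - 1))

def benchmark_diversification(degrees):
    """h_0: average degree, since h_i = k_i for balanced portfolios."""
    return sum(degrees) / len(degrees)

degrees = [4, 4, 9]       # toy degree sequence
n = 100                   # toy number of stocks
s0 = benchmark_similarity(degrees, n)      # 16 / 300, about 0.053
h0 = benchmark_diversification(degrees)    # 17 / 3, about 5.67
```

With these toy degrees the pair sum is 2·2 + 2·3 + 2·3 = 16, so s0 = 2·16/(100·3·2).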

Portfolio rebalancing

In a perfectly balanced portfolio every position has the same value w=p/k. Conditionally on k, such a portfolio has maximum diversification h=k. If a holder, starting with an unbalanced portfolio, wants to increase her diversification without opening positions in new stocks, that is conditionally on k, she can try to rebalance her positions in the original assets, steering them toward the optimal level w=p/k. At first, some positions are larger than w, some are smaller and possibly some are exactly at the optimal level. The portfolio’s TNA can be decomposed in the following way

$$\begin{array}{*{20}l} p &= \sum_{j} v_{j} = \sum_{j\in J^{+}} v_{j} + \sum_{k\in J^{-}} v_{k} + \sum_{l\in J^{0}} v_{l} \\ &= \sum_{j\in J^{+}} (w + \Delta v_{j}) + \sum_{k\in J^{-}} (w - \Delta v_{k}) + \sum_{l\in J^{0}} w \\ &= k\,w + \sum_{j\in J^{+}} \Delta v_{j} - \sum_{k\in J^{-}} \Delta v_{k} \\ &= p + \sum_{j\in J^{+}} \Delta v_{j} - \sum_{k\in J^{-}} \Delta v_{k}\;, \end{array} $$

where J+, J− and J0 are the sets of stocks corresponding to redundant, deficient or balanced positions, while Δvj is the absolute difference between the actual position on stock j and the balanced position w. From the previous relationship it follows that

$$\sum_{j\in J^{+}} \Delta v_{j} = \sum_{k\in J^{-}} \Delta v_{k}\;, $$

which simply states that the total excess equals the total deficit. A holder can liquidate redundant positions and use the liquidity to buy more shares and balance deficient positions. In the absence of limits on the number of shares that can be bought or sold, the holder will achieve perfect diversification. Otherwise, she will end up with a suboptimal portfolio but the value of h will have increased anyway.
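The rebalancing logic can be illustrated with a simple heuristic. This is a sketch under our own assumptions, not the algorithm used in the paper: the function `rebalance`, the per-stock purchase cap `cap` (a stand-in for the value of shares still available on the market) and the proportional sale of excess positions are all hypothetical choices.

```python
def rebalance(values, cap=None):
    """Move the held positions of one portfolio toward the balanced value w = p/k.

    values: holding values v_j (zeros mark stocks that are not held).
    cap:    optional list with the maximum extra value purchasable per stock.
    """
    held = [j for j, v in enumerate(values) if v > 0]
    w = sum(values) / len(held)
    deficit = {j: w - values[j] for j in held if values[j] < w}
    excess = {j: values[j] - w for j in held if values[j] > w}
    # Buy as much of each deficient position as the cap allows.
    buy = {j: d if cap is None else min(d, cap[j]) for j, d in deficit.items()}
    bought = sum(buy.values())
    total_excess = sum(excess.values())
    out = list(values)
    for j, b in buy.items():
        out[j] += b
    for j, e in excess.items():
        out[j] -= e * bought / total_excess   # sell only what funds the purchases
    return out

def diversification(values):
    p = sum(values)
    return 1.0 / sum((v / p) ** 2 for v in values)

original = [70.0, 20.0, 10.0, 0.0]                        # h is about 1.85
free = rebalance(original)                                # about [33.3, 33.3, 33.3, 0]
capped = rebalance(original, cap=[0.0, 5.0, 5.0, 0.0])    # about [60, 25, 15, 0]
```

Without limits the portfolio reaches h=k=3. With the cap it stops at h of about 2.25, which is suboptimal but still larger than the starting value, while the TNA is unchanged in both cases.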


Portfolios can be similar with respect to how many stocks they invest in, what stocks and how much money is invested in each one. We introduce two synthetic models where the original holdings are randomized with different strategies. We will refer to them by the symbols H1 and H2 respectively, distinguishing them from the original holdings network Hor. In both of them we assume portfolio TNAs and portfolio degree sequences as given. In other words, we focus on how stocks are chosen, conditionally on the number of stocks ki and the portfolio wealth pi.

In model H1, each holder goes through a process of replacement of its original stocks with new ones chosen uniformly at random. We refer to this as a “shuffling of holdings”. The degree sequence of stocks is not preserved and the degree of each stock is approximately a binomial random variable with mean |E|/n. As discussed in Delpini et al. (2019), the degree distribution of the real assets decays slowly and there exist very popular assets owned by thousands of portfolios. In order to account for the role of these hubs, we also consider model H2. It differs from H1 in that stock replacement is performed by means of a double-edge swap strategy that preserves the degree sequence of stocks as well as that of holders. The number of swaps to be performed is fixed upfront and expressed as a fraction f of the number of holdings, that is nswap=f×nh.

In both models the ideal investor uses no information during stock selection and treats stocks as being equivalent, except for the number of available shares, which is heterogeneous and subject to changes during the allocation process. For both scenarios H1 and H2 we consider two variants. In the first one holders try to re-allocate exactly the same amount of money of each original position. This case will allow us to compare scenarios for equal portfolio concentrations and will be referred to as an “unbalanced scenario”. In the second one funds try to achieve the highest degree of diversification possible by going through the rebalancing process discussed in the previous section. In a way, this is also coherent with the idea of a random allocation: since stocks are treated equivalently, an investor will have no reason to make unbalanced positions, provided that enough shares are available from the market to buy. Rebalancing is performed after randomization: first a random network is generated where portfolio positions are exactly the same as the original ones, be it H1 or H2, and positions are rebalanced afterwards. We will consider a rebalanced network for the original topology as well and we will indicate the three balanced cases as Hor,b, H1,b and H2,b respectively to distinguish them from their original or unbalanced counterparts.

The algorithms for rebalancing and randomization that we use are required to enforce the global constraints (1) at each stage of their execution. They ensure that when holders open new positions no more shares can be bought of a stock than actually available on the market at that very moment. This aspect represents a major contribution of our work. Indeed, the simple randomization algorithms in Delpini et al. (2019) preserved portfolio TNAs and degrees, but they were otherwise unconstrained. Because no constraints were imposed on the numbers of shares, the total value of every asset in the network of portfolios was allowed to change. Those routines were instrumental in comparing the actual network topology with its random counterparts but the corresponding network models are to be considered approximations valid in the limit of infinite numbers of shares. On the contrary, the n global constraints (1) are required to be satisfied at any time here. In particular, they limit the maximum diversification holders can reach without altering their degree and this can be critical for systemic risk assessment.

Both in scenario H1 and H2, one holding at a time is replaced. When position vij is to be re-allocated, vij/sj shares of stock j are sold and \(v_{ij'}/s_{j'}\) shares of stock j′ need to be bought from the market. Since the number of outstanding shares is limited, the random re-allocation of an individual portfolio can fail. It can happen that the shares of j′ that are available on the market to buy at a given time are not sufficient to re-allocate the desired position. It can also be the case, in scenario H2, that the stock that was sampled as a replacement cannot be swapped with the original one without either violating the global constraints or changing the stocks’ degree sequence. In such cases, a failure is registered and a new stock is randomly sampled for replacement. When a limit number of consecutive failures is reached during the allocation of a given portfolio, the whole randomization process is reset and restarted from the very beginning after a reshuffle of all stocks and holders. This is done to stochastically escape configurations that would not allow full re-allocation of all portfolios and to find possible solutions that may be reached through a different sequence of holding replacements.
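A heavily simplified sketch of the constrained shuffling for the unbalanced H1 scenario is given below. It follows the verbal description (replace one holding at a time, register failures, restart after too many consecutive failures) but it is not the routine used in the paper: the function names, the parameters `max_fail` and `max_restart`, and the toy numbers are assumptions, and only the order of holders is reshuffled at each restart.

```python
import random

def reallocate(row, prices, N, in_network, rng, max_fail):
    """Replace the holdings of one portfolio in place; False on too many failures."""
    for j in [j for j, x in enumerate(row) if x > 0]:
        value = row[j] * prices[j]
        in_network[j] -= row[j]          # sell the old position to the market
        row[j] = 0.0
        fails = 0
        while True:
            jp = rng.randrange(len(row))         # candidate replacement stock
            need = value / prices[jp]            # shares needed for equal value
            if row[jp] == 0.0 and in_network[jp] + need <= N[jp]:
                row[jp] = need
                in_network[jp] += need
                break
            fails += 1
            if fails >= max_fail:
                return False
    return True

def shuffle_holdings(shares, prices, N, rng, max_fail=50, max_restart=100):
    """Random network with the same TNAs, degrees and position values."""
    m, n = len(shares), len(prices)
    for _ in range(max_restart):
        new = [row[:] for row in shares]                      # restart from scratch
        in_network = [sum(row[j] for row in new) for j in range(n)]
        order = list(range(m))
        rng.shuffle(order)
        if all(reallocate(new[i], prices, N, in_network, rng, max_fail)
               for i in order):
            return new
    raise RuntimeError("no feasible re-allocation found")

shares = [[10.0, 0.0, 5.0, 0.0],
          [0.0, 20.0, 5.0, 0.0]]
prices = [2.0, 1.0, 4.0, 1.0]
N = [20.0, 40.0, 20.0, 100.0]      # toy outstanding shares
shuffled = shuffle_holdings(shares, prices, N, random.Random(1))
```

Each replacement keeps the value of the position, so portfolio TNAs and degrees are preserved by construction, and a purchase is accepted only if the shares held inside the network stay within Nj.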


We first provide a comprehensive summary of the statistical features of the original network and the balanced and/or random scenarios, in terms of portfolio diversification and similarity. We also compare them with the unconstrained random benchmark URH introduced previously and provide a nice model representation in a two-dimensional parameter space. Then we show the results of shock propagation across the different topologies and draw conclusions about the relationship between systemic risk, the selective or unselective nature of the shocks, the degree of diversification and similarity of the portfolios and also the relative size of the network to the market.

All routines used to generate random scenarios suffer from round-off errors to some extent. This means that, even though the algorithms preserve portfolio TNAs and total network value exactly by construction, numerically they do not. In the case of unbalanced scenarios, the h index of portfolios is not preserved exactly either. We checked numerical accuracy by looking at the following quantities. As for the total network value and the TNAs, we considered the absolute differences \(M_{\text{err}} = |\sum_{ij} v_{ij} - M_{\text{or}}|\) and \(p_{\text{err}} = \max_{i=1,\dots,m} |\sum_{j} n_{ij}\,s_{j} - p_{i,\text{or}}|\). For the diversification index, we computed instead the relative error \(h_{\text{err}} = \max_{i=1,\dots,m} |h_{i}/h_{i,\text{or}} - 1|\). These errors turn out to be negligible with respect to the order of magnitude of the corresponding quantities involved in the analysis, see Table 1.

We also include in the table the average relative increase of diversification and the fraction of edges that are different from the original network. These quantities are defined as \(h_{\text{incr}} = \sum_{i} [(h_{i} - h_{i,\text{or}})/h_{i,\text{or}}]/m\) and \(E_{\text{diff}} = |E \setminus E_{\text{or}}|/|E_{\text{or}}|\) respectively. Rebalancing guarantees a large increase of diversification in all cases. On average, it allows portfolios to more than double their initial value of h. The larger c, the larger the diversification gain, as expected because a large c increases the probability for a holder to attain perfect rebalancing of her positions. As for model H2, changing f from 1 to 10 does not provide a significant diversification gain. The value of Ediff measures how much the topology of the network has been altered by the randomization procedure. We register a turnover of holdings between 68% and 89%. For given c, model H2 has a slightly lower turnover because the random selection of new holdings is limited by the requirement of preserving the stocks’ degree sequence. Setting f=10 reduces the turnover difference with respect to model H1, and this value will be assumed as the default in the following discussion of the results.
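The accuracy checks and the two summary quantities can be computed along the following lines. This is a sketch under stated assumptions: the diversification index is taken here to be the inverse Herfindahl index of portfolio weights, which should be replaced by the paper's own definition of h if it differs. The function names `diversification` and `accuracy_report` are hypothetical, position values v_ij stand in for the products n_ij s_j, and the edge turnover is computed as the fraction of edges not present in the original network.

```python
import numpy as np

def diversification(V):
    """Assumed index: inverse Herfindahl of portfolio weights, per holder."""
    w = V / V.sum(axis=1, keepdims=True)
    return 1.0 / (w ** 2).sum(axis=1)

def accuracy_report(V, V_or):
    """Compare a generated scenario V with the original V_or.

    Both arguments are (m, n) arrays of position values v_ij = n_ij * s_j.
    """
    h, h_or = diversification(V), diversification(V_or)
    E, E_or = V > 0, V_or > 0             # edge sets as boolean masks
    return {
        "M_err": abs(V.sum() - V_or.sum()),
        "p_err": np.abs(V.sum(axis=1) - V_or.sum(axis=1)).max(),
        "h_err": np.abs(h / h_or - 1.0).max(),
        "h_incr": ((h - h_or) / h_or).mean(),
        "E_diff": (E & ~E_or).sum() / E_or.sum(),
    }
```

For a pure rebalancing the edge set is untouched, so `E_diff` is zero and `h_incr` isolates the diversification gain. For an unbalanced shuffle the roles are reversed: `h_err` should be at round-off level while `E_diff` measures the turnover.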

Table 1 Numerical errors affecting the relevant quantities in the balanced and random scenarios

In Table 2 we provide a 5-number summary of the statistical distribution of h, plus its interquartile range and mean value \(\overline {h}\). The first, second and third quartiles, and the 9th and 91st percentiles are reported.
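A summary of this kind can be produced with a few lines of NumPy. The helper below is only a sketch: the name `summary` and the dictionary keys are hypothetical, and NumPy's default linear interpolation between order statistics is assumed, which may differ slightly from the convention used for the tables.

```python
import numpy as np

def summary(x):
    """9th percentile, three quartiles, 91st percentile, IQR and mean."""
    p9, q1, q2, q3, p91 = np.percentile(x, [9, 25, 50, 75, 91])
    return {
        "q(0.09)": float(p9),
        "q(0.25)": float(q1),
        "q(0.50)": float(q2),
        "q(0.75)": float(q3),
        "q(0.91)": float(p91),
        "IQR": float(q3 - q1),
        "mean": float(np.mean(x)),
    }
```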

Table 2 Summary statistics of the portfolio diversification in the real network and its counterparts in the balanced allocation scenarios

The distribution broadens after rebalancing and the mean more than doubles. Most evidently, the distribution’s right tail grows fatter and the upper percentile q(0.91) can even increase threefold depending on the randomization model and parameters. In Table 3 the same summary is provided for the distribution of the portfolio similarity. We omit the lower percentile and the first quartile because they are zero for all cases.

Table 3 Summary statistics of the portfolio similarity in the real network and its counterparts in the random and balanced allocation scenarios

Indeed, many portfolios do not overlap at all with many others and the corresponding similarity matrix is sparse. If we discard zeros, we obtain the conditioned similarity distribution that is summarized in Table 4.
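The conditioning step amounts to dropping the zero entries before summarizing. The sketch below assumes, purely for illustration, that the similarity between two portfolios is the cosine similarity of their position vectors; the paper's own similarity measure should be substituted if it differs. The names `pairwise_similarity` and `conditioned` are hypothetical.

```python
import numpy as np

def pairwise_similarity(V):
    """Similarity of every unordered pair of portfolios (assumed: cosine)."""
    U = V / np.linalg.norm(V, axis=1, keepdims=True)
    S = U @ U.T
    iu = np.triu_indices(V.shape[0], k=1)   # count each pair once
    return S[iu]

def conditioned(sim):
    """Keep only the pairs of portfolios that actually overlap."""
    return sim[sim > 0]
```

Because non-overlapping portfolios contribute exact zeros, the unconditioned mean can be much smaller than the conditioned one when the similarity matrix is sparse.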

Table 4 Summary statistics of the conditioned portfolio similarity in the real network and its counterparts in the random and balanced allocation scenarios

Shuffling of holdings suppresses average similarity the most. However, portfolio overlap is appreciably reduced in model H2 too. Portfolio rebalancing has mixed effects: it reduces the average similarity in the real network and increases it if performed after randomization of the holdings. We also note that a larger c typically results in a lower similarity because it corresponds to more outstanding shares and weaker constraints on random stock selection. Moreover, performing more swaps by increasing f in model H2 further reduces \(\bar {s}\) in the unbalanced case, but its effects are nearly undetectable after a portfolio rebalance. In Fig. 1 we summarize part of the previous information for the diversification and the conditioned similarity by means of boxplots limited to scenario c=f=10.

Fig. 1

Boxplots of the diversification and similarity distributions in the case c=f=10. As for the similarity, the conditional distribution for sij≠0 is represented. The ends of the whiskers correspond to the 9th and 91st percentiles

These plots show that portfolio rebalancing stretches the diversification distribution towards larger values, and that the real network topology and its balanced version are characterized by large similarities compared to models H1 and H2. They also show that all distributions are skewed.

It is useful to represent all models in a parameter space that is specific for the network at hand. Its coordinates are the average values of the diversification index and of the portfolio similarity across the network. We take as a benchmark for these quantities their expected values for a URH network with the same fund degree sequence as Hor, see Eq. (4) and the discussion about the perfectly balanced random holdings network model. For our case network such values are h0=157.4 and s0=0.011 respectively. The first is an exact value because it is completely determined by the funds’ degree sequence, while the second value was obtained as a sample mean of the average network similarity for 100 Monte Carlo realizations of model URH. We then consider rescaled coordinates \(\bar {h}/h_{0}\) and \(\bar {s}/s_{0}\) for all scenarios under investigation. These are useful to see at a glance how much actual values deviate from a completely random case. The original network and its balanced and randomized counterparts are then represented as in Fig. 2, where the marker size increases with the value of the scaling factor c.
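The benchmark and the rescaling can be illustrated with a small Monte Carlo sketch. It rests on several assumptions that are not taken from the text: each URH fund picks its stocks uniformly at random and weights them equally, the diversification index is the inverse Herfindahl index (so that an equal-weight portfolio of k stocks has h = k), and similarity is the cosine similarity of weight vectors. The names `urh_benchmarks` and `rescaled` are hypothetical.

```python
import numpy as np

def urh_benchmarks(degrees, n_stocks, runs=100, seed=0):
    """Estimate (h0, s0) for equal-weight random portfolios of given degrees."""
    rng = np.random.default_rng(seed)
    degrees = np.asarray(degrees)
    m = len(degrees)
    # With equal weights, 1 / sum(w^2) equals the degree, so h0 is exact.
    h0 = float(degrees.mean())
    iu = np.triu_indices(m, k=1)
    s_runs = []
    for _ in range(runs):
        A = np.zeros((m, n_stocks))
        for i, k in enumerate(degrees):
            A[i, rng.choice(n_stocks, size=k, replace=False)] = 1.0
        U = A / np.sqrt(degrees)[:, None]   # unit-norm weight vectors
        s_runs.append((U @ U.T)[iu].mean()) # average over all pairs
    return h0, float(np.mean(s_runs))

def rescaled(h_bar, s_bar, h0, s0):
    """Coordinates of a scenario relative to the random benchmark."""
    return h_bar / h0, s_bar / s0
```

A point at (1, 1) in the rescaled plane would then correspond to a network that is statistically indistinguishable from the random benchmark in both coordinates.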

Fig. 2

Representation of the holdings network Hor and its random and balanced counterparts. The coordinates of the parameter space are the normalized values of the network average diversification and similarity. Marker size reflects increasing values of the scaling factor c from c=2 to c=100. Here Hor is associated with the value c=1 just for the purpose of representation. The points aligned vertically correspond to the real network and its randomized versions H1 and H2, while the points on the right correspond to the balanced scenarios Hor,b, H1,b and H2,b

The plot supports the previous considerations about the statistical distributions of h and s and provides some more insight. In the original network and the randomized scenarios without a rebalancing, the average diversification is considerably smaller than for a URH network. The balanced models get a higher diversification that converges to the benchmark h0 for increasing c. The real network exhibits an average similarity that is more than five times larger than s0, in agreement with the findings of Delpini et al. (2019). Model H2 preserves the strongly connected stocks but still the average similarity is reduced noticeably. Model H1 has similarity close to s0. Rebalancing increases \(\bar {s}\) for both H1 and H2 and the similarity gain decreases with c. Notably, it has a slightly suppressive effect for the original network, as noted earlier. This indicates that holders correlate with respect to both the choice of which stocks to buy and their proportions, and that unselective rebalancing of all positions mitigates such correlation. We also notice that for large c, scenarios Hor,b and H2,b tend to converge to the same region of the parameter space. Overall, this plot shows that it is difficult to steer the real network toward benchmark values of both diversification and similarity: the holding reshuffling brings the average similarity closer to s0, but the rebalancing needed to raise \(\bar {h}\) comes with a new similarity gain.

A major goal of this analysis is the study of systemic fragility in the light of the global market constraints and the different diversification and similarity profiles of the considered scenarios. To this end we performed simulations of the flow-induced trading dynamics of shock propagation. This was done for three kinds of exogenous shocks corresponding to the schemes “k-targets”, “h-targets” and “all-targets” introduced in “Systemic damage” section. In Fig. 3 the damage curves of the original network and the random unbalanced scenarios are compared for varying c.

Fig. 3

Systemic damage provoked by three types of random shocks for increasing values of the scaling parameter c

The actual fragility crucially depends on how the exogenous shock is applied and on the relative market size of the network. When the most popular stocks get hit and c=10, Hor and H2 are the most risky scenarios and undergo similar damages. The shuffled network H1 is significantly safer. As H2 preserves strongly connected stocks while shuffling avoids them, we conjecture that in this case the effects of hubs dominate. When the shock affects the most owned stocks (those with the highest h), the systemic damage reduces, especially in scenarios H1 and H2, while the real network is the most risky. Under scheme “h-targets”, weights play a crucial role and the gap between Hor and H2 is most evident. However, when all stocks in the market receive a shock (“all-targets” case), we observe a reversal of the damage curves, with Hor becoming the safest configuration and H1 the most fragile. This is a major point, because market crashes can come with negative returns that are widespread across stocks. The figure also shows that the systemic damage saturates after just a few periods, which supports the use of our framework for stress testing holdings networks over short time horizons. Similar trends are observed when c=100 but with a reduced damage, as expected since stocks have larger market depths. When c=2, the network accounts for half the market and Hor still provides the most robust scenario against a shock that affects all the stocks. Unexpectedly, in this case it is also the most robust under the “k-targets” stress test.

What are the systemic effects of making portfolios less concentrated? Figure 4 shows the damage ratio for each balanced scenario to its unbalanced counterpart and provides an answer.

Fig. 4

Ratio of the systemic damage in each balanced scenario to the damage in the corresponding unbalanced configuration, with respect to the three kinds of random shocks and for increasing values of the scaling parameter c

Again, it depends on the network topology and the distribution of portfolio weights, as well as strongly on how we stress the system. For targeted shocks, rebalancing real portfolios is beneficial as long as the network is relatively small, while for c=2 a more diversified network with the real topology eventually becomes more risky. Under a widespread shock, rebalancing is always detrimental to Hor after a while. The systemic loss is between a few percentage points and 30% larger than it was for the original network, depending on c, and this negative systemic effect persists over time. Turning to random model H1, rebalancing has slightly positive effects under unselective shocks. Under targeted shocks, increasing diversification results in a more risky network, with the exception of c=10 where H1,b turns safer after a transient. Model H2 behaves qualitatively like Hor in the “k-targets” stress test, while the effects of rebalancing for this model are more similar to H1 in the other cases. Notably, when we target the most owned stocks, the negative effects of rebalancing for H2 are severe and persistent over time for all values of c. It is also worth noting that in some circumstances the effect of rebalancing depends critically on the number of periods elapsed. It can start positive and become negative after a transient, or the other way round.
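The damage ratio and the transient behaviour discussed above can be inspected with a short helper. This is a sketch with hypothetical function names; any damage curves passed to it in an example are placeholders, not results from the paper. A ratio below 1 means that rebalancing is beneficial at that period, and a ratio above 1 means that it is detrimental.

```python
import numpy as np

def damage_ratio(balanced, unbalanced):
    """Period-by-period ratio of balanced to unbalanced systemic damage."""
    return np.asarray(balanced, float) / np.asarray(unbalanced, float)

def crossing_period(ratio):
    """Index of the first period where the ratio switches side of 1.

    Returns None if rebalancing stays beneficial (or detrimental) throughout.
    """
    side = np.sign(np.asarray(ratio, float) - 1.0)
    change = np.flatnonzero(side[1:] * side[:-1] < 0)
    return int(change[0]) + 1 if change.size else None
```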

We can summarize most of these findings in the representations of Figs. 5 and 6.

Fig. 5

Systemic damage versus the normalized average diversification of the corresponding network at T=5 for c=10

Fig. 6

Systemic damage versus the normalized average similarity of the corresponding network at T=5 for c=10

In the first figure the systemic loss after half the trading periods is shown against the normalized network diversification. No evident correlation can be detected in this case and, for equal diversification, fragility is strongly mediated by the type of shock applied and the relative size of the network. In Fig. 6 the systemic loss is compared with the corresponding normalized network similarity.

We still have a strong interplay between fragility, network size and shock type. However, a positive correlation between the systemic damage and the average similarity is observed under targeted shocks, with Pearson correlation coefficients of ρk=0.3 and ρh=0.6 for the “k-targets” and “h-targets” schemes respectively. Such correlation becomes negative under unselective shocks, corresponding to ρall=−0.3. This figure delivers useful insight on the effects of investment similarities and we believe that the previous findings represent a major contribution of this work.
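For reference, the Pearson coefficient between two samples, such as the damage and the normalized similarity of the scenarios, can be computed as in the sketch below. The function name `pearson` is hypothetical and no data from the paper are reproduced here.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()     # centre both samples
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))
```

The same number is returned by `np.corrcoef(x, y)[0, 1]`. With only a handful of scenarios per shock type, coefficients of this size are best read as indicative trends rather than as precise estimates.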


Nowadays, it is of the utmost importance to quantify riskiness in financial systems, especially when investments in foreign assets can give a global reach to the propagation of financial distress. Depending on market conditions and the simultaneous activity of investors, a systemic risk component may emerge if diversification strategies are similar. Such an effect was suggested in Delpini et al. (2019), and supported by simulations of distress propagation in the holdings network of US mutual funds. The random models behind that analysis were admittedly simple. It was assumed that funds can reallocate their positions freely, without any constraints on the number of shares available from the market. Also, that analysis did not take into consideration the possibility for asset managers to increase diversification by balancing positions without modifying the network’s topology.

In this paper, we improved and extended the previous study of a bipartite network of portfolio holdings. We adopted a more general point of view and modeled the network as a subsystem of the whole market. We explicitly took into account the global constraints posed by the limited numbers of shares in the market. This number varies from stock to stock and accounts for heterogeneity in stock liquidity characteristics. We exploited more sophisticated algorithms that are required to satisfy such global constraints at any time of their execution. By their means, we could generate synthetic scenarios of both balanced and unbalanced portfolios and perform an extensive computational analysis of how such scenarios react to different types of exogenous shocks. We considered a dynamics of distress propagation where the numbers of shares determine the sensitivity of each stock’s price. Such parameters are the same for the original network and its random counterparts, which now allows a more consistent comparison. For every scenario we also simulated its balanced counterpart, representing a case where asset managers try to rebalance all their positions in the quest for a higher degree of diversification.

We found that randomization of the original holdings for fixed portfolio diversification has a strong suppressive effect on similarity even under the global constraints. This effect becomes stronger as the relative market value of the network decreases. We showed that a large increase of the average diversification can be achieved by portfolio rebalancing, even for a moderate number of outstanding shares. We also provided a convenient representation of scenarios in a diversification–similarity space of coordinates relative to an unconstrained random model of balanced investments. With respect to such benchmark, the real holdings network has small average diversification but strong average portfolio similarity, and a significant increase in both differentiation and diversification cannot be achieved trivially.

We then performed an extensive comparison of the different scenarios in terms of the systemic damage from an exogenous shock applied to the corresponding network. We considered both the case of targeted shocks to the most popular or most owned stocks, and a widespread random shock to the prices of all stocks. Results show that there is an interplay between diversification and investment differentiation, which varies across the different network topologies. The network’s fragility can depend to a large extent both on the way shocks are applied and on the relative size of the network, and results can be unexpected to a degree. Overall, a correlation is found between systemic risk and portfolio similarity. Such correlation is positive under targeted shocks and negative under unselective ones. A rebalancing of the real portfolios makes the network less prone to large damages under targeted shocks, provided that the network can be considered small with respect to the market. Remarkably, that also makes it riskier under widespread shocks. This means that increasing diversification may be detrimental in systemic terms depending on market conditions. An initial transient where rebalancing changes from beneficial to detrimental, or the opposite, can be observed. Additionally, our results show that holdings reshuffling does not automatically result in a safer network. In particular, in a scenario where all stocks may undergo a negative downturn, the real network provides a safer environment regardless of its relative size.

We believe that our findings can be of major interest when it comes to assessing systemic risk properly and for informing effective policy actions. As a further perspective, the effects of diversification-increasing strategies that change the holders’ degree sequence could be investigated. Also, a higher-statistics Monte Carlo study of each random scenario for a fine-grained grid of model parameters would provide a better coverage and broader insight.

Availability of data and materials

The findings of this study are based on data from the Survivor-Bias-Free US Mutual Funds database 2012 Center for Research in Security Prices (CRSP), The University of Chicago Booth School of Business. Restrictions apply to the availability of these data, which were used under license and so are not publicly available.


  1. The algorithms may be adapted to perform random allocation and rebalancing simultaneously. This increases algorithmic complexity and execution times and more importantly makes it difficult to guarantee a diversification increase upfront.



TNA: Total net assets

URH: Unconstrained random holdings

USD: US dollar






DD acknowledges funding “Fondo di Ateneo per la Ricerca 2019” provided by the University of Sassari.

Author information




DD devised the modeling aspects and the visualizations, wrote the algorithms, performed the calculations and the simulations. All authors conceived the idea, analyzed and interpreted the results, and wrote the manuscript. The author(s) read and approved the final manuscript.

Corresponding author

Correspondence to Danilo Delpini.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


Cite this article

Delpini, D., Battiston, S., Caldarelli, G. et al. Portfolio diversification, differentiation and the robustness of holdings networks. Appl Netw Sci 5, 37 (2020).
