Information cascade final size distributions derived from urn models
Applied Network Science volume 8, Article number: 30 (2023)
Abstract
Bipolarization is a phenomenon in which either a large or very small information cascade appears randomly when the retweet rate is high. This phenomenon, which has been observed only in simulations, has the potential to significantly advance the prediction of final cascade sizes because forecasters need only focus on the two peaks in the final cascade size distribution rather than considering the effects of various details, such as network structure and user behavioral patterns. The phenomenon also suggests the difficulty of identifying factors that lead to the emergence of large-scale cascades. To verify the existence of bipolarization, this paper theoretically derives mathematical expressions of the cascade final size distribution using urn models, which simplify the diffusion behavior of actual online social networks. Under the assumption of infinite network size, the distribution exhibits power-law behavior, consistent with the results of existing diffusion models and previous analyses of Twitter data. Under the assumption of finite network size, bipolarization is observed.
Introduction
A large-scale information cascade is a phenomenon in which attractive (viral) content spreads to a large number of online social network (OSN) users. The problem of predicting the final size of a cascade while it is still small has been studied for more than twenty years, where the size is the number of users who have shared the information. Some prediction approaches focus on community properties (Weng et al. 2013, 2014; Junus et al. 2015; Bao et al. 2017). Other approaches focus on the details of the cascade and network structures (Zhao et al. 2015; Yu et al. 2015; Li et al. 2015; Krishnan et al. 2016; Cheung et al. 2017). Recent approaches often employ deep neural network technologies (Bourigault et al. 2016; Wang et al. 2017, 2018; Horawalavithana et al. 2020).
If the prediction problem can be solved, then (1) content delivery systems can be made more efficient by moving viral content to servers close to the viewers, (2) information that people are interested in can be used more quickly for stock investments and product development, and (3) prediction technologies can be applied to viral marketing, which creates large-scale cascades on purpose.
Large-scale cascades rarely occur (Goel et al. 2012; Cheng et al. 2014), so available datasets for their prediction are currently insufficient. As such, it is not always easy to apply machine learning techniques and verify the statistical significance of proposed methods. Meanwhile, simulation-based forecasting research is progressing. A phenomenon in which large and very small cascades appear randomly under a high retweet rate has been reported in Oida (2021). This phenomenon is referred to as bipolarization because the two peaks of the cascade size distribution move apart as the retweet rate increases.
Bipolarization is universal in that it emerges regardless of network topologies (small-world (Watts and Strogatz 1998), scale-free (Barabási and Albert 1999), Erdős–Rényi (Batagelj and Brandes 2005), and existing OSNs), existence of various types of communities (Forestier et al. 2015; Baldesi et al. 2018), and user behavioral patterns (social reinforcement (Weng et al. 2013, 2014), user response times (Zhou et al. 2017; Xie et al. 2011), and repeated exposure (Zhou et al. 2015)). This phenomenon was observed in event-driven simulations and was reproduced by the urn model, which is a model for simplifying the Twitter-type information diffusion mechanism (Oida 2021). Bipolarization appears when the network size is finite.
This paper mathematically formulates the urn model to theoretically verify this novel discovery. The contributions of this paper can be summarized as follows:

1. This paper deals analytically with the case where the network size is finite (although the case where the size is infinite is much simpler). This is a realistic approach because existing network sizes are all finite. The effects of network properties and user behavior can be rigorously assessed through partially modifying the derived equations in this paper.

2. This paper is the first to theoretically prove the bipolarization phenomenon.

3. The formula for infinite network size can be applied to the case where most of the cascades are sufficiently smaller than the network size. The cascade size distribution derived from the formula shows a power law over a certain range. This is consistent not only with the results of previous Twitter data analyses (Bakshy et al. 2011) (because most existing cascades are small) but also with those of existing diffusion models (Wegrzycki et al. 2017; Gleeson et al. 2020).
The remainder of this paper is organized as follows. The "Related work" section describes related studies. The "Proposed model" section introduces the urn model. The "Formulation" section formulates the model as a Markov chain. The "Calculation of \(P(B_{\tau })\)" section proves several propositions derived from the Markov chain. The "Numerical results" section numerically evaluates the derived equations to investigate the shape of the cascade final size distributions under the finite and infinite network size assumptions. The "Discussion" section discusses the implications of the findings, and finally, the "Conclusions" section concludes the paper.
Related work
The term “urn models” generally refers to the systems of one or more urns containing objects of various types (colored balls in the usual setting) (Mahmoud 2008). The systems evolve in time, subject to rules of drawing balls and throwing balls into the urns. These models have helped reveal various phenomena, including the evolution of species in biology, particle systems in chemistry, and formation of social networks in sociology (Mahmoud 2008).
A variety of urn models have been proposed for years, especially in the field of mathematics (Pemantle 2007). In recent reinforcement models, one or more balls are extracted and returned to the urn with additional balls. The number and colors of the additionally returned balls are determined by the colors of extracted ones (Rafik et al. 2019; Crimaldi et al. 2022). These studies focus primarily on asymptotic properties, such as the limit value of the proportion of balls of a certain color in an urn. The model in this paper is simple in that one ball is extracted and one ball is returned; therefore, it is not a reinforcement model. The uniqueness of this study is that a ball to be returned may not be an extracted one and that the number of trials of extracting a ball is increased, not the number of balls to be returned.
The reinforcement model is also used to model various information diffusion phenomena or to predict them through numerical computation. In Hino et al. (2016), Pólya’s urn model was introduced to observe phase transitions in the information cascade. The model was also used for reproducing trajectories of innovation diffusion (Dosi et al. 2019) and for predicting statistical laws for the rate at which novelties (e.g., discovery of new songs and ideas) happen through social interaction (Tria et al. 2014; Di Bona et al. 2022).
Let us next discuss previous studies focusing on information cascade sizes. The authors of Wegrzycki et al. (2017) presented a theoretical proof that the sizes of cascades generated by the cascade generation model (CGM) (Leskovec et al. 2007) follow a power-law distribution. In Gleeson et al. (2020), information spread was modeled with a branching process, which also presented power-law behavior over a limited range. Both approaches are practical in that their models were effective for fitting empirical data. However, they did not clearly quantify the effects of finiteness of the network size on cascade sizes. This paper extends their work by articulating the impact of the finiteness from a different perspective.
Proposed model
Figure 1(left) shows the urn model that represents the mechanism of spreading retweets to followers. Table 1 describes the symbols in the model and corresponding OSN quantities. There are N balls in the urn, and each is either black or white. The OSN quantity corresponding to a black (white) ball is a user who has (not) posted a retweet. Thus, the initial condition is that all balls are white. The model repeats a trial in which a ball is randomly extracted from the urn and then a ball is returned to the urn until \(p=0\), where p is the number of remaining trials, and its initial value is \(f (>0)\). As shown in Table 1, p corresponds to the number of retweet messages that have not been read yet and f the number of followers.
In the OSN diffusion mechanism, a user posts a retweet with probability \(\lambda\) after reading an arriving retweet if the user has not posted a retweet yet. In this case, the number of unread retweets p increases by \(f-1\) (because the user has read the retweet and the number of followers is f), and the number of users who have retweeted B increases by one. If a retweet message arrives at a user who has already posted a retweet or at a user who decides not to post a retweet, p decreases by one and B is unchanged. The number B at \(p=0\) corresponds to the final size of the cascade.
The urn model in Fig. 1(left) follows the above-mentioned OSN diffusion mechanism. If a white ball is taken out of the urn, a black ball is returned to the urn with probability \(\lambda\) (this ball-swapping rule represents the process in which a user who has not retweeted turns into a user who has retweeted). If this happens, p is incremented by \(f-1\). In all other cases, the extracted ball is returned and p is decreased by one.
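The trial rule described above can be sketched as a short Monte Carlo simulation. This is a minimal illustration that assumes a constant follower count f; the function names `simulate_final_size` and `empirical_distribution` are hypothetical and are not taken from the original study.

```python
import random

def simulate_final_size(N, f, lam, rng):
    """Run one urn-model cascade and return the final number of black balls B."""
    black = 0   # B: users who have retweeted
    p = f       # remaining trials (unread retweets); initial value f > 0
    while p > 0:
        # A uniformly drawn ball is white with probability (N - black) / N.
        drew_white = rng.random() < (N - black) / N
        if drew_white and rng.random() < lam:
            black += 1      # the white ball is replaced by a black one
            p += f - 1      # one retweet was read, f new unread retweets appear
        else:
            p -= 1          # the extracted ball is returned unchanged
    return black

def empirical_distribution(N, f, lam, runs, seed=0):
    """Estimate P(B_tau = k), k = 0..N, from independent simulated cascades."""
    rng = random.Random(seed)
    counts = [0] * (N + 1)
    for _ in range(runs):
        counts[simulate_final_size(N, f, lam, rng)] += 1
    return [c / runs for c in counts]
```

Calling `empirical_distribution(N, f, lam, runs)` for increasing `lam` gives a rough empirical view of how the final size distribution changes; the number of runs needed for a smooth histogram is left to the reader.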
Formulation
This section defines a stochastic process from the urn model. Let \(p_n\) and \(B_n\) be p and B immediately after the nth trial, respectively. The stopping time \(\tau\) of the trial is defined as
\[\tau := \min \{n \ge 1 : p_n = 0\}. \tag{1}\]
\(B_{\tau }\) is determined by the Markov chain \(X_n=(B_n,p_n)\) given by
\[B_n = B_{n-1} + \Delta _{n}, \tag{2}\]
\[p_n = p_{n-1} + \Delta ^{\prime }_{n}, \tag{3}\]
where \(\Delta _{n}\in \{0,1\}\) and \(\Delta ^{\prime }_{n}\in \{-1,f-1\}\) represent increases in B and p just after the nth trial, respectively. Note from the previous section that \(\{\Delta _{n}=0\}=\{\Delta ^{\prime }_{n}=-1\}\), \(\{\Delta _{n}=1\}=\{\Delta ^{\prime }_{n}=f-1\}\), and
\[P(\Delta _{n+1}=1 \mid B_n=i) = \frac{N-i}{N} \lambda . \tag{4}\]
This paper assumes that \(\lambda\) is constant and f may vary with time. Let \(f_k\) be the kth value of f (\(f_0\) is the initial value of p). In the OSN setting, \(f_k\) corresponds to the number of followers of the user who posted the kth retweet. Let
\[I_k := ({\bar{f}}_{k-1}, {\bar{f}}_k], \quad k \ge 1, \tag{5}\]
where \(\bar{f_k}:=\sum _{i=0}^{k}f_i\) and \(I_0:=(0, {\bar{f}}_0]\). Figure 1(right) shows one of the cases where \(B_{\tau }=5\) occurs. This event occurs if and only if events \(\Delta _n=1\) do not occur at \(n \in I_5\), they occur five times at \(n \in \cup _{k=0}^4 I_{k}\), at least four times at \(n \in \cup _{k=0}^3 I_{k}\), at least three times at \(n \in \cup _{k=0}^2 I_{k}\), at least twice at \(n \in \cup _{k=0}^1 I_{k}\), and at least once at \(n \in I_{0}\).
Proposition 1
For any integer \(k\ge 0\),
\[\{B_{\tau }=k\} = \{\tau =\bar{f_k}\}. \tag{6}\]
Proof
As shown in Fig. 1(right), event \(p=0\) occurs only at the end of one of the intervals \(I_0, I_1, I_2, \ldots\). If \(B_{\tau }=k\), the trials are not conducted at \({\bar{f}}_j\), \(j>k\), and \(\tau\) is not smaller than \(\bar{f_k}\) because if \(\tau ={\bar{f}}_j\), \(j<k\), \(B_{\tau }\) must be smaller than k. Thus, \(\{B_{\tau }=k\} \subset \{\tau =\bar{f_k}\}\). If \(\tau =\bar{f_k}\), \(B_{\bar{f_k}}=k\). Thus, \(\{\tau =\bar{f_k}\} \subset \{B_{\tau }=k\}\). \(\square\)
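Proposition 1 can be checked on a concrete sequence by replaying the chain \(X_n=(B_n,p_n)\). The sketch below is illustrative only: the helper name `final_size_from_events` and the follower counts in the example are hypothetical, and the update rule assumes that the kth event adds \(f_k-1\) to p.

```python
from itertools import accumulate

def final_size_from_events(deltas, f):
    """Replay a 0/1 sequence Delta_1, Delta_2, ... and return (tau, B_tau).

    f[k] is the follower count f_k, and f[0] is the initial value of p.
    """
    B, p = 0, f[0]
    for n, delta in enumerate(deltas, start=1):
        if delta == 1:
            B += 1
            p += f[B] - 1   # the B-th retweet is read once and reaches f_B followers
        else:
            p -= 1          # an unread retweet is consumed without a new retweet
        if p == 0:
            return n, B
    raise ValueError("sequence ended before p reached zero")

f = [2, 3, 1, 2]                  # hypothetical f_0, f_1, f_2, f_3
cum_f = list(accumulate(f))       # cumulative sums f_0 + ... + f_k
tau, B = final_size_from_events([1, 0, 1, 0, 0, 0], f)
```

In this example the chain stops at `tau == cum_f[B]`, that is, at the end of interval \(I_B\), which is the statement of Proposition 1 for this particular sequence.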
Calculation of \(P(B_{\tau })\)
Infinite network size
This section calculates the probability of \(B_{\tau }\) when the total number of balls N (referred to as the network size) is infinite. Proposition 1 indicates that \(\{B_{\tau }=k\}\) depends only on the first \(\bar{f_k}\) trials \(\Delta _1,\ldots ,\Delta _{\bar{f_k}}\). Because \(\Delta _{n}\in \{0,1\}\), \(\omega =(\Delta _1,\ldots ,\Delta _{\bar{f_k}})\) takes one of \(2^{\bar{f_k}}\) different binary sequences. Let \(\left( \begin{array}{c}{a} \\ {b\,\,c}\end{array}\right)\) be the number of events \(\omega\) that satisfy the condition that a trials generate b events \(\Delta _{n}=1\) and \(\tau =c\). Then,
If N is infinite, the right-hand side of (4) is \(\lim _{N\rightarrow \infty } \frac{N-i}{N} \lambda =\lambda\), so \(P(\Delta _{n+1}=1 \mid B_n )\) is constant \(\lambda\) regardless of \(B_n\). Thus, by using \(X_0=(0,f_0)\),
and for \(k \ge 0\),
\[P(B_{\tau }=k) = \left( \begin{array}{c}{\bar{f_k}} \\ {k\;\;\bar{f_k}}\end{array}\right) \lambda ^{k} (1-\lambda )^{\bar{f_k}-k}, \tag{9}\]
where \(\left( \begin{array}{c}{\bar{f_0}} \\ {0\;\;\bar{f_0}}\end{array}\right) =|\{\omega \mid B_{\tau }=0\}|=1\).
Proposition 2
For any integer \(k>0\),
Proof
From Proposition 1,
Because \(p_n\) becomes zero only at \(n=\bar{f_0}, \bar{f_1}, \bar{f_2}, \ldots\),
Note that \(\{\tau ={\bar{f}}_m\}\), \(m=1,2,\ldots ,k\), are disjoint sets. From (11) and (12),
Because
(10) holds. \(\square\)
Proposition 3
For any integers k and m satisfying \(0< m < k\),
\[\left( \begin{array}{c}{{\bar{f}}_{k-1}} \\ {k\;\;{\bar{f}}_m}\end{array}\right) = \left( \begin{array}{c}{{\bar{f}}_m} \\ {m\;\;{\bar{f}}_m}\end{array}\right) \left( {\begin{array}{c}{\bar{f}}_{k-1}-{\bar{f}}_m\\ k-m\end{array}}\right). \tag{15}\]
Proof
\(\left( \begin{array}{c}{{\bar{f}}_{k-1}} \\ {k\;\;{\bar{f}}_m}\end{array}\right)\) is the number of binary sequences \((\Delta _1,\ldots ,\Delta _{{\bar{f}}_{k-1}})\) in which events \(\Delta _{n}=1\) occur k times and \(\tau ={\bar{f}}_m\). The number of binary sequences \((\Delta _1,\ldots ,\Delta _{{\bar{f}}_m})\) satisfying \(\tau ={\bar{f}}_m\) is \(\left( \begin{array}{c}{{\bar{f}}_m} \\ {m\;\;\bar{f}_m}\end{array}\right)\), and the number of sequences \((\Delta _{\bar{f}_m+1},\ldots ,\Delta _{{\bar{f}}_{k-1}})\) in which events \(\Delta _n=1\) occur \(k-m\) times is \(\left( {\begin{array}{c}{\bar{f}}_{k-1}-{\bar{f}}_m\\ k-m\end{array}}\right)\). \(\square\)
Propositions 2 and 3 indicate that \(\left( \begin{array}{c}{\bar{f_k}} \\ {k\;\;\bar{f_k}}\end{array}\right)\) is obtained by using \(\left( \begin{array}{c}{{\bar{f}}_m} \\ {m\;\;{\bar{f}}_m}\end{array}\right)\), \(m=0,1,\ldots ,k-2\), where \(\left( \begin{array}{c}{f_0} \\ {0\;\;f_0}\end{array}\right) =1\). Therefore, from (9), \(P(B_{\tau }=k)\) can be derived in ascending order of k.
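The ascending-order computation can be sketched as follows. The sketch rests on two assumptions that follow from the counting argument but are stated here in our own notation: writing \(a_k\) for the count \(\left( \begin{array}{c}{\bar{f_k}} \\ {k\;\;\bar{f_k}}\end{array}\right)\), it assumes \(a_k = \binom{{\bar{f}}_{k-1}}{k} - \sum _{m=0}^{k-2} a_m \binom{{\bar{f}}_{k-1}-{\bar{f}}_m}{k-m}\) with \(a_0=1\), and \(P(B_{\tau }=k) = a_k \lambda ^k (1-\lambda )^{\bar{f_k}-k}\) for infinite N. The function names are hypothetical.

```python
import math
from itertools import accumulate

def sequence_counts(f, k_max):
    """a[k] = number of 0/1 sequences of length fbar_k with k ones and tau = fbar_k."""
    cum_f = list(accumulate(f))              # cum_f[k] = f_0 + ... + f_k
    a = [1]                                  # a[0] = 1: the all-zero sequence
    for k in range(1, k_max + 1):
        # All placements of k ones within the first fbar_{k-1} trials ...
        total = math.comb(cum_f[k - 1], k)
        # ... minus those whose chain already stopped at fbar_m, m = 0..k-2.
        stopped_early = sum(
            a[m] * math.comb(cum_f[k - 1] - cum_f[m], k - m)
            for m in range(k - 1)
        )
        a.append(total - stopped_early)
    return a

def final_size_pmf_infinite(f, lam, k_max):
    """P(B_tau = k) for k = 0..k_max when N is infinite (requires 0 < lam < 1)."""
    cum_f = list(accumulate(f))
    a = sequence_counts(f, k_max)
    pmf = []
    for k in range(k_max + 1):
        # Work with logarithms: a[k] is a huge integer while lam**k is tiny.
        log_p = (math.log(a[k]) + k * math.log(lam)
                 + (cum_f[k] - k) * math.log1p(-lam))
        pmf.append(math.exp(log_p))
    return pmf
```

For constant \(f=2\), `sequence_counts([2] * 4, 3)` returns `[1, 2, 5, 14]`. The list `f` must contain at least `k_max + 1` follower counts.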
Finite network size
Let us next consider the case where N is finite, and let \(\lambda _i:=\frac{N-i}{N} \lambda\). In this case, probability \(P(B_{\tau }=k)\) becomes a complex formula, and its calculation time rises sharply as k increases. Therefore, the upper and lower bounds of \(P(B_{\tau }=k)\) are derived as a first step.
Assume that events \(\Delta _n=1\) occur k times at \(n=n_1,n_2,\dots ,n_{k}\), where \(1 \le n_1<n_2<\cdots <n_{k} \le \bar{f}_{k-1}\). Let \(c=\left( \begin{array}{c}{\bar{f_k}} \\ {k\;\;\bar{f_k}}\end{array}\right) \prod _{i=0}^{k-1}\lambda _i\) and
\[L(n_1,\ldots ,n_{k}) := \prod _{j=0}^{k} (1-\lambda _j)^{n_{j+1}-n_j-1}, \quad n_0:=0,\; n_{k+1}:=\bar{f_k}+1. \tag{18}\]
Proposition 4
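For \(k>0\), \(P(B_{\tau }=k)\) satisfies
\[c\,L({\bar{f}}_0,{\bar{f}}_1,\ldots ,{\bar{f}}_{k-1}) \le P(B_{\tau }=k) \le c\,L(1,2,\ldots ,k). \tag{19}\]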
Proof
Let \(\omega ^*\) (\(\omega _*\)) be one of the \(\omega\) values in \(\{\omega \mid B_{\tau (\omega )}=k\}\) that maximizes (minimizes) probability \(P(\omega )\). From (7),
In the following, \(P(\omega ^*)\) and \(P(\omega _*)\) are derived. From (4),
which is independent of \((n_1,\dots ,n_{k})\). For any \(\omega\), \(P(\omega )\) is given as
From (4), for any k-tuple \((n_1,\ldots ,n_{k})\), \(L(n_1,\ldots ,n_{k})\) strictly decreases as \(n_i \in \{n_1,\ldots ,n_{k}\}\) increases. Note that \(B_{\tau }=k\) if \((n_1,\dots ,n_{k})\) is equal to \((1,2,\ldots ,k)\) or \(({\bar{f}}_0,\bar{f}_1,\ldots ,{\bar{f}}_{k-1})\), and \(B_{\tau }\ne k\) if \(n_j>\bar{f}_{j-1}\) for some j, \(1\le j \le k\). Accordingly, L is minimized (maximized) if events occur at \((n_1,\dots ,n_{k})=({\bar{f}}_0,\bar{f}_1,\ldots ,{\bar{f}}_{k-1})\) (\((n_1,\dots ,n_{k})=(1,2,\ldots ,k)\)). \(\square\)
Note that
This paper further improves the upper and lower bounds in (19) by using the following propositions. Let
Proposition 5
For \(0< j \le k\), L defined in (18) satisfies
Proof
Similarly,
\(\square\)
Proposition 5 implies that an increase in \(n_j\) by one results in a decrease in L by \(\delta (N,\lambda ,j) \times 100\%\). The function \(\delta\) rises (falls) with an increase in \(\lambda\) (N or j). Because N and \(\lambda\) are constant in this urn model, j determines \(\delta\). From (25), however, the impact of j on \(\delta\) may be small because \(j < N\) and \(\lambda \ll 1\).
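The effect described by Proposition 5 can be illustrated numerically. The sketch below assumes specific forms that are consistent with the surrounding text but should be read as assumptions: L is taken as the product over j of \((1-\lambda _j)\) raised to the number of non-event trials made while \(B=j\), and \(\delta (N,\lambda ,j)\) is taken as \(\lambda /(N-(N-j)\lambda )\). The parameter values in the example are arbitrary.

```python
def L(ns, fbar_k, N, lam):
    """Assumed form of L: probability that all trials other than n_1 < ... < n_k
    leave B unchanged, given that exactly those k trials are events."""
    edges = [0, *ns, fbar_k + 1]      # n_0 := 0 and n_{k+1} := fbar_k + 1
    value = 1.0
    for j in range(len(ns) + 1):
        lam_j = (N - j) / N * lam     # event probability while B = j
        value *= (1.0 - lam_j) ** (edges[j + 1] - edges[j] - 1)
    return value

def delta(N, lam, j):
    """Assumed closed form of the relative decrease in L when n_j grows by one."""
    return lam / (N - (N - j) * lam)

# Constant f = 3 and k = 3, so fbar_k = 12; shift the second event from n = 4 to n = 5.
base = L((1, 4, 6), 12, 100, 0.1)
moved = L((1, 5, 6), 12, 100, 0.1)
relative_decrease = 1.0 - moved / base
```

Under these assumptions `relative_decrease` agrees with `delta(100, 0.1, 2)`, and `delta` rises with `lam` and falls with `N` or `j`, as stated above.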
Let \(L^*:=L(1,2,\ldots ,k)\) and \(L_*:=L({\bar{f}}_0,{\bar{f}}_1,\ldots ,\bar{f}_{k-1})\), and let us abbreviate \(\delta (N,\lambda ,j)\) as \(\delta _j\). Proposition 5 yields the following.
Proposition 6
Proof
By iteratively applying (26),
(29) is obtained in the same way. \(\square\)
Let \(\Lambda _j:= \{j,j+1,\ldots , {\bar{f}}_{j-1}\}\) and \(\Lambda '_j:= \{{\bar{f}}_{j-1}+1,{\bar{f}}_{j-1}+2,\ldots \}\).
Proposition 7
Assume that events \(\Delta _n=1\) occur at \(n=n_1,n_2,\dots\), where \(0<n_1<n_{2}<\cdots\).
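\(B_{\tau }=k\) is equivalent to
\[n_j \in \Lambda _j \;\; (j=1,2,\ldots ,k) \quad \text{and} \quad n_{k+1} \in \Lambda '_{k+1}. \tag{30}\]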
Proof
Assume that (30) does not hold. This happens when at least one of \(n_j\), \(1\le j \le k\), satisfies \(n_j \notin \Lambda _j\) or when \(n_{k+1} \notin \Lambda '_{k+1}\). The condition \(n_j \notin \Lambda _j\) implies \(n_j > {\bar{f}}_{j-1}\) because \(n_j \ge j\) due to \(0<n_1<n_{2}<\cdots\). If \(n_j > {\bar{f}}_{j-1}\), \(\tau \le \bar{f}_{j-1}\); therefore, \(B_{\tau }\le j-1 <k\). This result does not depend on \(n_{k+1} \in \Lambda '_{k+1}\). If \(n_i \in \Lambda _i\) for \(i=1,2,\ldots ,k\) and \(n_{k+1} \notin \Lambda '_{k+1}\), \(p_{\bar{f}_{i}}> 0\) for \(i=0,1,\ldots ,k\); therefore, \(B_{\tau }>k\).
Assume conversely that \(B_{\tau }\ne k\). If \(B_{\tau }=j<k\), \(\tau ={\bar{f}}_{j}\); therefore, \(n_{j+1}>{\bar{f}}_{j}\), which is inconsistent with (30). If \(B_{\tau }>k\), p must satisfy \(p_{{\bar{f}}_{k}}> 0\); accordingly, \(n_{k+1}<{\bar{f}}_{k}\), which is inconsistent with (30). \(\square\)
Proposition 7 and (7) imply that \(|{\bar{\Lambda }}_k|=\left( \begin{array}{c}{\bar{f_k}} \\ {k\,\,\bar{f_k}}\end{array}\right)\), where
Accordingly, \(P(B_{\tau }=k)\) is exactly given as
The improved upper and lower bounds, denoted as \(B^*\) and \(B_*\), respectively, are given by
where
The upper and lower bounds in Proposition 4 correspond to the case of \(\epsilon =1\), at which \(B^*\) (\(B_*\)) takes the maximum (minimum) value. \(P(B_{\tau }=k)=B^*=B_*\) when \(\epsilon =0\). The computation time rises sharply as \(\epsilon\) decreases, so \(\epsilon\) should be decreased gradually from one.
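For small N, the exact distribution can also be cross-checked without enumerating event times, by propagating the Markov chain \(X_n=(B_n,p_n)\) forward in time. Note that this is a different technique from the bounds above, not the method used in this paper. The sketch assumes a constant follower count f, so that \(p_n=(B_n+1)f-n\) and the chain can stop with \(B_{\tau }=B\) only at \(n=(B+1)f\); the function name is hypothetical.

```python
def final_size_pmf_finite(N, f, lam):
    """Exact P(B_tau = k), k = 0..N, for finite N and constant f."""
    alive = [0.0] * (N + 1)   # alive[B] = P(B_n = B and the chain has not stopped)
    alive[0] = 1.0
    pmf = [0.0] * (N + 1)
    for n in range(1, (N + 1) * f + 1):
        nxt = [0.0] * (N + 1)
        for B, prob in enumerate(alive):
            if prob == 0.0:
                continue
            lam_B = (N - B) / N * lam          # P(Delta_n = 1 | B_{n-1} = B)
            if B < N:
                nxt[B + 1] += prob * lam_B     # a white ball turns black
            nxt[B] += prob * (1.0 - lam_B)     # the extracted ball is returned
        if n % f == 0:
            B = n // f - 1                     # the only state with p_n = 0 at time n
            pmf[B] += nxt[B]
            nxt[B] = 0.0
        alive = nxt
    return pmf

pmf = final_size_pmf_finite(50, 5, 0.5)
```

The running time grows roughly as \(N^2 f\), so this check is practical only for networks far smaller than real OSNs. The returned list can be inspected for a second peak away from \(k=0\).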
Approximation
The calculations of \(B^*\) and \(B_*\) in (33) and (34) still require large computational capacity. This subsection approximates \(P(B_{\tau }=k)\) in (32). Because \(|{\bar{\Lambda }}_k|=\left( \begin{array}{c}{\bar{f_k}} \\ {k\;\;\bar{f_k}}\end{array}\right)\), a part of the right-hand side of (32) can be approximately given by
where \(L_1, L_2, \ldots\) are values of the function L calculated using randomly selected k-dimensional coordinates \((n_1,n_2,\ldots ,n_k) \in {\bar{\Lambda }}_k\). Both sides in (37) coincide if \(m=\left( \begin{array}{c}{\bar{f_k}} \\ {k\;\;\bar{f_k}}\end{array}\right)\). The values of function L are obtained from (28) or (29).
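The random-sampling idea can be sketched as follows, under several assumptions that go beyond the text: f is constant, the coordinates are drawn uniformly from \({\bar{\Lambda }}_k\) by rejection sampling (the text says only that they are randomly selected), L has the product form used in the code comment, and the estimate is the number of admissible tuples times \(\prod _{i=0}^{k-1}\lambda _i\) times the sample mean of L. All function names are hypothetical.

```python
import math
import random

def L_value(ns, fbar_k, N, lam):
    # Assumed form of L: product over j of (1 - lam_j) ** (non-event trials while B = j).
    edges = [0, *ns, fbar_k + 1]
    value = 1.0
    for j in range(len(ns) + 1):
        lam_j = (N - j) / N * lam
        value *= (1.0 - lam_j) ** (edges[j + 1] - edges[j] - 1)
    return value

def count_tuples(k, f):
    # Number of tuples n_1 < ... < n_k with j <= n_j <= j * f (constant f).
    ways = {0: 1}                     # ways[x] = valid prefixes whose last entry is x
    for j in range(1, k + 1):
        ways = {x: sum(w for y, w in ways.items() if y < x)
                for x in range(j, j * f + 1)}
    return sum(ways.values())

def sample_tuple(k, f, rng):
    # Rejection sampling: uniform k-subsets of {1, ..., k*f}, kept if n_j <= j*f for all j.
    while True:
        ns = sorted(rng.sample(range(1, k * f + 1), k))
        if all(n <= (j + 1) * f for j, n in enumerate(ns)):
            return ns

def approx_pmf(N, f, lam, k, m, seed=0):
    """Monte Carlo estimate of P(B_tau = k) from m sampled event-time tuples."""
    rng = random.Random(seed)
    fbar_k = (k + 1) * f
    mean_L = sum(L_value(sample_tuple(k, f, rng), fbar_k, N, lam)
                 for _ in range(m)) / m
    prod_lam = math.prod((N - i) / N * lam for i in range(k))
    return count_tuples(k, f) * prod_lam * mean_L
```

Because every sampled value of L lies between \(L_*\) and \(L^*\), the estimate always falls between the lower and upper bounds of Proposition 4. Rejection sampling becomes slow for large k, so a practical implementation would need a more efficient sampler.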
Numerical results
This section numerically evaluates analytical results derived so far under the condition that follower sizes are constant, i.e., \(f_1=f_2=\cdots =f_k=f\).
Infinite network size
This subsection discusses the case of \(N=\infty\). Figure 2 shows \(P(B_{\tau }=k)\) in (9) when \(f \lambda =1\), where \(f \lambda =1\) represents an intermediate state between expansion and slowdown of cascade growth because \(f \lambda\) can be considered as the expected number of future retweets yielded by one retweet. The figure indicates that the tail of \(P(B_{\tau }=k)\) follows a power law \(P(B_{\tau }=k) \propto k^{-1.5}\) as long as \(f \lambda =1\), regardless of the values of f and \(\lambda\).
Figure 3 shows the case when \(f \lambda \ne 1\). The tail of \(P(B_{\tau }=k)\) in this case decays faster than that in the case \(f \lambda =1\). It is easy to understand that the tail is short if \(f \lambda <1\) because the cascade size is small in this case. The tail is also short when \(f \lambda >1\) because probability \(P(B_{\tau }=\infty )\) increases. From Figs. 2 and 3, \(P(B_{\tau }=k)\) strictly decreases as k increases, regardless of \(f \lambda\), when \(N=\infty\).
Finite network size
This subsection discusses the case of \(N<\infty\). Figure 4 shows the upper and lower bounds of \(P(B_{\tau }=k)\) in (19). As shown in Fig. 4(left), the two bounds are close to each other and decrease monotonically with k. Accordingly, the distribution of \(P(B_{\tau }=k)\) has a single peak at \(k=0\) when \(f\lambda\) is small. If \(f\lambda\) rises, as shown in Fig. 4(right), the upper bound shows another peak, while the lower bound does not. The upper bound suggests bipolarization because there are peaks at \(k=0\) and \(k=59\); however, Fig. 4(right) does not prove the existence of the phenomenon because the gap between the two bounds is too wide.
Bounds \(B^*\) and \(B_*\) in (33) and (34) were obtained to reduce the gap between the upper and lower bounds in (19). Table 2 shows how much these bounds are improved compared with those in (19), which represent the case \(S^*=S_*=0\). Because the percentages in the table are all small, calculation with larger \(S^*\) and \(S_*\) values is needed. However, the calculation requires a powerful computation environment. It took several days to obtain \(B^*\) and \(B_*\) for \(S^*=S_*=10^7\) when a new desktop PC was used.
Approximation
This subsection proves the existence of bipolarization using the approximate formula in (37). Note from Sect. that bipolarization is identified if there is only one peak at \(k \ne 0\). Figure 5 shows that this identification condition holds. The two graphs in the figure are the same except for the scale of the vertical axis. As shown in the figure, the left peak becomes lower and the right peak shifts to the right as \(\lambda\) increases. This behavior agrees with the result in Oida (2021).
Discussion
This paper dealt with two cases of network size N: finite and infinite. If network size N is sufficiently greater than cascade size B, the infinite model can be used. Conversely, if B is close to N, the effect of finiteness of N becomes dominant. Because the Twitter network is considerably large and the majority of Twitter cascades are small, the infinite model can be used for comparison with previous Twitter data analytical results.
The infinite model showed that \(B_{\tau }\) follows a power law if \(f\lambda =1\) (Fig. 2) and that even if \(f\lambda \ne 1\), the decay exponent is approximately \(-1.5\) over a limited range (Fig. 3). According to Bakshy et al. (2011), in which \(1.03 \times 10^9\) tweets and \(74\times 10^6\) diffusion events were collected over the two-month period of Sep. 13 to Nov. 15, 2009, the cascade size has a power-law distribution, and interestingly, the exponent is \(-1.5\) over the range of \([10^1,10^3]\).
Some researchers have collected large-scale cascade samples to identify the sources of large-scale cascade emergence (Zhao et al. 2015; Yu et al. 2015; Li et al. 2015; Krishnan et al. 2016; Cheung et al. 2017; Bakshy et al. 2011; Cheng et al. 2014), while some attempted to extract features of large cascade samples using machine learning algorithms (Bourigault et al. 2016; Wang et al. 2017, 2018; Horawalavithana et al. 2020; Zhou et al. 2021). The outcomes of this paper imply that these approaches might bring contradictory or ambiguous consequences regarding the effects of network properties and user behavior on cascade growth due to the stochastic duality, which implies the possibility of two extreme outcomes occurring under the same conditions (i.e., cascades may accidentally live long or may die immediately).
As a next step of this work, the following prediction method is promising. According to Fig. 4, one of the distribution peaks is at \(k=0\). Thus, an appropriate small integer \(K>0\) can be selected such that the distribution of \(P(B_{\tau }=k \mid \tau >K)\) becomes almost unimodal. The proposed method is to obtain the mean and confidence interval of the final cascade size from this unimodal distribution. This method should yield better results for larger cascades because the finite-size effect (i.e., bipolarization) becomes more pronounced as the cascade size grows.
Conclusions
To theoretically verify the existence of bipolarization, this paper derived various mathematical equations from an urn model, a model mimicking the fundamental mechanism of Twitter-type information diffusion, and revealed the following through numerical computation:

The infinite network size assumption simplified the final cascade size distribution. The distribution was a strictly decreasing function of the cascade size. The product of the retweet rate (\(\lambda\)) and number of followers (f) determined the shape of the distribution. A power law (over a limited range) appeared if \(f\lambda =1\) (\(f\lambda \ne 1\)).

Calculation of the distributions assuming the network size is finite required a very large amount of computation power. The upper and lower bounds of the distribution showed that the distribution was a decreasing function of the cascade size if \(f\lambda\) is small. The bounds also suggested that another peak could emerge in the distribution as \(f\lambda\) grows.

The approach of using random numbers to approximate the shape of the distribution revealed the existence of another peak. The two peaks of the distribution moved apart as \(f\lambda\) increased. This result was consistent with that reported in a previous simulation work.
References
Bakshy E, Hofman JM, Mason WA, Watts DJ (2011) Everyone’s an influencer: quantifying influence on twitter. In: Proceedings of the fourth ACM international conference on web search and data mining, pp 65–74
Baldesi L, Butts CT, Markopoulou A (2018) Spectral graph forge: graph generation targeting modularity. In: IEEE INFOCOM 2018 - IEEE conference on computer communications, pp 1727–1735. IEEE
Bao Q, Cheung WK, Zhang Y, Liu J (2017) A component-based diffusion model with structural diversity for social networks. IEEE Trans Cybernet 47(4):1078–1089
Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Batagelj V, Brandes U (2005) Efficient generation of large random networks. Phys Rev E 71(3):036113
Bourigault S, Lamprier S, Gallinari P (2016) Representation learning for information diffusion through social networks: an embedded cascade model. In: Proceedings of the ninth ACM international conference on web search and data mining, pp 573–582
Cheng J, Adamic L, Dow PA, Kleinberg JM, Leskovec J (2014) Can cascades be predicted? In: Proceedings of the 23rd international conference on world wide web, pp 925–936. ACM
Cheung M, She J, Junus A, Cao L (2017) Prediction of virality timing using cascades in social media. ACM Trans Multimedia Comput Commun Appl (TOMM) 13(1):2
Crimaldi I, Louis PY, Minelli IG (2022) An urn model with random multiple drawing and random addition. Stochast Process Appl 147:270–299
Di Bona G, Ubaldi E, Iacopini I, Monechi B, Latora V, Loreto V (2022) Social interactions affect discovery processes. arXiv preprint arXiv:2202.05099
Dosi G, Moneta A, Stepanova E (2019) Dynamic increasing returns and innovation diffusion: bringing polya urn processes to the empirical data. Ind Innov 26(4):461–478
Forestier M, Bergier JY, Bouanan Y, Ribault J, Zacharewicz G, Vallespir B, Faucher C (2015) Generating multidimensional social network to simulate the propagation of information. In: 2015 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 1324–1331. IEEE
Gleeson JP, Onaga T, Fennell P, Cotter J, Burke R, O’Sullivan DJ (2020) Branching process descriptions of information cascades on twitter. J Complex Netw 8(6):002
Goel S, Watts DJ, Goldstein DG (2012) The structure of online diffusion networks. In: Proceedings of the 13th ACM conference on electronic commerce, pp 623–638. ACM
Hino M, Irie Y, Hisakado M, Takahashi T, Mori S (2016) Detection of phase transition in generalized polya urn in information cascade experiment. J Phys Soc Jpn 85(3):034002
Horawalavithana S, Skvoretz J, Iamnitchi A (2020) Cascade-LSTM: predicting information cascades using deep neural networks. arXiv preprint arXiv:2004.12373
Junus A, Ming C, She J, Jie Z (2015) Community-aware prediction of virality timing using big data of social cascades. In: 2015 IEEE first international conference on big data computing service and applications (BigDataService), pp 487–492. IEEE
Krishnan S, Butler P, Tandon R, Leskovec J, Ramakrishnan N (2016) Seeing the forest for the trees: new approaches to forecasting cascades. In: Proceedings of the 8th ACM conference on web science, pp 249–258. ACM
Leskovec J, McGlohon M, Faloutsos C, Glance N, Hurst M (2007) Patterns of cascading behavior in large blog graphs. In: Proceedings of the 2007 SIAM international conference on data mining, pp 551–556. SIAM
Li CT, Lin YJ, Yeh MY (2015) The roles of network communities in social information diffusion. In: 2015 IEEE international conference on big data (big data), pp 391–400. IEEE
Mahmoud H (2008) Pólya Urn models. Chapman and Hall/CRC, New York
Oida K (2021) Bipolarization in cascade size distributions. IEEE Access 9:72867–72880
Pemantle R (2007) A survey of random processes with reinforcement
Rafik A, Nabil L, Olfa S (2019) A generalized urn with multiple drawing and random addition. Ann Inst Stat Math 71(2):389–408
Tria F, Loreto V, Servedio VDP, Strogatz SH (2014) The dynamics of correlated novelties. Sci Rep 4(1):1–8
Wang Z, Chen C, Li W (2018) A sequential neural information diffusion model with structure attention. In: Proceedings of the 27th ACM international conference on information and knowledge management, pp 1795–1798
Wang Y, Shen H, Liu S, Gao J, Cheng X (2017) Cascade dynamics modeling with attention-based recurrent neural network. In: IJCAI, pp 2985–2991
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440
Wegrzycki K, Sankowski P, Pacuk A, Wygocki P (2017) Why do cascade sizes follow a power-law? In: Proceedings of the 26th international conference on world wide web, pp 569–576
Weng L, Menczer F, Ahn YY (2013) Virality prediction and community structure in social networks. Sci Rep 3:2522
Weng L, Menczer F, Ahn YY (2014) Predicting successful memes using network and community structure. In: ICWSM
Xie J, Zhang C, Wu M (2011) Modeling microblogging communication based on human dynamics. In: 2011 eighth international conference on fuzzy systems and knowledge discovery (FSKD), vol. 4, pp 2290–2294. https://doi.org/10.1109/FSKD.2011.6020045
Yu L, Cui P, Wang F, Song C, Yang S (2015) From micro to macro: uncovering and predicting information cascading process with behavioral dynamics. In: 2015 IEEE international conference on data mining, pp 559–568. IEEE
Zhao Q, Erdogdu MA, He HY, Rajaraman A, Leskovec J (2015) Seismic: a self-exciting point process model for predicting tweet popularity. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1513–1522. ACM
Zhou C, Zhao Q, Lu W (2015) Impact of repeated exposures on information spreading in social networks. PLoS ONE 10(10):0140556
Zhou C, Zhao Q, Lu W (2017) Cumulative dynamics of independent information spreading behaviour: a physical perspective. Sci Rep 7(1):1–14
Zhou F, Xu X, Trajcevski G, Zhang K (2021) A survey of information cascade analysis: models, predictions, and recent advances. ACM Comput Surv (CSUR) 54(2):1–36
Acknowledgements
The author acknowledges the support of the Fukuoka Institute of Technology for proofreading and publication costs.
Contributions
The author conducted this study alone. The author has read and approved the final version of the manuscript.
Ethics declarations
Competing interests
The author declares that the author has no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Oida, K. Information cascade final size distributions derived from urn models. Appl Netw Sci 8, 30 (2023). https://doi.org/10.1007/s41109-023-00554-7
Keywords
 Information diffusion
 Online social network
 Cascade size distribution
 Bipolarization