Skip to main content

Deterministic random walk model in NetLogo and the identification of asymmetric saturation time in random graph

Abstract

Interactive programming environments are powerful tools for promoting innovative network thinking, teaching science of complexity, and exploring emergent phenomena. This paper reports on our recent development of the deterministic random walk model in NetLogo, a leading platform for computational thinking, eco-system thinking, and multi-agent cross-platform programming environment. The deterministic random walk is foundational to modeling dynamical processes on complex networks. Inspired by the temporal visualizations offered in NetLogo, we investigated the relationship between network topology and diffusion saturation time for the deterministic random walk model. Our analysis uncovers that in Erdős–Rényi graphs, the saturation time exhibits an asymmetric pattern with a considerable probability of occurrence. This behavior occurs when the hubs, defined as nodes with relatively higher number of connections, emerge in Erdős–Rényi graphs. Yet, our analysis yields that the hubs in Barabási–Albert model stabilize the the convergence time of the deterministic random walk model. These findings strongly suggest that depending on the dynamical process running on complex networks, complementing characteristics other than the degree need to be taken into account for considering a node as a hub. We have made our development open-source, available to the public at no cost at https://github.com/bravandi/NetLogo-Dynamical-Processes.

Introduction

Study of complex systems and the desire to understand and uncover complex phenomena dates back to 18th century when Sir Isaac Newton formulated the famous three-body problem (Newton 1687). Ever since, this field has evolved into the realm of complex network science, as an emerging research field that explains the foundational necessities for describing and understanding highly heterogeneous and interconnected complex systems (Strogatz 2001).

Complex networks are often distinguished by exhibiting emergent phenomena that cannot be solely described by the sum of the behavior of their constituent subsystems/subnetworks (Mitchell 2009). Extensive research effort within this realm over the past two decades has contributed to the development of knowledge and theories that shed light on the understanding of complex dynamical processes like the spread of epidemics (Keeling and Eames 2005; Newman 2002; Pastor-Satorras et al. 2015), the extent of controllability in networked systems (Liu et al. 2011; Heydari et al. 2019), synchronization in complex power networks (Sajadi et al. 2022; Motter et al. 2013; Dorfler and Bullo 2012), and the pollination patterns in ecology (Olesen et al. 2007; Olesen and Jordano 2002), just to name a few. Nonetheless, the education of network thinking and applications of complex networks to a broad range of engineering and science disciplines has remained an open challenge because of the inherent sophistication associated with this subject and its foundational analytical complexity which many may find to be acute. In this paper, we address this topic by borrowing the interactive element and temporal nature of NetLogo (Tisue and Wilensky 2004) to promote network thinking in both teaching and research. We believe enhancing the ability to communicate and model complexity is an essence to develop the next generation of interdisciplinary thinkers, necessary for tackling the grand challenges of our century, particularly that the unprecedented connectivity around us continues to escalate the complexity of all disciplines.

Amongst all aspects of complex networks, random walk theory remains one of the most important developments (Rosvall and Bergstrom 2008; Noh and Rieger 2004). Random walk is a fundamental model for explaining the observable behavior of many stochastic processes, with its application spanning over various scientific fields from ecology and psychology to physics and chemistry (Burioni and Cassi 2005; Masuda et al. 2017). The saturation time on random walk is quantified by the number of steps from the initiation to the final convergence of the walk, and it has been mainly studied in the context of information spread over networks (Jia et al. 2010; Zhao et al. 2011; Milli et al. 2018; Elmaghraby and Aggerwal 1997). To the best of our knowledge, this is the first study on the relationship between network topology and saturation time of random walk processes.

The contributions of this work are threefold. First, we report on our recent contribution to NetLogo by developing the deterministic random walk model which is very practical for education of network science fundamentals. NetLogo is a leading platform for complex computational analytics, eco-system modeling, and multi-agent cross-platform programming (Tisue and Wilensky 2004). Second, our development uncovers that in Erdős–Rényi (ER) graphs (Erdös and Rényi 1959) the saturation time of the deterministic random walk exhibits an asymmetric pattern with a considerable probability of occurrence, highlighting the relationship between network structure and expected outcome of dynamical processes. Third, leveraging our development, we propose that the identification of hubs in networks may not always be based on having a large degree, depending on the dynamical processes running over a network. Identifying hubs only based on degree is commonly used to analyze complex networks. Our analysis demonstrates that the inter-connectivity between the hubs influences the saturation time in deterministic random walk. Hubs in the ER graphs may emerge and reduce the saturation time in deterministic random walk, though these hubs are not as efficient as the hubs in the Barabási–Albert (BA) model (Barabási and Albert 1999) in stabilizing the saturation time.

This work has a broader impact beyond the random walk analysis as it highlights the importance of employing interactive environments for promoting innovative thinking. It has direct application where exploration and understanding of the interrelated yet immensely complex dynamics and underlying phenomena remain an existing challenge, across several disciplines involving complex networks, such as electric power grids, telecommunication infrastructure, traffic networks, and interconnected vehicles.

Methods

Deterministic random walk in NetLogo

The implementation of random walk processes on graph can be divided into two categories: deterministic and stochastic. The deterministic analysis allows the identification of baselines such as the stationary states in epidemic outbreaks (Barrat et al. 2008), while the stochastic analysis provides us with diverse results that are generated from the same initial setting but are different due to the dynamic stochastic process. Under the same setting, even though the randomness of the stochastic random walk fluctuates the characteristics of dynamical processes like the distribution of the number of walks over nodes or the overall saturation time, the stochastic results will only fluctuate around the deterministic outcome. Here, we restrict our attention on the deterministic model of random walk. In the deterministic random walk model, partial walkers on a node leave the initial node, and all these leaving walkers equally distribute to all neighbors of this node (Fig. 1).

Fig. 1
figure 1

Deterministic random walk. p is the diffusion rate of the walkers. The blue node contains n walkers and has d neighbors at a certain instance of simulation. In this model, \(n\times p\) walkers leave the blue node on an average and all its (green nodes) neighbors get \(\frac{n\times p}{d}\) walkers from this node. In the meantime, some walkers are coming to the blue node from its (green nodes) neighbors using the same logic

The deterministic random walk over a network is realized after the following initialization: m nodes are randomly picked and n walkers are uniformly distributed in those m nodes. After the initialization, walkers start to walk. In each time point t, walkers follow the rules in Fig. 1 to move. Therefore, the number of walkers in node i in time t is, \(n_i(t)\), updated by:

$$\begin{aligned} n_i(t) = n_i (t-1) \times (1-p)+\sum _{j}A_{ij} \times \frac{p \times n_j (t-1)}{\sum _lA_{jl}} \end{aligned}$$
(1)

where A is the adjacent matrix of the network and p is the diffusion rate. As the random walk process progresses, \(n_i(t)\) constantly approaches to a value, that is:

$$\begin{aligned} n_i(t) \rightarrow k_i \times \frac{n}{m} \end{aligned}$$
(2)

where \(k_i\) is the degree of the ith node. Only at an equilibrium we observe a linear relation between the degree of nodes and the number of walkers, because \(\frac{n}{m}\) is a constant (see Fig. 2). Thus, for each time step, t, one can run a linear regression on the number of walkers of nodes to check whether or not the random walk process has arrived at an equilibrium. We set the necessary condition for arriving at an equilibrium to be \(R^2(t) >0.99\), marking the saturation time when the processes reaches end. Thus, at the end of the realization, the number of the final walkers on a node depends only on its degree.

Next, we explain the implementation of these fundamental principles in NetLogo.

Fig. 2
figure 2

Saturation time in random walk. A, C, E Circular layout in NetLogo frontend at the end of the random walk process on a BA graph with 180 nodes and a total of 600 walkers. Tunable parameters include type of network, number of nodes, average degree, layout type, total number of walkers, and initial number of nodes on which walkers are distributed randomly. B, D, F As the process saturates, the average walkers vs. degree plot slowly becomes a straight line

Implementation in NetLogo

Here we explain the steps and assumptions involving the back-end and front-end implementation of deterministic random walk model in NetLogo. Interactive front-end helps the users visualize the process in real-time, an important factor in teaching complex networks. Our back-end development is available to the public in the form of an open-source contribution to NetLogo. It presents various features convenient to users and suitable for both educational and research purposes, including the flexibility of tuning multiple parameters that influence the random walk process.

Initializing the network

First step is to select the network-model. This can be either ER or BA (scale-free) graphs. Next, the number of nodes in the network and the average degree is to select. Our model uses NetLogo algorithms to generate an instance of the input graph. The graph layout can be either a spring layout or a circular degree sorted layout. Coloring the nodes based on degree helps in an easy-to-follow visualization. We have defined two schemes for coloring the nodes. First, the degree single gradient identifies the nodes from a lighter shade to a deeper one based on increasing degree. Second, the degree multi-bin gradient scheme divides the nodes into different degree bins and assigns different colors to them. We believe this scheme is helpful to visualize the intermediate and the end states of the random walk process. As the random walk moves towards saturation, more walkers start moving towards the nodes with higher degrees. This means the higher degree nodes grow, hence, the size of the nodes are proportional to the number of walkers residing on them. Other tunable parameters include the diffusion probability (the rate at which a walker moves from one node to another), the total number of walkers, and the number of nodes on which the walkers are initially distributed. The degree distribution plot is either a Poisson distribution (for ER graphs) or a scale-free distribution (for BA graphs).

Fig. 3
figure 3

NetLogo front-end for simulating random walk. Left panel includes the tunable parameters. After setting these values, we run setup and the initial state of the process in created. While running the simulation, we observe the real-time behavior of the walkers both on the network and the plots on the right panel. The degree distribution is presented in the plot at the bottom left, the number of walkers in each degree quartile is visualized in the top right plot. Lastly the relationship between the average number of walkers on nodes for each degree is shown in the bottom right plot

Front-end implementation and interactive plots

Our front-end implementation in NetLogo allows visualization of the network dynamics and the random walk processes in real-time. Figure 3 shows a snapshot of our front-end implementation. As more walkers enter a node, the sizes of the nodes grow proportionally. We divide the nodes in four degree quartiles; therefore, the number of average walkers is displayed on each degree quartile. We observe that as the simulation progresses, the walkers move from the lower degree quartiles to the higher ones. Because of this reason, the average number of walkers and the degree shows a linear relationship, depicting as a straight line (Fig. 3 bottom-right plot). Our model computes the \(R^2\) value of this straight line over time and when this value reaches 0.99 threshold, it can be reasonably concluded that the process has saturated, i.e., the number of walkers on each node has reached an equilibrium.

Results

This section presents the results from the symmetry analysis of saturation time of the deterministic in graphs, a problem that has not been studied nor reported in the literature to the best of our knowledge. Our analysis is carried out using our random walk implementation in NetLogo and aims to exhibit the scientific application of our development.

We considered two case studies. First, we present the summary of our observations from NetLogo simulation on ER and BA graphs. In this case, we noticed an asymmetry in the saturation time of random walk on ER graphs. Accordingly, we establish there could exist instances in simulation where the process saturates sooner than the others in ER networks. In contrast, saturation time for BA graphs is monotonous. Second, we explore the frequency of this phenomenon in ER graphs, by we generating multiple ER graph instances and repeating the experiment. In this case, we observed that the occurrence of this asymmetric nature of saturation time is significantly frequent in ER graphs.

Asymmetric saturation time in ER graphs

This section analyzes the asymmetric saturation time of random walk in ER graphs. To this end, we generated an ER and a scale-free graph in NetLogo, running random walk simulations, and analyzing the observations. Both the ER and BA graphs had 1,000 nodes and an average degree of 6. Then, we uniformly distributed 400 walkers over 8 randomly selected nodes. We recorded the saturation time over 100 simulations and obtained a saturation time from each of the simulations. Using these 100 data-points, we computed the median saturation time, the 95th percentile and the 5th percentile values.

Throughout this case study, we observed a wider upper bound in case of ER graphs, as presented in Fig. 4A, whereas, in BA graphs, the saturation times are symmetrically distributed across the median value. We note that this observation is independent of the diffusion probably p (Fig. 4A). Also, the asymmetric saturation time remained even in ER graphs at high p values. This observation suggests a heterogeneous saturation time in the ER model such that some simulations over ER graphs take much lower time to saturate compared to others. Thus, we expect a wider right-tail of the saturation time distribution.

Next, we systematically evaluated the consistency of this asymmetric behavior in ER graphs that we initially identified using the NetLogo environment.

Fig. 4
figure 4

Observing asymmetric saturation time in ER graphs. A Initially, we observed a wider upper bound of saturation time over 100 simulations in case of an ER and a BA graph (both with 1000 nodes and 6 average degree). This observation is consistent over different diffusion probabilities. B To quantify the level of occurrence of this asymmetric behaviour, we studied the skewness of the distribution of saturation time for the ER and BA models. Considering a high skewness threshold of 1.0, which is the upper-bound of skewness for BA graphs, we observed that the asymmetric saturation time occurred in 58% of the ER graphs. The distributions are generated through simulations over 100 instances of ER and BA graphs (each with 1000 nodes and average degree of 6) by distributing 10,000 walkers evenly on 4 randomly selected nodes and repeating this process 100 times for each graph

Consistency of observing asymmetric saturation time in ER graphs

This sections explores the generalizability of our observation on the emergence of ER graphs with asymmetric saturation time in deterministic random walks. To this end, we generated 100 instances of ER and BA graphs. Each graph contained 1000 nodes and had an average degree of 6. Then, we distributed 10,000 walkers randomly on 4 nodes and ran 100 such simulations on each of them to study the saturation time. For each distribution of the saturation time over a network, we calculated its skewness to measure the asymmetric symptom. Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean (Eberl and Klar 2020). For a unimodal distribution, negative skew indicates that the tail is on the left side of the distribution, and positive skew indicates that the tail is on the right.

The results presented in Fig. 4B indicate that the distribution of skewness in ER graphs has a longer positive tail. This positive skewness of the distribution suggests that on an average ER graphs could take longer to saturate than BA graphs. Considering a high skewness threshold of 1.0, which is the upper-bound of the skewness for BA graphs, we observed that the asymmetric saturation time occurred in 58% of the ER graphs. Thus, our results suggest that the asymmetric nature of saturation time in random walk is plausibly related to the structural properties of the graphs.

We will further explore the relationship between the saturation time and graph topology in the Discussion section.

Network topology and asymmetric saturation time

This section explores the root cause of this asymmetric behaviour and uncover whether it is a byproduct of the network size or average degree. To this end, we conducted simulations with the same setting but varying \(\langle k\rangle\) and N in ER graphs. Based on the results shown in Fig. 5, we conclude that the asymmetric saturation time is independent of the values of \(\langle k\rangle\) and p, and our rationale is the following. The average shortest path distance or small-worldness changes as \(\bar{d} \sim O \left(\frac{\log {n}}{\log \log {n}}\right)\) in BA graphs, where n is the number of nodes (Chen et al. 2008). On the other hand, the average shortest path distance grows faster in ER graphs with the number of nodes, following \(\bar{d} \sim O(\log n)\) (Fronczak et al. 2004). Thus, with the increased number of nodes, ER graphs lose their small-worldness, and lack shortcuts between different compartments of a network. With reduced small-worldness, the random walkers take a larger time to saturate, bolstering the observations on asymmetric saturation time in Fig. 5. We suggest that the observed asymmetric phenomenon has roots in the structure of ER graphs.

Next, we will discuss this relationship to provide further insights.

Fig. 5
figure 5

Consistency of asymmetric saturation time in ER graphs with varying size and density. A We observe that as the graph becomes denser, less fraction of graph instances incur the asymmetric behavior in saturation time (saturation time distribution having skewness \(\ge 1.0\)). However, we consistently find instances of ER network with the observed asymmetric behaviour even in high average degree regime. For each average degree, We generated 100 ER graph instances with 1,000 nodes and ran 100 realizations of deterministic random walks to obtain the skewness of the saturation time. B Along with changing the average degree, we now vary the number of nodes. We observe consistent behavior regarding asymmetric saturation time across different numbers of nodes

Discussion

Fig. 6
figure 6

Hubs and asymmetric saturation time. A ER graphs have less inter-hub links, hence less shortcuts inside the network. B Due to assortative mixing (hubs are linked), in BA graphs the hubs are more connected among themselves. This creates many shortcuts for the walkers within different parts of the network that provide a stable saturation time for dynamical processes. C In ER graph the number of hubs (determined by the condition \(deg(v)^2 > \sum _{j \in v.neighbors}{deg(j)}\)) negatively correlate with the saturation time, meaning that having more hubs tend to decrease the saturation time. But, having more hubs increases the instability as it positively correlates with the skewness of the saturation time, hence having asymmetric behavior. D In contrast, in BA graphs we observe having more hubs is correlated with more stability in the saturation time (negative correlation with skewness), at the same time having more hubs increases the saturation time as it takes walkers more time to find nodes with very high degree. The correlation coefficients in (C, D) are obtained by generating 100 different ER and BA networks with 1000 nodes and 6 average degrees. Then, we ran 100 simulations (number of initial selected nodes is 4 and diffusion rate is 0.4) over each network thus, for each network, we found the skewness of the duration time distribution of these 100 simulations

Thus far, we have demonstrated the educational and scientific application of our NetLogo implementation. We wish to further explore how this development could be leveraged to find answers to some of the open questions, extrapolate information, and draw inferences. To understand the possible underlying mechanisms that may cause the observed asymmetric phenomenon, we looked into the relationship between the degree distribution and the observed phenomenon, though the scientific capabilities of our development is not restricted only to the problem discussed here.

In BA graphs, the hubs are more connected among themselves due to assortative mixing. This creates many shortcuts among different parts of the network, decreasing the saturation time of dynamical processes (Pastor-Satorras et al. 2015; Keeling and Eames 2005). These additional inter-hub links in BA graphs enable walkers to find high-degree nodes robustly (Fig. 6A, B). This robust convergence of deterministic random walk can be explained by investigating the relationship between the number of hubs, median saturation time, and the observed asymmetric behaviour (skewness of saturation time distribution). For demonstration purposes, we looked into these relationships by counting the number of hub-nodes in a network using the following condition: \(deg(v)^2 > \sum _{j \in v.neighbors}{deg(j)}\), though we recognize a thorough analysis can be conducted as future work on these relationships.

Figure 6C, D present the correlation between the number of hubs and the median and skewness of saturation time for BA and ER graphs. Surprisingly, we find that the Pearson’s correlation coefficient between the number of hubs and median saturation time is 0.037 for ER graphs, but in BA it is 0.630, indicating that the BA hubs are more effective compared to ER in navigating the walkers to find high degree nodes. In the BA model, having more hubs tends to increase the median saturation time, indicating that the walkers may need more time to find large-degree hubs to reach equilibrium (i.e., \(R^2(t)>0.99\)). In the case of ER graphs, the higher positive association between the number of hubs and skewness of saturation time signals that the structure of some ER graphs is less robust against outliers, being more susceptible to the initial selection of seeds that could trap walkers in the network’s peripheral nodes and resulting the observed skewed distribution of saturation time.

Conclusions

The interactive element and the temporal nature of NetLogo intrigued us to investigate the trajectory of saturation time in the deterministic random walk model in ER and BA models, leading us to find the asymmetric saturation behavior in ER graphs. Therefore, we decided to promote network thinking in both teaching and research, we contributed the deterministic random walk model to NetLogo. This is the first model of dynamical processes in NetLogo over networks, and we envision that this work encourages more contributions related to dynamical processes to NetLogo in the future, like the stochastic random walk model (Barrat et al. 2008).

We believe that creating new toolkit with increased capabilities to model and contextualize complexity is essential for a successful development of the next generation of interdisciplinary thinkers, engineers, and scholars, a prerequisite for tackling the grand challenges of our century.

Code availability

NetLogo and all the codes for reproducing the results presented in this paper are available open-source on Github via https://github.com/bravandi/NetLogo-Dynamical-Processes.

References

Download references

Acknowledgements

We thank Professor A. Vespignani and Dr. J. Davis of Network Science Institute at Northeastern University for their insightful discussions and comments.

Funding

The authors have not received any funding for this project.

Author information

Authors and Affiliations

Authors

Contributions

AC contributed to the implementation of random walk in NetLogo, performing simulations in Python, creating figures and writing the manuscript. QC contributed to the implementation of random walk in NetLogo, performing simulations in Python, creating figures and writing the manuscript. AS contributed to the problem statement and writing the manuscript. BR contributed to designing of the problem statement, implementation of random walk in NetLogo, performing simulations, creating figures and writing the manuscript.

Corresponding author

Correspondence to Babak Ravandi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chatterjee, A., Cao, Q., Sajadi, A. et al. Deterministic random walk model in NetLogo and the identification of asymmetric saturation time in random graph. Appl Netw Sci 8, 33 (2023). https://doi.org/10.1007/s41109-023-00559-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s41109-023-00559-2

Keywords