Constant state of change: engagement inequality in temporal dynamic networks

The temporal changes in complex systems of interactions have excited the research community in recent years, as they offer insight into the dynamics and evolution of these systems. From the collective dynamics of organizations and online communities to the spreading of information and fake news, temporal dynamics are fundamental to the understanding of complex systems. In this work, we quantify the level of engagement in dynamic complex systems of interactions, modeled as networks. We focus on interaction networks for which the dynamics of the interactions are coupled with that of the topology, such as online messaging, forums, and emails. We define two indices to capture the temporal level of engagement: the Temporal Network (edge) Intensity index and the Temporal Dominance Inequality index. Surprisingly, these measures are stationary for most measured networks, regardless of vast fluctuations in the size of the networks over time. Moreover, more than 80% of weekly changes in the index values are bounded by less than 10%. The indices are stable across the temporal evolution of a network but differ between networks, and a classifier can determine the network to which the temporal indices belong with high success. We find an exception in the Enron management email exchange during the year before its disintegration, in which both indices show high volatility throughout the inspected period.


Introduction
Dynamic complex systems of interactions are often modeled as a sequence of snapshots of networks in time (Holme and Saramäki 2012). While this is a rather simplistic representation, it is widely accepted that the structural properties of a network play a significant role in determining its actors' behavior (Granovetter 1983; Burt 2000; Haynie 2001; Spencer 2003; Snijders 2005; Kossinets and Watts 2006; Perra et al. 2012). The last decade's abundance of temporal information paved the path to a further understanding of the dynamics of networks (Lazer et al. 2009; Artime et al. 2017) and the effect on their actors (Fowler et al. 2008; Phelps 2010; Hellmann and Staudigl 2014; Ilany et al. 2015).
The intensity of interactions, also referred to as tie strength, has long been recognized as a fundamental property (Barrat et al. 2007). Human contacts are of different durations (Barabasi 2005; Onnela et al. 2007); human relationships are of varying strength (Granovetter 1977); human flight fluxes differ across routes (Opsahl et al. 2008), and more. The heterogeneity in edge intensity, i.e., the duration, strength, or capacity of the above interactions, has been modeled utilizing edge weights and weighted networks (Barrat et al. 2007; Barrat et al. 2004; Newman 2004; Opsahl et al. 2010). The intensity of interactions (Opsahl et al. 2008) has been used in a variety of applications, such as aiding the assessment of the level of conflict within organizations (Nelson 1989) and the understanding of human communication patterns (Gilbert and Karahalios 2009; Miritello et al. 2011).
Here, we utilize weighted network modeling to research temporal indices of engagement, such as average intensity and participation inequality, in online person-to-person interaction networks, termed connection networks (Holme and Saramäki 2012). Connection networks may refer to organizational email networks, online forums and messaging apps, and online discussions (Eckmann et al. 2004; Sun et al. 2016).
Temporal measures of engagement are of interest as they give a measure of member participation, interest, influence, dominance, and more. In organizations, where frequent changes were found to be the norm (Burke 2017), following the temporal intensity and dominance of the interactions can help in identifying fluctuations in involvement and engagement before, during, and after a planned organizational change, as well as in assessing the reactions to a shock. These temporal measures are also of interest in the case of online social network engagement, where participation was found to be dominated by a few (Nielsen 2006). Recent studies, however, found that participants change their active role in the network and their engagement over time (Sonnenbichler 2010). Currently, it is unclear whether these changes affect the temporal measures of network activity.
To study the temporal behavior of a network, we define indices of average connection intensity and nodal dominance inequality in temporal networks and measure these quantities over several real-world networks. Surprisingly, we find a stationary behavior of networks over time, regardless of massive fluctuations in their size. Our results demonstrate that networks converge to a steady state of engagement, irrespective of significant variations in the number of participants. Deviations from the steady state are rare and do not correlate with a change in size.
Of specific interest is the case of the Enron managers' email network. The dataset was released by court order after the company had disintegrated, and has recently been used, together with the known set of events, for change point detection (CPD) schemes (Peel and Clauset 2015; Miller and Mokryn 2018). Unlike anomaly detection techniques that scan for temporal fluctuations from the norm, CPD schemes try to infer the points in time when networks change their norm, which are thus termed points of change. We find that throughout the period inspected in the Enron managers' email network, both indices cannot be seen as stationary, and the fluctuations in the network's temporal indices are significantly higher than the ones we found for all other networks.
Our results show that networks differ in the engagement indices we defined, and can be differentiated by them. To further verify this result, we ran a classification experiment over the weekly indices, and found that the classifier can assign the index tuples to their corresponding network with high validity.
Our surprising yet robust results have implications for the inference of the behavior of complex systems over time and the dynamics of networks. Of interest is the understanding of the origin of the different engagement indices between networks, and whether they can be utilized to characterize networks. The robustness of the result across size changes in the networks is of importance for the understanding of stationary properties, and their implications for dynamic systems and collective behavior.

Related work
Complex systems of interacting elements, from human (social and organizational) to physical and biological ones, can be modeled as interaction networks, with nodes representing the elements and edges representing their interactions. When the interactions are dynamic, as in human and social interactions, a complete model that captures the longitudinal evolution of the system comprises a sequence of networks, each portraying a snapshot of the system at a single point in time. Other models do exist (Holme and Saramäki 2012). In this work, we follow the modeling of sequential periods similarly to Pan and Saramäki (2011).
Temporal networks have been viewed in recent years as a natural way to investigate dynamic systems (Holme and Saramäki 2012; Artime et al. 2017; Sekara et al. 2016; Li et al. 2017), where "the system under study should consist of agents that interact pairwise, so that the interactions have both some degree of randomness and some regularity" (Gautreau et al. 2009; Holme and Saramäki 2012). Dynamic online interactions have been studied to model conflicts (Yasseri et al. 2012), temporal ego networks, and the strength of links over time (Karsai et al. 2014).
In this work, we model the dynamics of electronic one-to-one communication such as emails and instant messages. The case of online forums can be considered as one-to-many communication (Holme and Saramäki 2012), yet in this work it was modeled utilizing the replies and hence also as a form of one-to-one communication.
Temporal networks of electronic messages have been investigated mainly in the context of information spreading and contagion (Rodriguez et al. 2011; Gomez-Rodriguez et al. 2012; Rosvall et al. 2014; Nadini et al. 2018). Structural dynamics and properties of temporal networks also receive much attention, such as temporal path length, centrality, community, and motif measures (Pan and Saramäki 2011; Perra et al. 2012; Kovanen et al. 2011; Taylor et al. 2017).
Complex networks of interactions are dynamic and heterogeneous by nature (Corrado 2019). One of the cornerstones of heterogeneity is the nodal degree, or in weighted networks, node intensity. Intensity patterns are heterogeneous, with a few nodes having a significantly higher degree or intensity level and hence being more dominant in the network (Barrat et al. 2004; Barrat et al. 2007; Opsahl et al. 2008; Corrado 2019). Dominance in systems mostly refers to the dominant role of their members. In social networks of interactions, groups of roles are inferred by analyzing the structure of networks (Gupte et al. 2017; Costa and Ortale 2018). Studies found that in online social networks the most prominent group is that of active influencers, estimated at merely 1% of the members, while accounting for almost all the network activity (Nielsen 2006). Role groups differ in size. Nielsen (2006) found that most online communities have highly unequal role group sizes, with 90% of members never contributing, 9% contributing little, and 1% accounting for almost all network activity. Interestingly, roles are temporal and members often transition between roles (Sonnenbichler 2010).
Hence, we continue to define measures of engagement in networks, and explore their temporal nature. For a suggested organizational change, for example, such measures can determine levels of engagement in the change: if communication inequality is low, then many participate in discussions. If inequality is high, only a few dominate the conversation and are actively involved. The intensity of the conversations can be identified by comparing it to the intensity in other periods.

Network intensity measures
We are interested in capturing both the average intensity of interactions, regardless of the number of different interactions, and the variance of the interactions in a network, which is a measure of inequality. A measure of the average intensity of the edge interactions in a network differs from the average node strength, as the measure should not favor the number of active connections a node has. Measures of nodal strength favor nodes that have many active connections. Additionally, we suggest measuring the inequality of nodal interaction in a network. Figure 1 illustrates two networks, each consisting of three nodes and their interactions. On the left (a), node B interacts intensively with A and C, while on the right (b), all three nodes communicate with each other at the same intensity. An estimate of a network's average intensity level should account for the number of active connections in the network. In Network (a) there are only two such connections, and in Network (b) there are three. We devise indices that would show that in Network (a) the average intensity is four, while the nodes show high inequality, and in Network (b) the average edge intensity is two, and the nodes engage equally. To the best of our knowledge, current measures do not capture the intensity and inequality of Network (a) as described here.
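The arithmetic behind the Figure 1 example can be sketched in a few lines of Python. The edge weights below are assumptions chosen to match the averages stated in the text: Network (b) must carry weight 2 on each of its three edges, while for Network (a) the text only fixes the average at four, so equal weights of 4 on both edges are assumed here. The function names are illustrative.

```python
# Edge weights for the two three-node networks of Figure 1 (undirected).
# Assumption: both edges of Network (a) carry weight 4; the text states only the average.
network_a = {("A", "B"): 4, ("B", "C"): 4}
network_b = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 2}


def average_edge_intensity(edges):
    """Total interaction weight divided by the number of active connections."""
    return sum(edges.values()) / len(edges)


def node_strengths(edges):
    """Sum of incident edge weights per node (the weighted degree)."""
    strengths = {}
    for (u, v), w in edges.items():
        strengths[u] = strengths.get(u, 0) + w
        strengths[v] = strengths.get(v, 0) + w
    return strengths


for name, net in (("a", network_a), ("b", network_b)):
    print(name, average_edge_intensity(net), node_strengths(net))
```

Under these assumed weights, node B in Network (a) has twice the strength of A and C, whereas all nodes in Network (b) have equal strength, which is the contrast the two indices are meant to expose.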

Average interaction intensity
We describe here a measure for deriving a network's average edge intensity level. To compute the average edge intensity in a network, we start with a measure devised for nodes in a weighted network (Opsahl et al. 2010). This measure allows considering for each node not only the number of nodes in the network it interacts with but also the intensity level of these interactions:
$$C_D^{w\alpha}(i) = k_i \times \left(\frac{s_i}{k_i}\right)^{\alpha} = k_i^{(1-\alpha)} \times s_i^{\alpha} \qquad (1)$$
where $\alpha \in [0, 1]$ is the tuning parameter, $k_i$ is the number of nodes the focal node $i$ is connected to, and $s_i$ is its weighted degree, computed by:
$$s_i = \sum_{j}^{N} w_{ij} \qquad (2)$$
where $N$ is the total number of nodes in this network, and $w_{ij}$ is a non-zero value for the strength of edges that emanate from the focal node $i$.
The tuning parameter, α, determines the importance of each of these parts. When α = 0 the edge strength is ignored, and only its existence is taken into account, resulting in a measure that is similar to the one in Freeman (1978). Conversely, when α = 1 only the edge weights are considered, while the binary structure is not (Opsahl et al. 2010).
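As a minimal sketch of how the tuning parameter behaves, the node measure of Opsahl et al. (2010), k_i^(1-α) · s_i^α, can be written as a small function. The function name and the list-of-weights input are illustrative choices, not part of the original work.

```python
def weighted_degree_centrality(weights, alpha):
    """Opsahl-style node measure k^(1 - alpha) * s^alpha.

    weights: weights of the edges incident to the focal node
    alpha:   tuning parameter in [0, 1]
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie in [0, 1]")
    k = len(weights)      # number of neighbours (binary degree)
    if k == 0:
        return 0.0        # isolated node
    s = sum(weights)      # weighted degree (node strength)
    return k ** (1.0 - alpha) * s ** alpha


incident = [4, 4]  # hypothetical node with two edges of weight 4
print(weighted_degree_centrality(incident, 0.0))  # binary degree only
print(weighted_degree_centrality(incident, 1.0))  # node strength only
print(weighted_degree_centrality(incident, 0.5))  # a mix of both
```

At α = 0 the function returns the number of neighbours, and at α = 1 it returns the sum of the incident weights; intermediate values interpolate between the two.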
Taking a network-wide approach, we continue and define the weighted sum of the node degrees given the tuning parameter α as follows:
$$\varphi_{\alpha} = \sum_{i}^{N} C_D^{w\alpha}(i) \qquad (3)$$
where $C_D^{w\alpha}(i)$ is the node measure of Opsahl et al. (2010) introduced above. φ is a metric that, depending on the chosen value for the tuning parameter α, describes with a scalar the weighted sum of the network degrees. Specifically, when the tuning parameter α is set to zero, the metric $\varphi_{\alpha=0}$ corresponds to the number of edges in the graph; alternatively, when the tuning parameter α is set to one, the metric $\varphi_{\alpha=1}$ corresponds to the sum of all edge weights in the network, that is, the overall intensity of interactions in a network.
We then propose a level-of-intensity index for networks that is the ratio between the overall intensity of edge interactions in the network and the binary number of edges. We formally propose the following index:
$$\psi = \frac{\varphi_{\alpha=1}}{\varphi_{\alpha=0}} \qquad (4)$$
ψ ≥ 1 holds for all graphs. In the case where edge weights are based on a ratio scale (Opsahl and Panzarasa 2009), ψ is bounded by that ratio. Otherwise, it is unbounded. When ψ ∼ 1 the network intensity level is very low, and the vast majority of edges have a low weight. In social networks of interactions, low intensity corresponds to a low number of interactions between any two members in the network. Accordingly, when ψ >> 1, the network intensity level is high. High intensity, in this case, implies the existence of edges representing interactions of high volume, also referred to as strong ties.
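The ratio described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: φ is taken to be the sum of the Opsahl node measure over all nodes, the graph is undirected with each pair listed once, and edge weights are interaction counts of at least 1. With a node-wise sum every edge is counted at both endpoints, but that factor cancels in the ratio.

```python
from collections import defaultdict


def phi(edges, alpha):
    """Sum over nodes of k_i^(1 - alpha) * s_i^alpha (assumed form of the network-wide sum)."""
    k = defaultdict(int)      # binary degree per node
    s = defaultdict(float)    # weighted degree per node
    for (u, v), w in edges.items():
        for node in (u, v):
            k[node] += 1
            s[node] += w
    return sum(k[n] ** (1.0 - alpha) * s[n] ** alpha for n in k)


def psi(edges):
    """Network intensity: overall edge weight relative to the binary edge count."""
    return phi(edges, 1.0) / phi(edges, 0.0)


star = {("A", "B"): 4, ("B", "C"): 4}                     # hypothetical weights
triangle = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 2}  # hypothetical weights
print(psi(star), psi(triangle))
```

Because the double counting cancels, `psi` equals the mean edge weight, so a network in which every pair interacted exactly once yields the minimum value of 1.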
In this work, we did not take the direction of the interactions into account, yet clearly, the intensity index can be computed for in-degrees and out-degrees separately. In organizations, for example, these correspond to those disseminating information and those on the receiving side; in online forums, to conversation initiators and responders, respectively.

Temporal network intensity index
The intensity level can be computed over the entire timeline of a network. To understand how the intensity changes with time or in response to events, a temporal definition is needed. We therefore propose a temporal index of intensity. Formally, we propose a measure of Temporal Network Intensity as follows:
$$\psi(G_\tau) = \frac{\varphi_{\alpha=1}(G_\tau)}{\varphi_{\alpha=0}(G_\tau)} \qquad (5)$$
where $G_\tau$, $\tau \in [1..T]$, is a sequence of graphs representing consecutive network snapshots in a period $T$.
Interactions indicate how information flows in a network. Understanding the flow of information in a network over time is fundamental in the research of social networks and organizations. The proposed temporal intensity metric enables an additional layer of knowledge on the flow of information, as it gives a measure of volume. It captures interactions occurring during a measured period that do not change the structure but still carry additional information on the complex system's behavior. It thus enriches our understanding of the network's temporal complexity. For example, today's organizations are in a constant state of change (Burke 2017). Following the temporal intensity of the interactions in an organization can help in identifying fluctuations in the level of intensity in the organization before, during, and after a planned organizational change.
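A temporal index of this kind can be sketched by bucketing timestamped interactions into consecutive snapshots and computing the intensity per snapshot. The sketch below assumes weekly snapshots, undirected edges, and an edge weight equal to the number of interactions between a pair within the week; the function name and the `(timestamp, sender, receiver)` record layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def weekly_intensity(interactions, start):
    """Map week index -> average edge intensity of that week's snapshot.

    interactions: iterable of (timestamp, sender, receiver) records
    start:        datetime marking the beginning of week 0
    """
    snapshots = defaultdict(lambda: defaultdict(int))
    for ts, u, v in interactions:
        if u == v:
            continue                       # ignore self-interactions
        week = (ts - start).days // 7      # index of the weekly snapshot
        edge = tuple(sorted((u, v)))       # undirected: (u, v) and (v, u) coincide
        snapshots[week][edge] += 1         # weight = interaction count in the week
    # Intensity per snapshot: total weight divided by the number of edges.
    return {week: sum(e.values()) / len(e) for week, e in sorted(snapshots.items())}


start = datetime(2001, 1, 1)
log = [
    (datetime(2001, 1, 1), "a", "b"),
    (datetime(2001, 1, 2), "b", "a"),
    (datetime(2001, 1, 3), "a", "c"),
    (datetime(2001, 1, 9), "a", "b"),
]
print(weekly_intensity(log, start))
```

Weeks with no interactions simply do not appear in the result, so a caller comparing consecutive weeks may want to handle such gaps explicitly.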

Network dominance inequality index
Complex networks are heterogeneous with a few dominant nodes. We explore here the measure and extent of this inequality. Measuring the disparity in the level of communication, for example, enables an understanding of the variance in the level of members' engagement in a network.
We continue to study the inequality in nodal dominance in a graph while considering the intensity of nodes' interactions. In organizations, for example, when a change is introduced, intense interactions can be found among its supporters and opponents. Members that have yet to make up their minds would exhibit less intensity in their interactions (Burke 2017). In this case, understanding the level of inequality in the intensity of the participation can aid in understanding the balance between change-involved members and those who are not.
We measure the inequality in nodal interaction dominance utilizing the Gini inequality index (Gini 1921; Atkinson 1970), originally used for measuring income inequality. The Gini index is a measure of the mean absolute difference, and in our case, the difference is in nodal engagement, i.e., weighted degree. To follow the temporal changes of dominance in a network, we use a temporal measure of this index per period, which we term Dominance Inequality.
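The Gini index over weighted degrees can be sketched directly from its mean-absolute-difference form. The text does not state which normalization was used, so the standard population form, the sum of all pairwise absolute differences divided by 2·n²·mean, is assumed here; the strengths in the example are hypothetical.

```python
def gini(values):
    """Gini index as the mean absolute difference normalized by twice the mean."""
    xs = list(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # no nodes or no activity: treat as perfectly equal
    # Sum over all ordered pairs; note n^2 * mean == n * total.
    pairwise = sum(abs(a - b) for a in xs for b in xs)
    return pairwise / (2 * n * total)


# Hypothetical weighted degrees for one weekly snapshot.
print(gini([4, 8, 4]))   # one node twice as engaged as the others
print(gini([4, 4, 4]))   # equal engagement
```

The double loop is quadratic in the number of nodes, which is fine for a sketch; for large snapshots a sorted, linear-time formulation would be the usual substitute.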

Measuring temporal intensity and dominance inequality in real networks
To measure our temporal indices in real networks we collected six different datasets of real networks of interactions. We concentrate on contact networks, i.e., organizational emails, online forums, and messaging applications, as detailed in Table 1. To capture the longitudinal evolution of the system we follow the modeling of temporal sequential periods similarly to Pan and Saramäki (2011) and divide the temporal information into a sequence of networks, each portraying a snapshot of the system during a week.

Robustness of the temporal network intensity
For each of the datasets described in Table 1 we calculate the weekly temporal network intensity, as defined in Eq. 5. In Fig. 2 the x-axis denotes the timeline, which is different for each network. The blue dots correspond to the calculated temporal network intensity, and their values are denoted by the y-axis. The network size, as measured by the number of weekly nodes, is denoted by light grey, and its scale is denoted by the right y-axis. Surprisingly, the networks exhibit a rather stable temporal behavior in their intensity, regardless of the fluctuations in size. The Facebook network, in the lower left panel, shows a steady increase in network size from several hundred up to more than 10000 weekly participants. Still, the average temporal intensity is quite robust. In the upper left panel we see the temporal intensity of the AskUbuntu forum over time. The average intensity hardly changes in time, despite large fluctuations in the number of participants in the discussions over the different periods. Similar results are seen for the WikipediaConflict dataset, in the left middle panel. Interestingly, in the Wikipedia talk network, in the upper right panel, we see that weeks that have spiked in the number of participants are somewhat less intense, on average. It is interesting that despite the spike in general interest during these weeks, the average intensity of conversation has not increased, and is even lower. It is also interesting to note that although the Intensity is not bounded in value, in all these networks the average intensity is low (minimal intensity is calculated from zero as explained above).
To understand the exact nature of the temporal network intensity in networks we continue to plot the PDF of the Temporal Network Intensity as described in Eq. 5, i.e., with a minimal intensity of 1, for each network, as shown in Fig. 3a. The measured networks have a clearly normal-like distribution of intensity level over the consecutive weeks of activity. We continue to examine the average temporal change between consecutive weeks by computing the percentage of change between every two consecutive weeks. Figure 3b denotes the cumulative distribution of the relative change in the measured Temporal Network Intensity between every two consecutive weeks for each dataset. For both AskUbuntu and Facebook, a change of less than 0.05 in the temporal activity accounts for more than 90% of the consecutive weeks. The network of EU emails is almost as stable. In all networks, however, more than 80% of the changes are of less than 15%.
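The week-over-week comparison described above can be sketched as a relative change between consecutive index values, followed by the fraction of changes that fall below a bound. The exact definition of relative change is not given in the text; the absolute difference divided by the earlier week's value is assumed here, and the series is made up for illustration.

```python
def relative_changes(series):
    """|x[t+1] - x[t]| / x[t] for every pair of consecutive weeks (zero baselines skipped)."""
    return [abs(b - a) / a for a, b in zip(series, series[1:]) if a != 0]


def fraction_below(changes, bound):
    """Share of week-over-week changes strictly smaller than the bound."""
    if not changes:
        return 0.0
    return sum(c < bound for c in changes) / len(changes)


weekly_index = [2.0, 2.2, 2.0, 3.0]   # hypothetical weekly intensity values
changes = relative_changes(weekly_index)
print(changes)
print(fraction_below(changes, 0.15))
```

Evaluating `fraction_below` over a grid of bounds gives the empirical cumulative distribution that a plot such as Fig. 3b summarizes.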

Robustness of temporal dominance inequality
Similarly to the Temporal Network Intensity, we plot in Fig. 4 the values of the Temporal Dominance Inequality for the networks. First, it is interesting to note that the measured inequality values are somewhat low. A random network (Erdös and Rényi 1960) would yield very low inequality values, as all nodes have a similar chance of communicating, and a pure Preferential Attachment (PA) (Barabási and Albert 1999) network would give a very high inequality value. As the measured networks are known to be heterogeneous, we expect the inequality to be rather high. The somewhat low value of inequality can be attributed to the fact that we measure the undirected graph. Indeed, when measured over a directed graph the inequality results were higher. More importantly, though, also for this metric the temporal results are mostly stationary for each network, and differ between the networks. Figure 5a denotes the Temporal Dominance Inequality probability distribution, and Fig. 5b denotes the cumulative distribution of the changes in the measured Temporal Dominance Inequality between every two consecutive weeks, for each dataset. In almost all datasets the change in the dominance inequality between the weeks is very low.

A network nearing its end: enron emails
An important question is how the devised metrics would behave for the Enron dataset. We deviate here for a paragraph to give the needed background on the once billion-dollar company known for its bankruptcy in December of 2001 and its disintegration in the following year. Enron, originally a gas company, "created Enron Online (EOL) in October 1999, an electronic trading website that focused on commodities. Enron was the counterparty to every transaction on EOL; it was either the buyer or the seller... When the recession hit in 2000, Enron had significant exposure to the most volatile parts of the market. By the fall of 2000, Enron was starting to crumble under its own weight" (Segal 2019). Shortly after its demise, the company's entire email exchange was released by a judge's order. Given the known set of events and their timeline, and the availability of the entire management email corpus, Enron's emails are used for change point detection algorithms, which compare their detected events with actual ones (Peel and Clauset 2015; Miller and Mokryn 2018).
We checked our indices over the Enron management emails, on a weekly basis, from August 2000 to March 2002. Our intriguing results, presented in Fig. 6a, show that the Enron network is different from the other networks examined in terms of the range of the Temporal Network Intensity index and the percentage of changes measured in the index. The network displays Temporal Network Intensity in the range of 3.0 to 12.0, well above the index range for the other networks. In addition, the index volatility is very high and the changes between weekly measurements are large. The Temporal Dominance Inequality, as presented in Fig. 6b, while similar in range to that of other networks, also shows high volatility compared to the other networks.
Overall, during the entire checked period both the Temporal Network Intensity and the Dominance Inequality indices exhibit large weekly changes that, unlike those of the rest of the networks, cannot be defined as stationary.

Predicting a network from its engagement
The networks examined were characterized by stability in two selected indices: the Temporal Network Intensity (activity) index and the Dominance Inequality (Gini) index. This stability appears both in the range of the values measured for each network over time and in the level of changes within the indices between successive periods. In measuring the distribution of the percentage of changes between successive periods, it appears that volatility of up to 0.5 in the Temporal Network Intensity covers over 90% of the network's operating time. Similar results were obtained for the Dominance Inequality index. The values measured for the indices between networks, however, differ by 0.4 to 0.7.
To examine how typical the Temporal Network Intensity and the Temporal Dominance Inequality indices are for each network, we perform a classification task over the temporal indices with the target of identifying the class (dataset) that produced them. We perform the experiment over all seven datasets that appear in Table 1.
Classification methodology: Our target function is to classify into seven different classes, each corresponding to a network dataset. Hence, we decided to avoid binary classifiers, like the support vector machine, which would require additional schemes such as "one-vs-one" or "one-vs-all" to compare classification efficiency and calculate the overall confusion matrix. Also, as our features have only two dimensions (Inequality, Activity), we skipped classifiers that focus on dimension reduction, such as neural networks. We therefore chose three multiclass classifiers. All are well-known, robust, yet simple classifiers. The first is the K-Nearest Neighbors (KNN) classifier. Several well-known algorithms implement KNN, such as brute force, K-D tree, and more. We utilized (Muja and Lowe 2009) to automate the algorithm configuration. In addition, we also ran a Decision Tree (Breiman 2017) and Random Forests (Breiman 2001). We ran the classification using Python Scikit-learn (Pedregosa et al. 2011) with five-fold cross-validation (Kohavi et al. 1995) and calculated the Precision, Recall, Accuracy, and F1. We summarize the results in Table 2.

Fig. 6 The cumulative distribution of the relative change between every two consecutive weeks for each dataset as measured for (a) Temporal Network Intensity and (b) Dominance Inequality

Our classes (datasets) were not equal in size when considering the number of weeks (see Table 1). We therefore employed two known balancing techniques. The first is replicating the samples of the smaller datasets to balance the size of each class; the other is Stratified Folds, which preserves the probability distribution of each class across all folds (Kohavi et al. 1995).
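A classification setup of this shape can be sketched with scikit-learn as follows. Everything about the data is an assumption made for illustration: the three classes, their centers, and their sizes are synthetic stand-ins for the weekly (intensity, inequality) tuples, and the hyperparameters and macro averaging are defaults chosen here, not the configuration used in the experiments. Only the stratified-fold balancing is shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic weekly (intensity, inequality) tuples for three hypothetical networks.
centers = {"net_a": (1.5, 0.30), "net_b": (2.5, 0.50), "net_c": (6.0, 0.40)}
weeks = {"net_a": 120, "net_b": 60, "net_c": 40}   # deliberately unbalanced classes
X = np.vstack([
    rng.normal(loc=centers[name], scale=(0.15, 0.03), size=(weeks[name], 2))
    for name in centers
])
y = np.repeat(list(centers), [weeks[name] for name in centers])

# Stratified folds keep each class's share of the samples in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
classifiers = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, clf in classifiers.items():
    scores = cross_validate(clf, X, y, cv=cv, scoring=scoring)
    results[name] = {m: float(scores[f"test_{m}"].mean()) for m in scoring}
    print(name, results[name])
```

The synthetic classes are well separated by construction, so the scores printed here say nothing about the real datasets; the sketch only shows how the three classifiers, the five stratified folds, and the four metrics fit together.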
Classification Results: We present the results for each classification algorithm and each balancing method in Table 2.
All algorithms were able to infer the correct network dataset from its weekly indices with F1 in the range of [0.75, 0.85] and high accuracy over all folds. To test the dependency of the success per class, we repeatedly re-ran the tests while excluding one class (dataset) at a time, and compared the overall results. The difference in the results was insignificant across all experiments, showing that the overall result is robust across the datasets.
Figure 7 visualizes the weekly indices for each dataset while coloring each dataset differently over a planar space. The visualization indicates a limited center of mass for most networks. The Enron dataset again shows a wide variety in the indices between the weeks, and is typically much more intense than the rest of the networks.

Discussion and conclusions
In this work, we set out to understand how temporal engagement in networks changes with time. To that end, we defined two indices to capture the temporal network activity. The first, Temporal Network Intensity, can be roughly described as the average edge intensity in the network over a period. The second, the Dominance Inequality, is a measure of the engagement variance. Our surprising results are that for most email and forum networks checked, the indices were stationary, implying a steady state. For a network known to be nearing disintegration, Enron, the indices were volatile. A similar stationary value was found in Gautreau et al. (2009) for the average degree of the flux of people from airports. However, airports' physical limitations may give a plausible explanation for this measure. In the datasets examined in this work these limitations do not exist. Interestingly, both our indices can be derived utilizing the average degree. We believe that these findings need to be further researched over a wider variety of networks exhibiting different dynamics.
The robustness of the indices regardless of significant size changes of the underlying network in time is itself intriguing. For example, when the size of the network decreases in a process of preferential detachment, it is expected that the level of engagement, and hence the indices, would also be affected. We intend to further research this counterintuitive result.
We focus here on the complex temporal interactions and utilize them to gain an understanding of the system's temporal behavior. By moving from a nodal-centric view to an interaction-centric view, we suggest a novel understanding of the dynamics of complex networks. Lastly, our results show that the indices we devised fluctuated significantly in a network that was dealing with a shaky situation that led to the company's disintegration. In future research, we intend to further understand the behavior of the indices for different network models and dynamics.

Fig. 1 Two networks with three nodes and weighted edges. The width of the edges corresponds to their weights. The panels present two different interaction patterns and edge intensities

Fig. 2 Temporal average intensity for the six different datasets, denoted by the strong blue line. The values are denoted by the y-axis to the left. The light grey dashed line denotes the number of participating nodes during each period, i.e., the temporal size of the network. The size scale (i.e., number of nodes) is given by the right y-axis

Fig. 3 Aggregated network measures: (a) The PDF of the weekly Temporal Network Intensity for each dataset. (b) The cumulative distribution of the relative change in the measured Temporal Network Intensity between every two consecutive weeks for each dataset

Fig. 4 Temporal Dominance Inequality for the six different datasets, denoted by the strong blue line. The values are denoted by the y-axis to the left. The light grey dashed line denotes the number of participating nodes during each period, i.e., the temporal size of the network. The size scale (i.e., number of nodes) is given by the right y-axis

Fig. 5 Aggregated network measures: (a) The PDF of the weekly Temporal Dominance Inequality for each dataset. (b) The cumulative distribution of the relative change in the measured Temporal Dominance Inequality between every two consecutive weeks for each dataset

Fig. 7 Planar view of the weekly measured metrics, Activity Intensity (x-axis) and Dominance Inequality (y-axis), for all datasets

Table 1
Real-World datasets descriptions

Table 2
Classification results according to the Intensity and Inequality features