
Characterizing reticulation in online social networks during disasters

Abstract

Online social networks have become a new form of infrastructure through which communities spread situational information in disasters. Developing effective interventions to improve the performance of these networks in information diffusion is essential for people to rapidly retrieve information when coping with disasters and subsequent disruptions. Existing studies have investigated multiple aspects of online social networks in stationary situations and in isolation from one another. However, the networks are dynamic, and their properties are interrelated in evolving disaster situations. In particular, disaster events motivate people to communicate online and to create and reinforce their connections, leading to a dynamic reticulation of the online social networks. To understand the relationships among these elements, we propose an Online Network Reticulation (ONR) framework that examines four modalities (i.e., enactment, activation, reticulation, and network performance) in the evolution of online social networks in order to analyze the interplay among disruptive events in disasters, user activities, and information diffusion performance on social media. Accordingly, we examine the temporal changes in four elements that characterize reticulation: activity timing, activity types (post, share, reply), reticulation mechanism (creation of new links versus reinforcement of existing links), and structure of communication instances (self-loop, converging, and reciprocal). Finally, the aggregated effects of network reticulation are examined, using an attributed network-embedding approach, through the average latent distance among users as a measure of network performance for information propagation. The application of the proposed framework is demonstrated in a study of network reticulation on Twitter for a built environment disruption event during 2017 Hurricane Harvey in Houston. 
The results show that the main underlying mechanism of network reticulation in evolving situations was the creation of new links by regular users. The main structure for communication instances was converging, indicating communication instances driven by information-seeking behaviors in the wake of a disruptive event. With the evolution of the network, the proportion of converging structures to self-loop and reciprocal structures did not change significantly, indicating the existence of a scale-invariance property for network structures. The findings demonstrate the capability of the proposed online network reticulation framework for characterizing the complex relationships between events, activities, and network performance in online social networks during disasters.

Introduction

Communication for making sense of disaster situations is important for people at risk to take protective actions (Fischer-Preßler et al. 2019). Social media has increasingly become a vital infrastructure for communities facing disasters and crises because social media allows people to rapidly disseminate situational information (Sutton et al. 2015), perceive environmental risks (Kryvasheyeu et al. 2016), and collaborate with other users (Lu et al. 2012). This benefits collective sense-making in a communication network involving people affected by the same disruptive event (Heverin and Zach 2012). Hence, to enhance collective sense-making in emergencies, there is a need to characterize the underlying mechanisms and modalities in communication networks through which information about unfolding disaster situations is processed. Accordingly, the field of crisis informatics has grown over the past few decades to develop computational support for data collection and user-behavioral analysis on social media to address challenges in disaster response and recovery (Anderson 2012). For example, one emerging stream of studies in crisis informatics examines the dynamics in online social networks to foster information dissemination (Del Vicario et al. 2016; Morone and Makse 2015), cooperative human actions (Jackson et al. 2018; Kogan et al. 2015), and emergency protocols for disaster response (Bagrow et al. 2011; Méndez-Valderrama et al. 2018) on social media. In fact, the spread of situational information is an outcome of the structural properties of online social networks, which are shaped by dynamic user behaviors such as quoting posts and replying to messages (Kim and Hastak 2018; Romero et al. 2011). In addition, as suggested in social science theories, this dynamic user behavior is stimulated by emergent changes in users’ physical environment, such as overwhelming flooding and abrupt building damage (Bail 2016). 
Thus, to improve the understanding of network dynamics, it is essential to consider the interplay among the disaster events, dynamic user behaviors, and online social network structures.

However, the study of social media as a tool to communicate with other users and disseminate information has been more prevalent than the study of the dynamic interactions among events, behaviors, and networks in disasters. Studies related to information dissemination over online social networks, ranging from the interaction of communities (Kumar et al. 2018), to the extent of user engagement (Hu and Farnham 2015; Zhang et al. 2017), to structural patterns of dynamic cascades (Weng et al. 2013; Zang et al. 2017), provided powerful measurements for characterizing the dynamic information flow on social media. For example, Lu and Brelsford analyzed the community evolution in online social networks during the 2011 Japanese Earthquake and Tsunami (Lu and Brelsford 2014). The results show that users tend to stay in their own communities to get disaster-related information, but the study does not explain the impacts of disaster events on the activity tendencies of users within and among online communities. Kryvasheyeu et al. studied dynamic topological properties such as users’ network centrality to measure the performance of social networks in sensing the situations in Hurricane Sandy (Kryvasheyeu and Chen 2014). The study emphasized the strategies of communicating with different users from disaster-hit areas on social media to achieve situational awareness advantages. However, the researchers did not capture the underlying triggers (e.g., disruptive events in the built environment) of the topological changes in online social networks. Thus, it is difficult to derive a robust communication strategy for people facing different disaster events. Bagrow, Liu, and Mitchell demonstrated the importance of social ties for user activities by measuring information flow over online social networks (Bagrow et al. 2019). A limitation of the study is that the researchers did not consider the environmental events that provide sources of information. 
Another stream of studies focuses on the evolution of the structures of online social networks. For example, Sekara et al. characterized the fundamental structure of dynamic social networks to capture its evolution in different social contexts such as teamwork and individual lives (Sekara et al. 2016). The analysis mathematically reveals the structural patterns of the networks in a very detailed manner. However, the study did not specify the user activities that lead to the dynamics in network structures. Phan and Airoldi conducted a long-term natural experiment of friendship formation and social dynamics on Facebook in the aftermath of a natural disaster (Phan et al. 2015). The analysis indicates users’ preferences for strengthening social interactions but did not explain further influences on the efficiency of information diffusion. In addition, Kwak et al. studied the pattern of information sharing on Twitter and showed that tweets with headline news or persistent news are trending tweets with fast diffusion (Kwak et al. 2010). They demonstrated the power of Twitter as a new medium of information sharing but did not examine the network structures that are formed by the cascades and that build up the power of Twitter in information diffusion. Weng, Menczer and Ahn presented a model to predict the diffusion of memes and behaviors based on their early spreading patterns in online social networks. However, the model is limited in its ability to reveal the underlying factors, such as social reinforcement, that affect the diffusion process.

The current studies, to some extent, advanced the understanding of social behaviors and network dynamics in disasters. Despite these studies (Fan et al. 2020b), there is an important gap in theoretical frameworks for characterizing the dynamics of collective sense-making in online social networks by considering the complex relationships among disaster-induced disruptive events, user activities on social media, the transformation of network structure affected by user activities, and outcomes of social networks in terms of improving information propagation and collective actions in response to disruptions. To address this gap, the objective of this study is to employ a theoretical network reticulation framework to characterize different modalities influencing the dynamics of online social networks. Unlike the previous models and analyses described above, the proposed framework provides a more integrative and realistic description of the process by which user activities unfold, network structures evolve, and information propagates. Environmental effects such as infrastructure disruptions can also be accounted for as factors that prompt online users to engage in dynamic activities to spread situational information. In addition, the network performance for information spread plays a role in enhancing the capabilities of disaster-hit populations in perceiving potential risks and building community resilience.

Online network reticulation framework

Communication theories such as Corman’s network reticulation theory (NRT) (Corman and Scott 1994), Giddens’ structuration theory (Giddens 1984), and Homans’ theory of the human group (Homans 1950) have examined some concepts related to communication networks, such as triggering events, activities, and communication instances (Fan et al. 2020a). However, these studies primarily focus on individual modalities of communication networks. Little is known about the complex relationships among disruptive events, user activities, network structures, and outcomes in online social networks. NRT, proposed by Corman and Scott (Corman and Scott 1994), provides an integrative theoretical lens for examining the modalities that affect the reticulation of communication networks and for explaining the relationships among external events, activities in social networks, communication instances, and performance in social networks. In this study, we extend and employ the standard NRT for characterizing the dynamics of online social networks in disasters and crises.

The online network reticulation (ONR) framework (Fig. 1) characterizes the dynamics of online social networks with four modalities: enactment, activation, reticulation, and network performance.

  • Enactment is promulgation over social media platforms of triggering events: in this study, built environment disruptions such as power outage, road closure, and flooding caused by a natural disaster. Triggering events often occur abruptly, promoting social media user activity—sharing information about the impact of events and adjustment response.

  • Activation modality comprises user activities: posting, sharing, and responding to information about disruption events. Specifically, activities such as posting, sharing, and responding to situational information stimulate the emergence of communication instances regarding event-related topics among groups of users.

  • Reticulation modality is characterized by communication instances among groups of online social network users. The structural properties of communication instances and reticulation mechanisms are important elements of the reticulation modality.

  • Network performance modality is the outcome of user activities and communication instances that influence information propagation and collective action in response to disruptive events. In the proposed framework, we examine network performance modality as a function of information diffusion efficiency based on the average latent distance between the users in the network. Information propagation efficiency is an indicator of social network performance.

Fig. 1

Online network reticulation framework for characterizing dynamics of online social networks. This framework is composed of four modalities: enactment, activation, reticulation, and network performance, which can capture the dynamic behavior of users on social media

The four modalities in the proposed framework interact with each other, and their interactions are represented by the bi-directional arrows in the schema. Specifically, the occurrence and evolution of triggering events in the enactment modality trigger human activities (e.g., communication and information sharing) in the activation modality. Human activities in response to the disruptions (such as relief campaigns) also affect the unfolding of the events in the enactment modality (Chen et al. 2019). Human activities form the communication networks with dynamic reticulation structures in the reticulation modality. The reticulation structure of the communication networks determines the types and timing of human activities in the activation modality, and further influences the efficiency of information sharing in the network performance modality.

In addition, each modality in the network reticulation framework is characterized by a pair of elements: triggers/perturbation, timing/types of activities, instances/reticulation, and information propagation/average distances. The pairs of elements signify the duality of the framework, which accommodates a distinction between an abstracted structural network and a concrete systemic phenomenon. Generally, the enactment modality refers to the events occurring in the physical environment (i.e., triggering events). The activation modality is defined as the human activities triggered by physical events. Then, the reticulation modality represents the structure of the communication networks, which enables the connections among affected people. The network performance modality comprises the indicators of network performance, including user distance and information sharing. This framework explains the underlying mechanism of human communication networks on social media in response to physical disruptions. The four modalities and their relationships enable analyzing the dynamics of information sharing and collective sense-making in emergencies.

Using the proposed framework with a Twitter dataset from Hurricane Harvey, we first identified the triggering events in communities (such as disruptions in the built environment), and defined rules (i.e., geographic scales and event-related keywords) for filtering data for analysis. Once the data was prepared, we examined the hourly volume of relevant tweets before and during the disruption to evaluate the activation modality. Accordingly, we investigated the types of user activities (post, share, respond) and temporal tendency in the activities during three time periods (rising, peak, and declining). We then characterized network reticulation to expose the underlying mechanisms (i.e., new link creation versus existing link reinforcement) of user communications. Link, in this case, refers to the communication relations created by sharing tweets from other users. We defined three structures (converging, reciprocal, and self-loop) to characterize communication instances. Finally, we implemented an attributed network embedding approach to represent the online social network and quantify its efficiency for information propagation (as an outcome of user activities and communication instances), as described below.

To demonstrate the application of the proposed ONR framework, we focus on modeling human activities on Twitter during disaster disruptions. An approach that jointly captures semantic similarity and relations, proposed by Aral et al. (Aral et al. 2009), was used to characterize human communication behaviors on social media. Semantic similarity refers to the similarity of the content of the tweets, and relations refers to the sharing relations created by retweets. To represent the online social network, we employed an attributed network embedding approach to integrate the semantic attributes of online users and their networked relations into a latent space (Fan et al. 2020; Huang et al. 2017a). The first step is to construct a content vector for each active user based on the user’s daily posts. Specifically, we aggregated the tweets posted each day by a user into a single document, so that each user has their own tweet document. We cleaned the tweets using preprocessing steps including tokenization, word stemming, removal of uninformative characters (e.g., “!”, “@”, and URLs), and removal of stop-words and words shorter than three characters. Then, we adopted the term frequency-inverse document frequency (TF-IDF) approach to convert each tweet document to a vector in which each element corresponds to a token and reflects how often that token appears in the user’s tweet document, weighted by how rare the token is across all users’ documents. By doing so for all users, the user content vectors were obtained (see Fig. 2). Once the vector for each active user was determined, we used the cosine similarity method to compute the semantic similarities among all users and construct a pair-wise semantic similarity matrix S:

$$ \cos\left(\theta\right)=\frac{A\cdot B}{\left\Vert A\right\Vert \left\Vert B\right\Vert}=\frac{\sum_{i=1}^{k}a_i b_i}{\sqrt{\sum_{i=1}^{k}a_i^2}\times \sqrt{\sum_{i=1}^{k}b_i^2}} $$
(1)

where A and B are the vector representations of two users’ tweet documents, \( a_i \) and \( b_i \) are the values of the ith elements of vectors A and B, and k is the length of the vectors.

Fig. 2

A scheme of the attributed network embedding approach
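The content-vector and similarity steps can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the tiny stop-word list, and the smoothed IDF weighting are assumptions made for illustration, and word stemming is omitted for brevity.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real list would be longer).
STOP_WORDS = {"the", "and", "for", "are", "was", "from", "with", "this", "that"}


def tokenize(document):
    """Lowercase, drop URLs and non-letter characters, stop-words, and short words."""
    text = re.sub(r"https?://\S+", " ", document.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if len(t) >= 3 and t not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return (vocabulary, one TF-IDF vector per document).

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is one common
    variant; the weighting used in the study is not specified in the text.
    """
    tokenized = [tokenize(d) for d in documents]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(documents)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([
            counts[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t in vocab
        ])
    return vocab, vectors


def cosine(a, b):
    """Cosine similarity as in Eq. (1); defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pair-wise semantic similarity matrix S over all user vectors."""
    return [[cosine(a, b) for b in vectors] for a in vectors]
```

Each input document stands for one user's aggregated daily tweets, so row i of the resulting matrix holds the similarity of user i to every other user.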

Second, to capture the retweeting behaviors among the users, we extracted the pairs of user IDs from all retweets. In this step, we identified the users who retweeted a post and the users who posted the original tweets. Then, we built a user-by-user matrix in which the rows and columns represent all users, and each element represents the number of retweets between the corresponding pair of users. We consider this retweet relationship as the communication relation among social media users, and the strength of a relation is defined as the corresponding value in the retweet matrix. By examining the retweet matrix, we found that the numbers of retweets vary greatly among pairs of users, although most relations are relatively weak. The matrix is sparse: most of the users did not retweet any other users’ tweets, but some users retweeted many posts from a specific user. These extreme values would distort the distance between two users in the latent space and skew the embedding towards the large retweet numbers. To reduce the skewness, the hyperbolic tangent function (Xiao et al. 2005) is adopted to map the retweet numbers into a bounded interval in which all extreme values approach the maximum of the interval. In other words, because using the raw relation strengths would introduce outliers into the embedding matrix, we scaled down the exceptionally strong relations and relatively emphasized the differences among the weak relations by using the hyperbolic tangent. Using the outputs from the hyperbolic tangent function, we created the adjusted retweet relation matrix \( \overset{\sim }{R} \) from the raw retweet matrix R:

$$ \tilde{r}_{ij}=\tanh\left(r_{ij}\right)=\frac{\sinh\left(r_{ij}\right)}{\cosh\left(r_{ij}\right)}=\frac{e^{2r_{ij}}-1}{e^{2r_{ij}}+1} $$
(2)

where \( r_{ij} \) represents the communication frequency attribute, defined as the value at the ith row and jth column of the matrix R; \( \tilde{r}_{ij} \) represents the scaled retweeting attribute, defined as the value at the ith row and jth column of the matrix \( \overset{\sim }{R} \); and \( \overset{\sim }{R} \) is the adjusted matrix of communication relations.
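As a rough illustration of this step, the sketch below counts retweets per ordered user pair and applies the hyperbolic tangent of Eq. (2). The sparse dictionary representation, the (retweeter, original author) pair format, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def retweet_counts(retweets):
    """Count retweets per ordered pair (retweeter_id, original_author_id).

    Only non-zero entries of the user-by-user matrix are stored, which suits
    a sparse matrix in which most pairs never interact.
    """
    return Counter(retweets)


def scale_relations(counts):
    """Apply Eq. (2): map each raw count r_ij to tanh(r_ij), which lies in (0, 1)."""
    return {pair: math.tanh(n) for pair, n in counts.items()}
```

Because tanh(1) is about 0.76 and tanh(3) is about 0.995, counts above roughly three become nearly indistinguishable after scaling, which is the intended damping of extreme values.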

After creating the user semantic similarity matrix S and the adjusted retweet relation matrix \( \overset{\sim }{R} \), we adopted the attributed network embedding approach to integrate these two matrices and project them into a latent space with reduced dimensions. The output of this step is the hidden matrix H in the latent space (see Fig. 2). The loss function for the network embedding approach is shown in Eq. (3), which consists of two terms: node proximity in semantic attributes and node proximity in networked relations (Huang et al. 2017b). The first term on the right-hand side of the loss function makes the similarities reconstructed from the embedding representation matrix, \( H{H}^{\intercal} \), as close to the semantic similarity matrix S as possible; the second term ensures that the difference between the embedding representations of two users is small when they have a strong communication relation. As such, the embedding representation matrix H can be generated in a unified, robust, and informative space.

$$ J={\left\Vert S-H{H}^{\intercal}\right\Vert}_F^2+\lambda \sum \limits_{\left(i,j\right)\in \varepsilon }\tilde{r}_{ij}{\left\Vert {h}_i-{h}_j\right\Vert}_2 $$
(3)

where λ is the regularization parameter that controls the trade-off between how well the embedding matrix H represents semantic similarity and how well it represents communication relations; \( h_i \) is the vector at the ith row of matrix H, representing the embedding vector for the ith user; \( \varepsilon \) is the set of user pairs over which the sum runs (i.e., pairs linked by a communication relation); \( {\left\Vert \cdot \right\Vert}_2 \) denotes the l2-norm of a vector; and \( {\left\Vert \cdot \right\Vert}_F \) denotes the Frobenius norm of a matrix. Adopting the algorithm developed by Huang et al. (2017a) and setting the embedding dimension to 2, we obtain a two-dimensional representation matrix H for each day. That is, the representation vector h for each user has length 2.
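To make Eq. (3) concrete, the following sketch evaluates the loss J for a given embedding and also computes a mean pair-wise distance between users in the latent space. It only evaluates the objective and does not reproduce the optimization algorithm of Huang et al. Reading the average latent distance as the mean Euclidean distance over all user pairs is an assumption, as are the function names and the dictionary format of the relations.

```python
import math


def embedding_loss(S, H, relations, lam):
    """Evaluate Eq. (3): ||S - H H^T||_F^2 + lam * sum r~_ij * ||h_i - h_j||_2.

    S         : n x n semantic similarity matrix (list of lists)
    H         : n x d embedding matrix (list of lists)
    relations : dict mapping index pairs (i, j) to scaled strengths r~_ij
    lam       : regularization parameter (lambda in Eq. 3)
    """
    n = len(H)
    attribute_term = 0.0
    for i in range(n):
        for j in range(n):
            reconstructed = sum(a * b for a, b in zip(H[i], H[j]))
            attribute_term += (S[i][j] - reconstructed) ** 2
    relation_term = sum(
        weight * math.dist(H[i], H[j]) for (i, j), weight in relations.items()
    )
    return attribute_term + lam * relation_term


def mean_latent_distance(H):
    """Mean Euclidean distance over all unordered user pairs (assumed definition)."""
    n = len(H)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(math.dist(H[i], H[j]) for i, j in pairs) / len(pairs)
```

Under this reading, a smaller mean latent distance on a given day would indicate that users are closer in both content and retweet relations.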

Enactment modality

In this section, we examine the enactment modality, which includes examining the triggering event and its temporal period.

Triggering event

Hurricane Harvey, a Category 4 hurricane, made landfall on the Texas coast on August 26, 2017, and brought extreme rainfall to Houston (Sebastian et al. 2017). Water levels in Barker and Addicks reservoirs and Buffalo Bayou (at the outlet of these reservoirs) reached their capacities (Fig. 3). To prevent the reservoirs from breaching, the authorities decided to release water from the reservoirs into the downstream neighborhoods (Flood Control District 2017). Water released from the reservoirs flooded nearby neighborhoods and caused severe damage to households, roads, and emergency facilities (Fan et al. 2018; Fan et al. 2019). This event, which occurred without any warning to residents, triggered activities on Twitter in which users shared and sought information regarding the status of the water release, its impacts, and the response.

Fig. 3

Inundation levels around Barker and Addicks reservoirs and Buffalo Bayou, which is the area affected by this built-environment disruption event. The water level is recorded at 12:00 p.m. each day, and the top of the spillway/bank is the baseline for measurement

Data collection and preprocessing

To model human activities in disaster disruptions, we collected a dataset of tweets sent from the Houston metropolitan area between August 22 and September 30, 2017, using profile locations and bounding boxes (Fan and Mostafavi 2019). The dataset includes tweets posted by online users whose profile locality is Houston, tweets with Houston geotags, and tweets with geo-coordinates in predefined bounding boxes. Based on these filtering rules, the full dataset comprised 21 million tweets. The focus of our study was to examine the dynamics in online social networks during the water release event from two flood-control reservoirs, Addicks and Barker, in West Houston. This event spanned the period from August 27 through September 5, 2017. The release of water from these reservoirs tends to impact only nearby neighborhoods. Online user activities such as retweeting or posting event-related information indicate the users who cared about or were affected by this disruptive event. These users are considered the users of interest in our study. Hence, we first filtered all the tweets in which the keywords “addicks”, “barker”, and “reservoir” were mentioned. Then, we extracted all the users who posted these tweets. Finally, all tweets posted by these users across the entire period of interest were collected for this study. Our filtered dataset consisted of 5865 Houston users who engaged in activities on Twitter about the disruptive event at the flood-control reservoirs, 209,370 posts, 194,425 replies, and 1,166,956 shared tweets. The experiment protocols were approved by the university’s Research Compliance Committee, Institutional Review Board (IRB). All research was conducted in accordance with the guidelines and regulations of the IRB. The data does not contain any identifying information and was used anonymously. The data usage permission was obtained from Twitter prior to conducting the experiments.
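The two-stage filtering described above (keyword match first, then all tweets by the matched users) can be sketched as follows. The tweet record layout (`user_id` and `text` fields), the function names, and the choice of case-insensitive substring matching on any one keyword are assumptions made for illustration.

```python
# Keywords reported in the text; matching is assumed to be case-insensitive.
KEYWORDS = ("addicks", "barker", "reservoir")


def users_of_interest(tweets, keywords=KEYWORDS):
    """Return the IDs of users who posted at least one tweet mentioning a keyword."""
    return {
        tweet["user_id"]
        for tweet in tweets
        if any(keyword in tweet["text"].lower() for keyword in keywords)
    }


def tweets_by_users(tweets, user_ids):
    """Collect every tweet by the selected users, whether or not it mentions a keyword."""
    return [tweet for tweet in tweets if tweet["user_id"] in user_ids]
```

The second stage deliberately keeps tweets that do not mention the event, since the analysis follows the selected users' overall activity across the period of interest.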

To examine the presence of bots and the quality of our dataset, we adopted a widely used bot classification system, Botometer, which was developed for bot detection tasks and has reported high accuracy (greater than 90%) in existing studies (Varol et al. 2017). Botometer can extract thousands of features (e.g., user metadata, friends, network structure, language and sentiment features) from Twitter accounts through the Twitter Search API and compute a bot score for a Twitter account to evaluate the extent to which the account exhibits similarity to the characteristics of social media bots (Shao et al. 2018). The bot scores range from 0 to 5. Generally, accounts with scores in the top 20% of the score range (i.e., between 4 and 5) are considered likely bots (Varol et al. 2017).

We adopted Botometer in this analysis; out of 5865 identified Twitter accounts, 593 accounts could not be inspected because they were suspended, deleted, or turned private. For each of the remaining 5272 accounts, Botometer returned a bot score estimating the level of automation in the account. Figure 4 shows the distribution of the bot scores for the Twitter accounts in our dataset. As shown in the figure, less than 1% of the Twitter accounts in our dataset are likely bots (with scores greater than 4), and about 85.8% of the accounts have scores lower than 1. By manually evaluating the account profiles and the recent tweets of some bot accounts and human accounts, we validated the classification results generated by Botometer. In addition, there are 2717 potentially bot-related tweets, which account for less than 0.2% of the total number of tweets. The results of the bot test demonstrate that most of the accounts in our analysis are human accounts. The very small proportion (less than 1%) of bot accounts does not affect our analysis and findings, since their activities are negligible compared to the activities of the real human users.
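The figures reported above can be cross-checked with simple arithmetic. The sketch below assumes that the total number of tweets is the sum of the posts, replies, and shared tweets reported for the filtered dataset; the helper `is_likely_bot` is a hypothetical name and does not call Botometer.

```python
# Account counts reported in the text.
identified_accounts = 5865
uninspectable_accounts = 593
inspected_accounts = identified_accounts - uninspectable_accounts  # 5272

# Tweet counts reported for the filtered dataset (the total is an assumed sum).
posts, replies, shares = 209_370, 194_425, 1_166_956
total_tweets = posts + replies + shares

# Share of potentially bot-related tweets, as a fraction of all tweets.
bot_related_tweets = 2717
bot_tweet_share = bot_related_tweets / total_tweets


def is_likely_bot(score, threshold=4.0):
    """Flag an account whose bot score (on the 0 to 5 scale) exceeds the threshold."""
    return score > threshold
```

Under this assumption the bot-related share comes to roughly 0.17% of all tweets, consistent with the stated bound of less than 0.2%.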

Fig. 4

The distribution of bot scores for Houston users in this study. The ratio of the users in the interval to the total number of users is calculated in parentheses

Activation modality

Activated by the disruptive event, users engaged in activities to seek and share information regarding the status of the water release, impacts, and adjustment responses. The Twitter activities tracked were posting, sharing (retweeting or quoting), and replying to a tweet public to users’ followers. The frequency and proportion of these three types of user activities change over time due to the unfolding of the reservoirs’ water release event. First, we collected tweets posted by users before and during the event. We then investigated the timing of the activities, as well as the temporal volume of tweets in each hour and the distribution of users in terms of the proportions of three types of activities.

Timing of activities

As shown in Fig. 5, the frequency of user activities (i.e., the hourly volume of tweets) varies over time. Before the disruption happened (August 24 through August 26, 2017, in Fig. 5), the magnitude of user activities was roughly steady with a slight upward trend. A burst in activity frequency started on August 28, 2017. After that, the magnitude of user activities triggered by the event became stable (Fig. 3), and finally the frequency of user activities declined (Fig. 5).

Fig. 5

The density distribution of tweeting activities per hour by active users. The y-axis is the absolute value of the tweets volume per hour. Three time periods, rising period (T1), peak period (T2), and declining period (T3), are determined by the volume of tweets with the same length of duration (3 days) in order to make the analysis more concise and intuitive

Accordingly, we determined three time periods based on the frequencies of activities (Fig. 5) and the unfolding of the disruptive event (Fig. 3). The rising period represents the period before the disruption, and the frequency of activities is rising in this period. Peak period represents the period immediately after the disruption occurred, and the frequency of activities is at peak volume. The declining period represents the period during the disruption, during which the frequency of activities starts declining. The basic information about the number of users and the volume of tweets during each period is shown in Table 1.
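The hourly tweet volume and the assignment of tweets to the three periods could be computed along the following lines. This is a sketch with hypothetical function names; the period boundaries are passed in as parameters because the exact start and end times of T1, T2, and T3 are not stated in this section.

```python
from collections import Counter
from datetime import datetime


def hourly_volume(timestamps):
    """Count tweets per hour by truncating each timestamp to the start of its hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )


def assign_period(ts, boundaries):
    """Return the label of the period containing ts, or None if it falls outside all.

    boundaries: list of (label, start, end) tuples with end exclusive, for
    example three consecutive 3-day windows for the rising, peak, and
    declining periods.
    """
    for label, start, end in boundaries:
        if start <= ts < end:
            return label
    return None
```

With the three windows supplied by the analyst, grouping the hourly counts by the returned label would give per-period volumes of the kind summarized in Table 1.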

Table 1 Basic information (number of users and number of tweets)

Types of activities

The activities (post, share, and reply) allowed users to seek and share event information. Post refers to original tweets; share refers to quoted tweets and retweets; and reply refers to answering the original tweet author or another commenter. In sharing activity, users transmit information from other users to their followers. In the replying activity, users present their opinions about the information posted by other users. Both sharing and replying activities can generate tweets with links to the original tweets; therefore, all three activities can create communication instances in online social networks. Hence, examining temporal changes in activity types is an important step in characterizing activation modality and its relationship with network reticulation.

To this end, we calculated the proportion of the three types of activities in each user’s profile and aggregated the results into a density distribution for each kind of proportion (e.g., post/share proportion) (Fig. 6). Because the number of users varies across the three periods, we normalized the user counts to allow an intuitive comparison, scaling the results in the rising and declining periods by their density ratio to the peak period. The density is in log scale, and each axis, ranging from 0 to 100, is the proportion of one activity type in a user’s overall activities. The overall distribution of the three types of activities is broadly similar across the three periods. Most users preferred to share (retweet) information rather than to post or reply. In each of the three periods, we found a small group of users who only post information (bottom left corner of each ternary plot in Fig. 6). Judging by their in-degree retweets, these users are likely information hubs (e.g., news reporters, emergency agency personnel, and public officials) who can gather and access timely situational information during the disruption and disseminate it on Twitter. Despite the similarities, the differences in activities among the three periods are more important for characterizing the activation modality. Specifically, during the disruptive event, the number of users with a high proportion of reply activities decreases significantly. Because the total number of users is normalized to be constant, the decrease on the left side of the figure leads to an increase in the density of the bottom right corner, indicating that the number of users with a high proportion of share activities increases. The results indicate that more users tend to gather and share information from other online users (rather than replying to posts) during the disruptive event. Although replies and mentions also contribute to network growth on social media in normal situations, as reported in existing studies (Abdullah et al. 2017; Kogan et al. 2015; Metaxas et al. 2015a), we find that the primary activity contributing to network growth in crisis situations is sharing. In crisis situations, people actively search for situational information and, when they find useful information, they tend to share it. Such a tendency can further increase the closeness of users in online social networks and improve the performance of networks for information diffusion.
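The per-user proportions described above can be illustrated with a minimal sketch. The record format (one `(user_id, activity)` pair per tweet) and the function name `activity_proportions` are assumptions made for illustration and are not taken from the study's pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical input: one (user_id, activity) record per tweet, where
# activity is "post", "share" (retweet or quote), or "reply".
ACTIVITY_TYPES = ("post", "share", "reply")


def activity_proportions(records):
    """Return {user_id: (post_pct, share_pct, reply_pct)} on a 0-100 scale."""
    counts = defaultdict(Counter)
    for user_id, activity in records:
        if activity not in ACTIVITY_TYPES:
            raise ValueError(f"unknown activity type: {activity!r}")
        counts[user_id][activity] += 1

    proportions = {}
    for user_id, user_counts in counts.items():
        total = sum(user_counts.values())
        # Counter returns 0 for activity types the user never performed.
        proportions[user_id] = tuple(
            100.0 * user_counts[a] / total for a in ACTIVITY_TYPES
        )
    return proportions


records = [
    ("u1", "post"), ("u1", "share"), ("u1", "share"), ("u1", "reply"),
    ("u2", "post"), ("u2", "post"),
]
props = activity_proportions(records)
print(props["u1"])  # (25.0, 50.0, 25.0)
print(props["u2"])  # (100.0, 0.0, 0.0)
```

Each tuple sums to 100, so it maps to one point in a ternary plot; a user such as `u2`, who only posts, falls in the "post" corner.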

Fig. 6

The density distribution of active users with different proportions of the three types of activities (post, share, and reply) in each period. The orientation of each tick label (e.g., 20, 40, and 60) indicates the direction along which cells share the same value of that activity type; that is, the proportion of a given activity remains constant along the direction of its tick labels. For example, the tick labels on the “Reply” side are horizontal, so the cells in each horizontal row correspond to the same reply proportion. Accordingly, the proportion of replying activity is very high at the top of the triangle and very low at the bottom.

Reticulation modality

Activities by active users on Twitter confer a structure on a large volume of communication instances. Further, these communication instances form a directed network through inter-user connections. The paths of information propagation and user connections are the primary influencers of reticulation in the network. Thus, to characterize the reticulation modality in the proposed framework, we examine the reticulation mechanisms and structural properties of communication instances.

Reticulation mechanisms

Due to the increase in information sharing on Twitter during disruptive events, the connections (i.e., communication relations or links) among users increased. Consistent with existing studies, information diffusion drives the evolution of user activities, and network structures also affect the spread of information among users (Bakshy et al. 2012; Weng et al. 2013). It is important, therefore, to study what specific changes occur on the links due to users’ activities. Two main mechanisms change the links in networks: link creation and link reinforcement (i.e., a link between two users already exists through earlier sharing or replying, and only the frequency of sharing and replying over that link rises during the period). In addition, as posited in social science theories, social influence (measured by degree centrality in networks) is an important factor driving link creation (Aral and Dhillon 2018). Thus, it is important to examine link creation and reinforcement for users with varying levels of degree centrality. Hence, to characterize the reticulation modality, we investigated the distribution of new link creation and existing link reinforcement across users with various in-degrees, where in-degree is the number of times a user is retweeted.

The results shown in Fig. 7 indicate the mechanisms of network growth and the preferential attachment of online users. First, low in-degree users (left side of the x-axis) tend to create new links in the first 24 h from the occurrence of the event, while high in-degree users (right side of the x-axis) balance link creation and reinforcement. Low in-degree users are retweeted fewer times; they are mainly regular users who do not attract much attention from others but gather information from, and share it with, other users. In contrast, high in-degree users are heavily retweeted, and the information they post becomes popular among other users. They tend to be the influencers (or information hubs) who distribute important situational information to other users. Twitter data has a major limitation in terms of identifying information transmission chains. There are two ways to characterize a user as a hub: by the number of followers or by the number of retweets. As documented in existing studies (Metaxas et al. 2015b; Stella et al. 2018), retweeting is considered a form of social endorsement of, or trust in, the information delivered in the tweets. The more retweets a user gains, the more trust the user receives from other users. In normal situations, examining following behavior seems to be the better approach, since people tend to focus on identifying individuals of interest. In the context of disasters, however, people tend to seek reliable information from users they trust and disseminate it by retweeting (C. Zhang et al. 2019). Hence, users who receive a large number of retweets play the role of information hubs in delivering reliable information to other online users (this phenomenon was examined as the emergence of influential users in a recent study; Y. Yang et al. 2019). The current study therefore identified hubs solely by the number of retweets. Accordingly, we computed the number of links for each user by parsing the metadata in retweets, and then identified the information hubs that emerged during the evolving disaster contexts.
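The distinction between link creation and link reinforcement can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the event schema `(hour, retweeter, author)`, the function name `link_changes`, and the choice to attribute each link to the retweeted author are all hypothetical.

```python
from collections import Counter

WINDOW_HOURS = 24  # window length used for both the "past" and "next" intervals


def link_changes(events, split_hour):
    """Compare the 24 h before split_hour with the 24 h after it.

    events: iterable of (hour, retweeter, author) tuples (hypothetical schema).
    Returns (in_degree, created, reinforced), each a Counter keyed by author:
      in_degree  - times the author was retweeted in the prior window
      created    - distinct links to the author first seen in the next window
      reinforced - distinct prior links to the author used again in the next window
    """
    events = list(events)
    prior_links = set()
    in_degree = Counter()
    for hour, retweeter, author in events:
        if split_hour - WINDOW_HOURS <= hour < split_hour:
            prior_links.add((retweeter, author))
            in_degree[author] += 1

    next_links = {
        (retweeter, author)
        for hour, retweeter, author in events
        if split_hour <= hour < split_hour + WINDOW_HOURS
    }

    created, reinforced = Counter(), Counter()
    for link in next_links:
        author = link[1]
        if link in prior_links:
            reinforced[author] += 1
        else:
            created[author] += 1
    return in_degree, created, reinforced


events = [
    (1, "a", "hub"), (5, "b", "hub"), (10, "a", "hub"),  # prior window
    (30, "a", "hub"),   # existing link used again: reinforcement
    (31, "c", "hub"),   # first retweet from c: new link
    (40, "c", "hub"),   # same new link, counted once
    (35, "hub", "d"),   # new link pointing at d
]
in_degree, created, reinforced = link_changes(events, split_hour=24)
print(in_degree["hub"], created["hub"], reinforced["hub"])  # 3 1 1
print(in_degree["d"], created["d"])  # 0 1
```

Plotting `in_degree` against `created` and `reinforced` on log-log axes, once with `split_hour=24` and once with `split_hour=48`, gives a scatter of the kind described for Fig. 7.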

Fig. 7

Reticulation mechanisms of network growth among users (red points represent new link creation, and green points represent link reinforcement). The x and y axes in the three panels are in base-10 logarithmic scale. The x-axis represents the in-degrees of online users in the past 24 h, and the y-axis represents the number of links created or strengthened in the next 24 h. Because the duration of each period is 72 h, two data sets are plotted in each panel (with the same color): in-degrees from hour 0 to 24 versus the number of link changes from hour 24 to 48; and in-degrees from hour 24 to 48 versus the number of link changes from hour 48 to 72. Two types of link change are measured: new links created (red dots) and existing links reinforced (green dots).

According to the results shown in Fig. 7, regular users are active in expanding their connections in order to receive information faster, while information hubs spread information to their existing connections, as well as to new ones. Thus, regular users demonstrate an information-seeking pattern, while information hubs influence network reticulation by distributing situational information about the event and its impacts. The main mechanism of network reticulation for information hubs is link reinforcement during the rising, peak, and declining periods.

Comparing link creation and reinforcement in different periods, we found that the event and user activities stimulated more link creation and reinforcement. Since the dominant mechanism for network reticulation was link creation, new information hubs emerged during the disruptive event, and these information hubs experienced a significant increase in followers in a short period of time compared to regular conditions. As shown in Fig. 7, the maximum of the logarithmic x-axis in the peak period and the declining period increased to 3.5. This indicates that there were some users whose in-degree retweets were around 10^3.5, which is ten times the maximum in-degree of users in the rising period. As displayed in Table 1, however, the number of users and tweets during the peak and declining periods is less than twice the number of users and tweets during the rising period. This result implies that, in seeking event-related information, regular users find and connect with information hubs, and the information posted and shared by information hubs is further shared by regular users. One piece of supporting evidence is the increase in followers for emerging influential users in disasters (Y. Yang et al. 2019). These users are first identified by other users based on their tweets/retweets, and people may then decide to follow them if they find the information relevant or useful. As such, a large number of new links are created during disasters/crises, rather than existing links being reinforced (the underlying network mechanism in normal situations). Hence, information hubs’ centrality and influence on network reticulation increase. Figure 7 also demonstrates that the creation of new links is the dominant mechanism in network reticulation during disasters, which also supports the active information-seeking behavior of users in a crisis context, as opposed to the strengthening of existing links, which is the network mechanism in normal situations. This finding also illustrates a human behavioral pattern: information needs in emergencies motivate link creation as well as the emergence of information hubs on Twitter.
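As a quick check on the magnitudes read from Fig. 7, and assuming the x-axis maximum of 3.5 is a base-10 exponent as the caption states, the peak in-degree and the rising-period value implied by the tenfold difference work out to roughly:

$$ 10^{3.5}=10^{3}\times 10^{0.5}\approx 3162,\qquad \frac{10^{3.5}}{10}=10^{2.5}\approx 316 $$

Under that reading, the most-retweeted users in the peak and declining periods received on the order of 3000 retweets within a 24-h window, against roughly 300 in the rising period, while the overall number of users and tweets grew by less than a factor of two.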

Structures of communication instances

The structure of communication instances is another component in the characterization of network reticulation (Tan et al. 2016; Yang and Counts 2010). Based on findings related to the dynamics of user activities and their contributions to link creation and reinforcement from previous sections, we examined structural patterns of communication instances for the reservoir water release event. Existing studies defined three structural patterns (converging, self-loop, and reciprocal) based on the silhouette of information cascades on Weibo (Zang et al. 2017). Converging represents a user with a structure that has at least two links targeting the user (excluding self-loop links); self-loop denotes a user with a structure that has at least one link starting from and ending at the user; and reciprocal represents a user in at least one structure that has two reciprocal links between two users (excluding self-loop and converging links). Similarly, we examined these structural patterns and amended them to be applicable to our Twitter data. We examined the presence of the three structures among users during the three periods and show their relative proportions in Fig. 8.
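The three structural definitions above can be expressed as a small labeling routine. This is a simplified sketch, not the authors' amended procedure: the edge format `(source, target)`, the function name `classify_structures`, and the reading of "at least two links" as links from two distinct sources are assumptions, and the exclusion of converging links from reciprocal structures is not modeled.

```python
from collections import defaultdict


def classify_structures(edges):
    """Label users by the structures present in their communication instances.

    edges: iterable of (source, target) pairs (hypothetical schema).
    Returns {user: set of labels}; users with no structure are omitted.
    """
    edge_set = set(edges)
    sources_of = defaultdict(set)  # target -> distinct non-self sources
    labels = defaultdict(set)

    for source, target in edge_set:
        if source == target:
            labels[source].add("self-loop")
            continue
        sources_of[target].add(source)
        # A reverse edge between the same two users makes the pair reciprocal.
        if (target, source) in edge_set:
            labels[source].add("reciprocal")
            labels[target].add("reciprocal")

    for user, sources in sources_of.items():
        # Converging: links from at least two distinct other users.
        if len(sources) >= 2:
            labels[user].add("converging")
    return dict(labels)


edges = [("a", "c"), ("b", "c"), ("c", "a"), ("d", "d")]
result = classify_structures(edges)
print(result["c"] == {"converging", "reciprocal"})  # True
print(result["a"] == {"reciprocal"})  # True
print(result["d"] == {"self-loop"})  # True
```

Because each user carries a set of labels, the overlaps among the three categories can be counted directly, which is the information a Venn diagram such as Fig. 8 summarizes.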

Fig. 8

The overlap and independence of three important structures of communication instances

Overall, users with a converging structure in their communication instances account for the largest proportion among all users who had a communication instance for the event. This finding is consistent with the predominant user activity, which was sharing. Users with a self-loop structure seek to promote the spread of the information they posted. This self-retweeting works in most cases (Fig. 8): users with the self-loop structure go on to build communication instances with a converging structure. In the reciprocal structure, two users mutually retweet information. Thus, users with the reciprocal structure in their communication instances affect each other’s activities. The relation between the converging structure and the reciprocal structure in Fig. 8 shows that almost all users with the reciprocal structure also have the converging structure. This implies that the fusion of converging and reciprocal structures is an embodiment of communication instances influencing network reticulation in online social networks.

Comparing the Venn diagrams across the rising, peak, and declining periods, the number of users in each structure category increases during the disruptive event. One reason for this result is that the total number of users in online social networks increases when the event occurs. The peak and declining periods each have more than twice as many users with structures as the rising period. The result implies that communication instances by users create links (new links, as shown in the previous section) to other users and enable a burst of structures in online social networks. Finally, the results show that the ratio of the number of users with reciprocal structures to the number of users with converging structures remained nearly constant (21.2%, 24.5%, and 22.9%) over the different periods, although the size of the network increased. This result implies the existence of a scale-invariance property for the structures that govern the growth of reticulation in online social networks due to user activities triggered by disaster-induced events.

Network performance

The outcome of activities and network reticulation determines network performance with respect to a particular goal. In the case of reservoir water release during Hurricane Harvey, the goal of the online social network was information seeking and sharing. Hence, we examine the efficiency of information propagation in the network based on the average latent distance in the network over time. As explained earlier, the attributed network embedding approach integrates semantic similarity and communication relations among users into a latent space. Thus, the efficiency of information propagation in the online social networks can be quantified by the average latent distances of users in the latent space.

After obtaining the low-dimensional matrix H, we can investigate the proximity of the active users on each day by directly calculating their Euclidean distances as follows:

$$ {d}_{ij}={\left\Vert {h}_i-{h}_j\right\Vert}_2 $$
(4)

where dij is the distance between user i and user j. The corresponding pairwise distance matrix D is an n × n matrix. Then, to measure the extent of agglomeration among the active users, we define the average latent distance of a user to other users as:

$$ {D}_i=\frac{\sum_{j=1}^n{d}_{ij}}{n} $$
(5)

where n is the number of active users on each day, and Di denotes the average latent distance from user i to the other active users.
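Equations (4) and (5) can be computed directly from the embedding matrix. The sketch below assumes H is available as an n × k NumPy array with one row per active user; the function name `average_latent_distances` and the toy values are illustrative only.

```python
import numpy as np


def average_latent_distances(H):
    """Return (D, avg) for an (n, k) embedding matrix H.

    D[i, j] = ||h_i - h_j||_2           (Eq. 4)
    avg[i]  = sum_j D[i, j] / n         (Eq. 5)
    """
    H = np.asarray(H, dtype=float)
    # Broadcasting builds all pairwise differences: shape (n, n, k).
    diff = H[:, None, :] - H[None, :, :]
    D = np.linalg.norm(diff, axis=-1)
    avg = D.sum(axis=1) / H.shape[0]
    return D, avg


# Toy embedding of three users in a 2-dimensional latent space.
H = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
D, avg = average_latent_distances(H)
print(D[0, 1], D[0, 2], D[1, 2])  # 5.0 4.0 3.0
print(avg[0])  # 3.0
```

The broadcasted difference array needs memory proportional to n² × k, so for large daily user sets a chunked computation or a routine such as `scipy.spatial.distance.pdist` may be preferable.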

Based on this approach, we obtained the average latent distance (Di) of each user and plotted the distribution of Di for different days. As shown in Fig. 9, before the event happened, the average latent distance of users in terms of their conversations and activities on social media was extremely high compared to the average latent distances of users during the disruption. As the event occurred and impacted communities, however, the mean value of the average latent distance and its variance decreased significantly. The average latent distance values remained low until the event ended (water release from the reservoirs ended and the resulting floodwater receded). The latent distance embeds both content similarity and sharing behaviors among the users who engaged in communicating about this disruptive event. Hence, the latent distance in this study can represent the extent of mutual endorsement among users. As the latent distance among users decreases, mutual endorsement is enhanced. As such, crisis information spreads to a broader audience more efficiently.

Fig. 9

The latent distances among active users each day. This box plot shows the quartiles of the distribution of the latent distances. The average latent distance is the middle line in each box, and the upper and lower boundaries of the box show the third and first quartiles of the latent distance on each day. The unit of the latent distance is 1, so the y-axis is the absolute value of the latent distances among users.

Concluding remarks

The proposed online network reticulation framework enables understanding the relationships among disaster-induced disruptive events, user activities, network reticulation, and network performance in dynamic online social networks in disasters. The application of the proposed framework was demonstrated in the context of disruption in flood control systems in Houston during Hurricane Harvey in 2017 using an analysis of Twitter data.

One of the key findings is that the main underlying mechanism of network growth during the disruptive event was the creation of new links by regular users. This phenomenon signifies that, to gather situational information, users start to connect with users with whom they had no prior connection and significantly expand their communication relationships in a short period of time. However, information hubs (i.e., influential users) effected reticulation in the online social network by strengthening existing links, meaning that a greater number of their posts were retweeted by the users who frequent their accounts. Second, the analysis indicates the existence of homophily in online social networks, where regular users connect to other regular users who seek similar situational information in disasters (Woodruff 2018). As suggested in existing studies, homophily is an important phenomenon that strengthens the closeness of online users and enables users to extend their social network by strategically targeting followees (Sun and Rui 2017). Capitalizing on this, users in disaster-hit areas can build more cohesive networks on social media through interventions such as recommending accounts of people who are in a similar situation or communicating on similar topics.

Another key finding is that the main structure for communication instances due to user activities was the converging structure, indicating communication instances driven by information-seeking behaviors (retweeting or quoting) in the wake of a disruptive event. In other words, the proportion of reply/posting activities dissipates over time after the triggering event. This is because posting activity is triggered by the generation of new situational information as disruptive events unfold in disasters. A reduction in the number and consequences of disruptive events leads to a reduction in information generation on social media. The sharing and quoting activities account for a larger proportion of user activities than posting behaviors on social media in the aftermath of the disruptions.

Finally, with the growth of the network, the proportion of converging structure to self-loop and reciprocal structures did not change significantly across the entire period of the disruption. This finding indicates the existence of a scale-invariance property for structures governing the growth of the reticulation of online social networks due to user activities triggered by disaster-induced events. The scale-invariance property reveals the fundamental relationship between the size of the network and the proportions of the basic network structures. This property can inform analyses related to the reconstruction of the online social networks for simulating user activities and testing intervention strategies to improve information propagation.

Based on the findings, we can further interpret the interactions between triggering events and dynamic human behaviors in an integrative manner using the proposed online network reticulation framework. As expected, a built environment disruption event triggers an increased frequency of human activities on social media, including posting, replying, and sharing activities. From the perspective of structural network properties, a disruptive event induces an external perturbation that influences the structural properties of online social networks, such as node centralities and betweenness. For example, in the case study, creating new links among regular users and enhancing existing links with information hubs through quoting and sharing behaviors improved the cohesiveness of the online social network. Furthermore, the increased closeness among online users enhances the efficiency of information propagation through user connections in social networks. In contrast, if new disruptive events unfold successively and posting behavior accounts for the largest proportion of user activities, the distances among users increase and impair information propagation. This is because posting new situational information increases the semantic variance in users’ profiles on social media.

The primary contribution of the study presented in this paper is the integrative framework to characterize the dynamics of collective sense-making in online social networks in response to crises. The network reticulation modalities enable characterizing the relationships between human activities and physical disruptions in disasters. The capability of the proposed framework was tested using the results from the study of Hurricane Harvey. Hence, the outcomes of this study can be adopted in other disaster contexts for characterizing the network reticulation modalities. A critical question to answer in future studies is to what extent the patterns of user activities and reticulation mechanisms vary from one disaster context to another. By adopting the network reticulation framework, future studies can examine other disaster contexts and evaluate the universality of the patterns of user activities and reticulation mechanisms identified in Hurricane Harvey. Further adoption of the framework will enable examination of the influence of activity types, reticulation mechanisms, and communication instance structures on the performance of online social networks in the spread of information in disasters. This knowledge will also inform the design of mechanisms to improve user activities, and thus achieve better network performance. Apart from disaster studies, adoption of the online network reticulation framework can enrich studies of social media dynamics in other contexts, such as politics and marketing, to better understand the interplay among user activities, the influence of social, political, and technological events, and the performance of online social networks. This understanding will inform the planning of marketing and political campaigning strategies.

Availability of data and materials

The data that support the findings of this study are available from Twitter, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Twitter. The authors have obtained the permission from Twitter prior to using this data.

Abbreviations

NRT: Network reticulation theory

ONR: Online network reticulation

OSN: Online social network


Acknowledgements

This material is based in part upon work supported by the National Science Foundation (NSF) under Grant Numbers IIS-1759537 and CMMI-1846069 and the Amazon Web Services (AWS) Machine Learning Award. The authors also would like to acknowledge the funding support from the National Academies’ Gulf Research Program Early-Career Research Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation, the National Academies’ Gulf Research Program, or Amazon Web Services.

Funding

National Science Foundation (NSF) under Grant Numbers IIS-1759537 and CMMI-1846069, the Amazon Web Services (AWS) Machine Learning Award, and the National Academies’ Gulf Research Program Early-Career Research Fellowship.

Author information

Contributions

All authors designed the study; C.F. and J.S. performed the analysis; C.F. and A.M. wrote the paper; C.F., A.M., and X.H. revised the paper. The author(s) read and approved the final manuscript.

Corresponding authors

Correspondence to Chao Fan or Ali Mostafavi.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Fan, C., Shen, J., Mostafavi, A. et al. Characterizing reticulation in online social networks during disasters. Appl Netw Sci 5, 29 (2020). https://doi.org/10.1007/s41109-020-00271-5

  • DOI: https://doi.org/10.1007/s41109-020-00271-5

Keywords