Communication theories such as Corman’s NRT (Corman and Scott 1994), Giddens’ structuration theory (Giddens 1984), and Homans’ theory of the human group (Homans 1950) have examined some concepts related to communication networks, such as triggering events, activities, and communication instances (Fan et al. 2020a). However, these studies primarily focus on individual modalities of communication networks. Little is known about the complex relationships among disruptive events, user activities, network structures, and outcomes in online social networks. Network reticulation theory (NRT), proposed by Corman and Scott (1994), provides an integrative framework for examining the different modalities that affect the reticulation of communication networks, and a theoretical lens for explaining the relationships among external events, activities in social networks, communication instances, and network performance. In this study, we extend the standard NRT and employ it to characterize the dynamics of online social networks in disasters and crises.
The online network reticulation (ONR) framework (Fig. 1) characterizes the dynamics of online social networks with four modalities: enactment, activation, reticulation, and network performance.
Enactment is the promulgation of triggering events over social media platforms; in this study, these are built environment disruptions such as power outages, road closures, and flooding caused by a natural disaster. Triggering events often occur abruptly, prompting social media users to share information about the impacts of the events and their adjustment responses.
Activation modality comprises user activities: posting, sharing, and responding to information about disruption events. Specifically, activities such as posting, sharing, and responding to situational information stimulate the emergence of communication instances regarding event-related topics among groups of users.
Reticulation modality is characterized by communication instances among groups of online social network users. The structural properties of communication instances and reticulation mechanisms are important elements of the reticulation modality.
Network performance modality is the outcome of user activities and communication instances that influence information propagation and collective action in response to disruptive events. In the proposed framework, we examine network performance modality as a function of information diffusion efficiency based on the average latent distance between the users in the network. Information propagation efficiency is an indicator of social network performance.
The four modalities in the proposed framework interact with each other, and their interactions are represented by the bi-directional arrows in the schema. Specifically, the occurrence and evolution of triggering events in the enactment modality trigger human activities (e.g., communication and information sharing) in the activation modality. Human activities in response to the disruptions also affect the unfolding of the events (such as relief campaigns) in the enactment modality (Chen et al. 2019). Human activities form communication networks with dynamic reticulation structures in the reticulation modality. In turn, the reticulation structure of the communication networks determines the types and timing of human activities in the activation modality and further influences the efficiency of information sharing in the network performance modality.
In addition, each modality is characterized by a pair of elements in the network reticulation framework: triggers/perturbation, timing/types of activities, instances/reticulation, and information propagation/average distances. The pairs of elements signify the duality of the framework, which accommodates a distinction between an abstracted structural network and a concrete systemic phenomenon. Generally, the enactment modality refers to the events occurring in the physical environment (i.e., triggering events). The activation modality is defined as the human activities triggered by physical events. The reticulation modality represents the structure of the communication networks, which enables the connections among affected people. The network performance modality comprises the indicators of network performance, including user distance and information sharing. This framework explains the underlying mechanism of human communication networks on social media in response to physical disruptions. The four modalities and their relationships enable analysis of the dynamics of information sharing and collective sense-making in emergencies.
Using the proposed framework with a Twitter dataset from Hurricane Harvey, we first identified the triggering events in communities (such as disruptions in the built environment), and defined rules (i.e., geographic scales and event-related keywords) for filtering data for analysis. Once the data was prepared, we examined the hourly volume of relevant tweets before and during the disruption to evaluate the activation modality. Accordingly, we investigated the types of user activities (post, share, respond) and temporal tendency in the activities during three time periods (rising, peak, and declining). We then characterized network reticulation to expose the underlying mechanisms (i.e., new link creation versus existing link reinforcement) of user communications. Link, in this case, refers to the communication relations created by sharing tweets from other users. We defined three structures (converging, reciprocal, and self-loop) to characterize communication instances. Finally, we implemented an attributed network embedding approach to represent the online social network and quantify its efficiency for information propagation (as an outcome of user activities and communication instances), as described below.
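The filtering, activation, and reticulation steps described above can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the keyword list, the tuple layout of a tweet, and the function names `is_relevant`, `hourly_volume`, and `classify_links`. The sketch also assumes that links are directed (retweeter, original author) pairs processed in chronological order, which is one possible reading of new link creation versus existing link reinforcement rather than the study's exact procedure.

```python
from collections import Counter
from datetime import datetime

# Hypothetical keyword list; the study's actual filtering rules are not reproduced here.
KEYWORDS = ("power outage", "road closure", "flood")


def is_relevant(text, keywords=KEYWORDS):
    """Return True if the tweet text mentions any event-related keyword."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def hourly_volume(tweets):
    """Count relevant tweets per hour.

    Each tweet is assumed to be a tuple (iso_timestamp, user_id, text).
    """
    counts = Counter()
    for ts, _user, text in tweets:
        if is_relevant(text):
            hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
            counts[hour] += 1
    return counts


def classify_links(retweets):
    """Label each retweet as creating a new link or reinforcing an existing one.

    `retweets` is a chronologically ordered list of (retweeter, author) pairs;
    links are treated as directed, which is an assumption of this sketch.
    """
    seen = set()
    labels = []
    for pair in retweets:
        labels.append("reinforced" if pair in seen else "new")
        seen.add(pair)
    return labels
```

Comparing the counts returned by `hourly_volume` before and during a disruption gives a rough view of the activation modality, while the share of "new" versus "reinforced" labels from `classify_links` gives a rough view of the reticulation mechanism.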
To demonstrate the application of the proposed ONR framework, we focus on modeling human activities on Twitter during disaster disruptions. An approach that jointly captures semantic similarity and relations, proposed by Aral et al. (2009), was used to characterize human communication behaviors on social media. Semantic similarity refers to the similarity of the content of the tweets, and relations refers to the sharing relations created by retweets. To represent the online social network, we employed an attributed network embedding approach that integrates the semantic attributes of online users and their networked relations into a latent space (Fan et al. 2020; Huang et al. 2017a). The first step is to construct a content vector for each active user based on the user’s daily posts. Specifically, we aggregated the tweets posted each day by a user into a single document, so that each user has its own tweet document. We cleaned the tweets using preprocessing steps including tokenization, word stemming, removal of uninformative characters (e.g., “!”, “@”, and URLs), and removal of stop-words and words shorter than three characters. Then, we adopted the term frequency-inverse document frequency (TF-IDF) approach to convert each tweet document to a vector in which each element corresponds to a token and holds that token’s TF-IDF weight: the number of times the token appears in the user’s tweet document, down-weighted by how common the token is across all users’ documents. By doing so for all users, the user content vectors were obtained (see Fig. 2). Once the vector for each active user was determined, we used the cosine similarity method to compute the semantic similarities among all users and construct a pair-wise semantic similarity matrix S:
$$ \mathit{\cos}\left(\theta \right)=\frac{A\bullet B}{\left\Vert A\right\Vert \left\Vert B\right\Vert }=\frac{\sum_{i=1}^k{a}_i{b}_i}{\sqrt{\sum_{i=1}^k{a}_i^2}\times \sqrt{\sum_{i=1}^k{b}_i^2}} $$
(1)
where A and B are the vector representations of two users’ tweet documents, \( a_i \) and \( b_i \) are the values of the ith elements of vectors A and B, and k is the length of the vectors.
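As an illustration of this first step, the following sketch builds TF-IDF vectors from already tokenized user documents and evaluates Eq. (1) for every pair of users. It is a simplified stand-in rather than the implementation used in the study: the weighting is assumed to be the plain form tf × log(N/df) with no smoothing or normalization, and the function names `tfidf_vectors`, `cosine`, and `similarity_matrix` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Convert tokenized documents (one per user) into TF-IDF vectors.

    Assumes the plain weighting tf * log(N / df); library implementations
    often add smoothing or normalization, so their values may differ.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of user documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[tok] * math.log(n_docs / df[tok]) for tok in vocab])
    return vocab, vectors


def cosine(a, b):
    """Cosine similarity as in Eq. (1); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise semantic similarity matrix S over all users."""
    return [[cosine(u, v) for v in vectors] for u in vectors]
```

Because TF-IDF weights are non-negative, every entry of the resulting matrix lies between 0 and 1, and two users whose documents share no tokens receive a similarity of 0.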
Second, to capture the retweeting behaviors among the users, we extracted the pairs of user ids from all retweets. In this step, we identified the users who retweeted a post and the users who posted the original tweets. Then, we built a user-by-user matrix R in which the rows and columns represent all users, and each element represents the number of retweets generated between a pair of users. We consider this retweet relationship as the communication relation among social media users, and define the strength of a relation as the corresponding value in the retweet matrix. By examining the retweet matrix, we found that the numbers of retweets vary greatly among pairs of users, although most relations are relatively weak. The matrix is sparse: most users did not retweet any other users’ tweets, but some users retweeted many posts from a specific user. Used directly, these extreme values would act as outliers in the embedding, distorting the distance between two users in the latent space and skewing it towards the large retweet numbers. To reduce the skewness, the hyperbolic tangent function (Xiao et al. 2005) is adopted to map the retweet numbers into the interval [0, 1), in which extreme values saturate near the upper bound. In this way, we scaled down the exceptionally strong relations and emphasized the differences among the weak relations. Using the outputs from the hyperbolic tangent function, we created the scaled retweet relation matrix \( \overset{\sim }{R} \):
$$ {\tilde{r}}_{ij}=\mathit{\tanh}\left({r}_{ij}\right)=\frac{\mathit{\sinh}\left({r}_{ij}\right)}{\mathit{\cosh}\left({r}_{ij}\right)}=\frac{e^{2\cdotp {r}_{ij}}-1}{e^{2\cdotp {r}_{ij}}+1} $$
(2)
where \( r_{ij} \) represents the communication frequency attribute, defined as the value at the ith row and jth column of the matrix R; \( \tilde{r}_{ij} \) represents the scaled retweeting attribute, defined as the value at the ith row and jth column of the matrix \( \overset{\sim }{R} \); and \( \overset{\sim }{R} \) is the adjusted matrix of communication relations.
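The construction of the retweet matrix and the scaling in Eq. (2) can be sketched as follows. The orientation of the matrix (rows as retweeting users, columns as original authors) is an assumption made for this illustration, since the text does not fix it, and the function names `retweet_matrix` and `scale_tanh` are hypothetical. A dense list-of-lists is used for readability; a sparse representation would be more appropriate for a matrix of realistic size.

```python
import math


def retweet_matrix(pairs, users):
    """Count retweets between users.

    `pairs` holds (retweeter, author) user ids extracted from retweets.
    Entry [i][j] is how often user i retweeted user j (assumed orientation).
    """
    index = {u: k for k, u in enumerate(users)}
    n = len(users)
    R = [[0] * n for _ in range(n)]
    for retweeter, author in pairs:
        R[index[retweeter]][index[author]] += 1
    return R


def scale_tanh(R):
    """Apply Eq. (2) element-wise to obtain the scaled relation matrix."""
    return [[math.tanh(r) for r in row] for row in R]
```

With this scaling a pair with no retweets stays at 0, a single retweet maps to roughly 0.76, and very large counts saturate close to 1, so a few extremely active pairs no longer dominate the relation strengths.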
After creating the user semantic similarity matrix S and the retweet relation matrix R, we adopted the attributed network embedding approach to integrate these two matrices and project them into a latent space with reduced dimensions. The output of this step is the hidden matrix H in the latent space (see Fig. 2). The loss function for the network embedding approach is shown in Eq. (3), which consists of two components: one modeling node proximity in semantic attributes and one modeling node proximity in networked relations (Huang et al. 2017b). The first term on the right-hand side makes the embedding representation approximate the semantic similarity as closely as possible; the second term ensures that the difference between the embedding representations of two users is small when they have a strong communication relation. As such, the embedding representation matrix H can be generated in a unified, robust, and informative space.
$$ J={\left\Vert S-H{H}^{\intercal}\right\Vert}_F^2+\lambda \sum \limits_{\left(i,j\right)\in \varepsilon }\tilde{r}_{ij}{\left\Vert {h}_i-{h}_j\right\Vert}_2 $$
(3)
where λ is the regularization parameter that controls the trade-off between how well the embedding matrix H represents semantic similarity and how well it represents communication relations; \( h_i \) is the vector at the ith row of matrix H, representing the embedding vector for the ith user; \( {\left\Vert \cdot \right\Vert}_2 \) denotes the \( \ell_2 \)-norm of a vector; and \( {\left\Vert \cdot \right\Vert}_F \) denotes the Frobenius norm of a matrix. Adopting the algorithm developed by Huang et al. (2017a) and setting the embedding dimension to 2, we obtain the representation matrix H for each day, in which the representation vector h for each user has length 2.
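To make the two terms of Eq. (3) concrete, the sketch below evaluates the loss J for a given candidate embedding H. It does not reproduce the optimization algorithm of Huang et al. (2017a); it only computes the objective. Two assumptions are made for illustration: the edge set ε is taken to be all ordered pairs with a nonzero scaled relation, and the average latent distance used as a performance indicator is taken to be the mean Euclidean distance over all unordered user pairs. The function names `embedding_loss` and `mean_pairwise_distance` are hypothetical.

```python
import math


def embedding_loss(S, R_tilde, H, lam):
    """Evaluate J in Eq. (3) for a candidate embedding H (one row per user)."""
    n = len(H)
    # First term: squared Frobenius norm of S - H H^T.
    fit = 0.0
    for i in range(n):
        for j in range(n):
            hh = sum(a * b for a, b in zip(H[i], H[j]))
            fit += (S[i][j] - hh) ** 2
    # Second term: relation-weighted l2 distances between embedding vectors.
    # Assumption: the edge set contains every pair with a nonzero scaled relation.
    smooth = 0.0
    for i in range(n):
        for j in range(n):
            if R_tilde[i][j] > 0:
                smooth += R_tilde[i][j] * math.dist(H[i], H[j])
    return fit + lam * smooth


def mean_pairwise_distance(H):
    """Mean Euclidean distance over all unordered user pairs in the latent space.

    Assumption: this is one plausible reading of the average latent distance.
    """
    n = len(H)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(math.dist(H[i], H[j]) for i, j in pairs) / len(pairs)
```

A larger λ gives more weight to keeping strongly related users close together, while a smaller λ favors reproducing the semantic similarity matrix. Under the assumption above, a smaller mean pairwise distance would indicate a more tightly connected latent space on a given day.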