To answer Q2 (is a person’s centrality uniform, i.e. more or less equally distributed across the social sites in use?) we treat the user degree on each social site as an index of centrality in the network. This information, reported on the Alternion profile, is not available for every site, since only a few APIs allow it to be retrieved. We therefore limit our degree analysis to Facebook, Twitter (in/out), LinkedIn and YouTube (in/out).
Answering Q2 amounts to verifying whether statistically significant correlations exist between the degrees of the same group of users on different sites. The presence of such correlations indicates whether users who are popular on one site maintain their centrality across media. If we denote by \({k_{u}^{s}}\) the degree of node u on site s, we can evaluate the degree correlation between pairs of social sites with different methods (Boccaletti et al. 2014). First we compute, for each pair of social sites, the joint distribution \(P(k^{s_1},k^{s_2})\) to characterize the relation between the degree sequences. For example, in Fig. 9, where we show the joint distribution for Facebook and LinkedIn, we observe a slightly positive correlation, especially in the bottom-left part of the distribution (dark blue regions). In particular, users with about 200 friends on Facebook and about 20 to 25 connections on LinkedIn are more common than other combinations.
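As an illustration, an empirical joint distribution of this kind could be estimated as in the following minimal sketch. The input layout (dictionaries mapping a user id to a degree), the function name `joint_distribution` and the linear binning with `bin_width` are assumptions made for the example; the binning actually used for Fig. 9 is not stated in the text.

```python
from collections import Counter


def joint_distribution(deg_s1, deg_s2, bin_width=1):
    """Estimate P(k^{s1}, k^{s2}) over users present on both sites.

    deg_s1, deg_s2: dicts mapping user id -> degree (hypothetical layout).
    bin_width: width of linear degree bins (an assumption, not from the paper).
    Returns a dict mapping (bin index on s1, bin index on s2) -> probability.
    """
    common = deg_s1.keys() & deg_s2.keys()
    if not common:
        return {}
    counts = Counter(
        (deg_s1[u] // bin_width, deg_s2[u] // bin_width) for u in common
    )
    total = sum(counts.values())
    return {cell: c / total for cell, c in counts.items()}


# Toy data: only users "a", "b" and "c" have joined both sites.
facebook = {"a": 210, "b": 190, "c": 800, "d": 40}
linkedin = {"a": 22, "b": 25, "c": 300, "e": 10}
p = joint_distribution(facebook, linkedin, bin_width=100)
```

Each key of `p` is a pair of bin indices, so with `bin_width=100` the cell `(2, 0)` collects users with 200 to 299 Facebook friends and fewer than 100 LinkedIn connections.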
From the joint distributions alone it is difficult to compare the relation of one social site with each of the others. To this aim, we compute the average degree of a node on a social site \(s_1\) conditioned on its degree on the site \(s_2\). The result for LinkedIn conditioned on Facebook is shown in Fig. 10. In this case we observe an initial increase of the average LinkedIn degree as a function of the Facebook degree, up to about 500 friends. Beyond that point the trend becomes unstable, partly because of the low number of data points. In Fig. 11 we report a more pronounced lack of correlation, involving the out-degrees in Twitter and YouTube. Here we cannot establish an increasing relation between the out-degrees, so people who follow many users on Twitter do not necessarily follow a comparable number of channels on YouTube. The specificity of the two platforms, i.e. news broadcasting and video sharing, could be the cause of the weak correlation.
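The conditional average described above can be written as \(\langle k^{s_1} \mid k^{s_2} \rangle = \sum_{k^{s_1}} k^{s_1} P(k^{s_1} \mid k^{s_2})\). A minimal sketch of its empirical estimate follows; the dictionary-based input and the function name `conditional_average_degree` are hypothetical, and no binning of the conditioning degree is applied, which may differ from the procedure behind Fig. 10.

```python
from collections import defaultdict


def conditional_average_degree(deg_s1, deg_s2):
    """Return {k2: mean degree on s1 of the users whose degree on s2 equals k2}.

    deg_s1, deg_s2: dicts mapping user id -> degree (hypothetical layout).
    Only users present on both sites contribute.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for u in deg_s1.keys() & deg_s2.keys():
        k2 = deg_s2[u]
        sums[k2] += deg_s1[u]
        counts[k2] += 1
    return {k2: sums[k2] / counts[k2] for k2 in sums}


# Toy data: average LinkedIn degree conditioned on the Facebook degree.
linkedin = {"a": 20, "b": 30, "c": 100}
facebook = {"a": 200, "b": 200, "c": 500}
avg = conditional_average_degree(linkedin, facebook)
```

With sparse data, as in the high-degree tail discussed above, each conditioning value is backed by very few users, which is one reason the curve becomes noisy there.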
Keeping Facebook as the conditioning site, we observe that the average degree increases on some media, such as LinkedIn and Twitter, while the in/out degrees in YouTube are uncorrelated with the Facebook degree, as shown in Fig. 12. We compare the degree with the in/out degree because we cannot extract mutual links (which would be more similar to a friendship link in Facebook or LinkedIn): for Twitter and YouTube we can only retrieve the counts of followers and followees.
Finally, to get an overall picture of the pairwise degree correlations, we apply a rank correlation analysis to the different pairs of degree sequences. Rank correlation analysis allows us to test whether the rankings induced by the different degrees are similar. As a rank correlation method, we compute Kendall’s rank correlation coefficient \(\tau_b\) on the rankings induced by the degrees. In Fig. 13 we visualize the rank correlation matrix, where each row (column) corresponds to a different social site. There is no strong positive correlation; rather, the scenario is multifaceted. For most pairs, there is only a limited positive correlation (0.1 to 0.23) between degree centralities. This means that users may have a very different centrality across the services, i.e. a single user might be a hub on one system and lose part of that hub role on another. One reason may lie in the different goals of the services: whereas LinkedIn is business-oriented and Twitter is an interest network, Facebook incorporates all of these features. So, for instance, LinkedIn may capture only a part of a user’s Facebook friends.
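For reference, \(\tau_b\) corrects Kendall’s coefficient for ties, which are frequent in degree sequences. The sketch below is a plain \(O(n^2)\) implementation of the standard definition, followed by a helper that builds a pairwise matrix; the names `kendall_tau_b` and `rank_correlation_matrix` and the nested-dictionary input are assumptions for illustration, not the tooling used to produce Fig. 13.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences (ties allowed)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied in both variables: pair is ignored
        elif dx == 0:
            ties_x += 1           # tied only in x
        elif dy == 0:
            ties_y += 1           # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


def rank_correlation_matrix(degrees):
    """degrees: {site: {user: degree}} (hypothetical layout).

    Each entry is computed on the users that joined both sites of the pair.
    """
    matrix = {}
    for s1 in degrees:
        for s2 in degrees:
            common = sorted(degrees[s1].keys() & degrees[s2].keys())
            matrix[(s1, s2)] = kendall_tau_b(
                [degrees[s1][u] for u in common],
                [degrees[s2][u] for u in common],
            )
    return matrix


degrees = {
    "Facebook": {"a": 1, "b": 2, "c": 3},
    "LinkedIn": {"a": 3, "b": 2, "c": 1, "d": 9},
}
m = rank_correlation_matrix(degrees)
```

For datasets with thousands of users per pair, a library routine with an \(O(n \log n)\) algorithm would be preferable to this quadratic loop.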
The above conjecture cannot be verified, since we are not able to compute the intersection between a node’s neighborhoods in different social networks (this information is not provided by Alternion). However, we can quantify and analyze the difference between the neighborhoods in terms of their size. To this aim, given a user u and two sites \(s_1\) and \(s_2\), we define the friend deviation \(\Delta k^{s_1 s_2}_{u}\) as:
$$ \Delta k^{s_1 s_2}_{u}=k^{s_1}_{u}- k^{s_2}_{u} $$
(1)
We compute the friend deviation between Facebook and LinkedIn (\(\Delta k^{FL}_{u}\)), Facebook and Twitter (\(\Delta k^{FT}_{u}\)), Facebook and YouTube (\( \Delta k^{FY}_{u}\)), Twitter and LinkedIn (\(\Delta k^{TL}_{u}\)) and Twitter and YouTube (\(\Delta k^{TY}_{u}\)) for the users who have joined both sites of each pair. We report the friend deviations in Fig. 14, sorted in decreasing order. We observe that \(\Delta k^{FT}_{u}\), \(\Delta k^{FL}_{u}\), \(\Delta k^{FY}_{u}\), \(\Delta k^{TL}_{u}\) and \(\Delta k^{TY}_{u}\) are positive for 5360 users (out of 8527), 3313 users (out of 4361), 2944 users (out of 3177), 2518 users (out of 4148) and 2760 users (out of 3098), respectively. These results indicate that users in our dataset prefer to create friendships on Facebook rather than on LinkedIn, Twitter and YouTube. Moreover, users prefer Twitter to LinkedIn and YouTube for establishing friendships. One remarkable result is that the users who favor Twitter over Facebook have significantly more connections on Twitter than on Facebook.
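The friend deviation and the counts of positive values reported above can be reproduced with a few lines. The sketch below assumes, hypothetically, that the degrees of each site are stored as a dictionary from user id to degree; the function names are illustrative only.

```python
def friend_deviation(deg_s1, deg_s2):
    """Friend deviation k_u^{s1} - k_u^{s2} for users who joined both sites."""
    return {u: deg_s1[u] - deg_s2[u] for u in deg_s1.keys() & deg_s2.keys()}


def count_positive(deviation):
    """Number of users with more connections on s1 than on s2."""
    return sum(1 for d in deviation.values() if d > 0)


# Toy data: user "d" is only on Facebook and is therefore excluded.
facebook = {"a": 300, "b": 120, "c": 80, "d": 50}
twitter = {"a": 100, "b": 900, "c": 80}
delta_ft = friend_deviation(facebook, twitter)
# Values sorted in decreasing order, as they would be plotted.
trend = sorted(delta_ft.values(), reverse=True)
```

Users with a deviation of exactly zero (user "c" here) count neither as preferring the first site nor the second.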
In general, maintaining one’s importance across social media is not straightforward, and the phenomenon calls for a deeper understanding; for example, it is not clear how a user’s neighborhoods in different media overlap.
Finally, we apply the above methodology i) to investigate how often users publish content on different social sites, and ii) to assess whether a correlation exists between the number of posts and the number of friends. For point i), we verify whether users who post a lot and often on one social platform are equally active on other platforms (see Q3). We measure the activity level of a user on a social medium by means of the posting rate, measured in number of posts per week. By considering the posting rate rather than the post count, we mitigate the effect of users adopting social media at different times. We report the results of the rank correlation analysis applied to pairs of posting rate sequences. In Fig. 15 we visualize Kendall’s coefficient \(\tau_b\) for the most used pairs of sites. Unlike in the above discussion on centrality, there is a more evident positive correlation between the posting activity across social sites. The obtained values do not mean that users are equally active on both social media; however, there is a positive tendency to be active on different social sites. In general, as in the degree analysis, maintaining the same posting activity across social media is not straightforward.
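To make the role of the posting rate concrete, the following sketch normalizes a post count by the length of the observation window. The choice of window (from the date the user joined the site to the date the data were collected) and the floor of one week are assumptions made for the example; the text does not specify how the window is defined.

```python
from datetime import date


def posting_rate(num_posts, joined, collected):
    """Posts per week between `joined` and `collected` (both datetime.date).

    Assumption: windows shorter than one week are treated as one week,
    to avoid inflated rates for very recent accounts.
    """
    weeks = max((collected - joined).days / 7.0, 1.0)
    return num_posts / weeks


# Same post count, different adoption dates: the rates differ.
early_adopter = posting_rate(52, date(2013, 1, 1), date(2013, 12, 31))
late_adopter = posting_rate(52, date(2013, 7, 2), date(2013, 12, 31))
```

Two users with the same post count are thus ranked differently when one of them joined the site later, which is the effect the normalization is meant to capture.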
For point ii), we ask whether users with many friends in an OSN are more active and productive than people with fewer friends. To this aim, we consider only users whose degree and posting rate are both available and combine these two pieces of information. The analysis of Kendall’s coefficient \(\tau_b\) highlights a medium positive correlation between the two variables for all the online social networks (\(\tau_b \in [0.27, 0.4]\)) except YouTube (\(\tau_b = 0.1\)). High-degree users tend to post and publish more than those with few friends.
A case study on 4 social media
Up to now the analysis has focused on pairs of social media. In this section we present a case study that involves 524 users who have joined Google+, Pinterest, LinkedIn and Twitter. We select this subset of social media because it has the highest number of users among all subsets of four sites and because its users are also very active. For instance, Twitter users published 1025 items on average, followed by LinkedIn (139.32), Google+ (137.55) and Pinterest (134.71). Moreover, less than 40 % of the users in Google+, Pinterest and LinkedIn have more than 100 posts, while in Twitter almost 87 % of the users produced more than 100 posts.
In light of the moderate correlation among the posting rates, we ask whether, in this group, people who actively post on one service necessarily produce many posts on the other services. To this aim, we define the top 5 % of users on each medium as its most active users. Only two users are among the most active in all the services. Of the most active users in Twitter, 36 %, 30 % and 42 % are also among the most active in Google+, Pinterest and LinkedIn, respectively. Of the most active users in Google+, 23 % and 14 % are also among the most active in Pinterest and LinkedIn, respectively. These observations support the presence of users who are very active on pairs of services; however, as the number of actively used sites increases, the number of posts, and consequently the productivity, decreases, as observed in the previous section.
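The overlap between the most active users of two services can be computed as in the sketch below. The function names, the dictionary input (user id to post count), the rounding of the group size with `ceil` and the arbitrary breaking of ties at the cut-off are all assumptions; the text only states that the top 5 % of users on each medium are selected.

```python
import math


def most_active(posts, fraction=0.05):
    """Set of users in the top `fraction` by post count.

    posts: dict mapping user id -> number of posts (hypothetical layout).
    The group size is rounded up and ties at the cut-off are broken
    by sort order; both choices are assumptions.
    """
    k = max(1, math.ceil(len(posts) * fraction))
    ranked = sorted(posts, key=posts.get, reverse=True)
    return set(ranked[:k])


def overlap_share(posts_s1, posts_s2, fraction=0.05):
    """Share of the most active users on s1 who are also most active on s2."""
    top1 = most_active(posts_s1, fraction)
    top2 = most_active(posts_s2, fraction)
    return len(top1 & top2) / len(top1)


# Toy data with a 50 % threshold so that each group holds two users.
twitter = {"a": 900, "b": 50, "c": 700, "d": 10}
gplus = {"a": 300, "b": 5, "c": 2, "d": 400}
share = overlap_share(twitter, gplus, fraction=0.5)
```

Users who are most active on all services at once can be obtained by intersecting the sets returned by `most_active` for every site.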
We quantify the diversity of the posting activity across social media by defining the posting deviation \(\Delta p^{s_1 s_2}_{u}\) of a user u as:
$$ \Delta p^{s_1 s_2}_{u}=\frac{|np^{s_1}_{u}-np^{s_2}_{u}|}{\max(np^{s_1}_{u},np^{s_2}_{u})} $$
(2)
where \(np^{s_1}_{u}\) (\(np^{s_2}_{u}\)) denotes the number of posts of user u on the social medium \(s_1\) (\(s_2\)). \(\Delta p^{s_1 s_2}_{u}\) ranges from 0 to 1: as \(\Delta p\) decreases towards 0, u tends to publish the same number of posts on both \(s_1\) and \(s_2\). As shown in Fig. 16, where we report the posting deviation \(\Delta p\) between Google+ and the other social networks, this quantity decreases linearly. The figure shows that users’ behavior is more variable between Google+ and Twitter than between Google+ and the other social networks. Only 9 % of users show a posting deviation lower than 0.5. Most of the users prefer to post on a single service, so users who publish intensively on one service do not publish at the same rate on the others.
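A direct transcription of the posting deviation is given below as a minimal sketch. The handling of users with no posts on either service (deviation set to 0) is an assumption, since the ratio is undefined in that case.

```python
def posting_deviation(np_s1, np_s2):
    """Posting deviation |np_s1 - np_s2| / max(np_s1, np_s2), in [0, 1].

    np_s1, np_s2: non-negative post counts of one user on the two services.
    Assumption: a user with no posts on both services gets deviation 0.
    """
    largest = max(np_s1, np_s2)
    if largest == 0:
        return 0.0
    return abs(np_s1 - np_s2) / largest


balanced = posting_deviation(100, 100)    # same output on both services
one_sided = posting_deviation(100, 0)     # posts on a single service only
halved = posting_deviation(50, 100)       # half as many posts on one service
```

A deviation below 0.5 therefore means that the less used service receives more than half as many posts as the more used one.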