Multichannel social signatures and persistent features of ego networks

The structure of egocentric networks reflects the way people balance their need for strong, emotionally intense relationships and a diversity of weaker ties. Egocentric network structure can be quantified with 'social signatures', which describe how people distribute their communication effort across the members (alters) of their personal networks. Social signatures based on call data have indicated that people mostly communicate with a few close alters; they also have persistent, distinct signatures. To examine whether these results hold for other channels of communication, we compare social signatures built from call and text message data, and develop a way of constructing mixed social signatures using both channels. We observe that all types of signatures display persistent individual differences that remain stable despite the turnover in individual alters. We also show that call, text, and mixed signatures resemble one another both at the population level and at the level of individuals. The consistency of social signatures across individuals for different channels of communication is surprising because the choice of channel appears to be alter-specific with no clear overall pattern, and ego networks constructed from calls and texts overlap only partially in terms of alters. These results demonstrate that individuals vary in how they allocate their communication effort across their personal networks, and that this variation is persistent over time and across different channels of communication.


I. INTRODUCTION
Social relationships that are strong and supportive are fundamentally important for health and well-being, in both humans and other primates [1-5]. While close, emotionally intense relationships provide support and cohesion, weaker ties have been associated with the benefits of diversity and access to resources outside one's everyday social circles [6,7]. At the same time, maintaining social ties comes at a cost: time and cognitive resources are finite [8,9]. This cost is particularly high for close relationships [10]. As a result, personal networks typically have only a few close ties and many weak ties. This is visible both at the level of entire social networks [11] and in how people structure their personal networks [12].
The way people balance their need for strong, cohesive ties and weak ties that lead outside their closest network is reflected in so-called social signatures. The social signature of an ego measures the fraction of communication targeted at alters of each rank, when the alters are ranked according to this fraction. Social signatures therefore quantify rank-frequency relationships of alters in egocentric networks. In Ref. [12], it was shown that people distribute their mobile telephone calls very unevenly across their ego networks, so that a few closest alters get a disproportionate fraction of calls. This is reflected in social signatures that typically decay slower than exponentially. It was also shown in Ref. [12] that each individual has their own distinctive social signature that persists in time, even when there is a large amount of turnover in the ego network. Similar observations were made in Ref. [13] with a different dataset on mobile telephone calls.
However, social relationships are shaped and maintained through a diversity of communication channels [14-19]. People do not use these channels uniformly; rather, the choice of channel depends on many factors. These include the type of relationship (nature of social tie), general channel preferences, the time of the event (social norms), and the reason for communicating; see, e.g., [15] on why texters text. To examine whether the properties of social signatures are generalizable and genuine features of egocentric networks, it is therefore important to look at data from multiple channels of communication, both separately and together. Combining information on different channels can, however, be problematic because of their intrinsic differences. For example, the number of calls or their total duration is typically used as a proxy for tie strength in mobile telephone call data [20]. But text messages, another common form of communication via mobile devices, have no duration, and the number of text messages between an ego-alter pair is not directly comparable to the number of calls between that pair. While one call can be thought to represent one conversation, one text message is typically only a part of a longer conversation.
In this paper, we study social signatures that are based on calls, texts, and both. To this end, we develop a way of constructing weighted ego networks from time-stamped communication data that makes different channels more comparable (see Fig. 1), and also allows for the construction of mixed social signatures based on both call and text message data. We apply this method to two data sets on mobile telephone communication, and observe that both single-channel and mixed signatures are persistent over time, as observed earlier for calls-only signatures. We also observe that the call and text signatures are surprisingly similar for each ego. This is unexpected, because at the same time, the call and text networks of most egos overlap only partially, and there are no clear patterns of channel preference: the choice of channel appears independent of alter rank in mixed signatures.

A. Datasets
We use two data sets of mobile telephone calls and text messages (see Table I). Data set DS1 comprises the Call Detail Records (CDRs) for calls and text messages of the anonymized customers of a mobile operator in a European country, collected over 7 months (see, e.g., [21,22]). We applied an activity threshold and retained only users with more than 20 calls and more than 7 text messages per month, which left 506,330 users. Data set DS2 contains the times and recipients of outgoing calls and text messages for 24 students in the UK [12,23]. The data collection period is 18 months, during which the students graduated from high school and moved on to University or work.
As our aim was to construct social signatures and study their persistence in time, we divided both data sets into two equal-sized consecutive time intervals (3.5 months each for DS1 and 9 months each for DS2), so that the shapes of the signatures in the first and second halves could be compared. The choice of two intervals was merely for convenience; note that in [12], DS2 was analyzed using three intervals, yielding similar results for calls.

B. Constructing egocentric networks and social signatures
Social signatures are calculated from weighted egocentric networks, where the link weights represent the amount of communication between the focal ego and the ego's alters. Social signatures measure the fraction of communication to alters of each rank, when the alters are ranked according to this fraction. In Ref. [12], the number of outgoing calls that took place during the data collection period was used as the weight. However, when there are multiple channels, the question of how to define weights is not straightforward. The simplest solution would be to use the number of communication events as the weight for all channels. However, this is problematic. In our case of calls and texts, as discussed above, the numbers of calls and texts cannot be directly compared. One call can be associated with one conversation, while one conversation by texting may amount to a large number of individual text messages.
Here, our aim is to make the channels more comparable by focusing on their timelines and coarse-graining events in time. We do this as follows: we take the timeline of each ego-alter link, and divide it into time bins of one hour. Note that one hour has been chosen for convenience and to be clearly longer than the time scale of tens of seconds to minutes associated with correlated calls or texts [21,24]. Then, for both calls and texts, we count the number of bins that contain at least one communication event (see Fig. 1A). These counts are then used as link weights for the egocentric networks: e.g., a weight of w = 5 indicates that there were five one-hour bins with call activity with the alter. Calls that begin in one time bin and extend over several time bins contribute one unit of weight for each bin they cover. Defining link weights on the basis of time bins also makes it possible to construct mixed link weights, as one can count the number of time bins in which at least one call or one text message took place. An advantage of this method is that it can be used to calculate link weights that quantify the amount of communication or social interaction in any channel, as long as the time stamps of interaction events are available for each ego-alter link.
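As an illustration, the time-binned weights described above could be computed along the following lines. This is a minimal sketch, not the implementation used in the study: the input layout (a dictionary from alter to a list of `(start, duration)` pairs in integer seconds), the function names, and the alignment of bins to time zero are all assumptions made for the example. A text message is represented as an event of duration 0.

```python
BIN_SECONDS = 3600  # one-hour bins, as in the text


def occupied_bins(events, bin_seconds=BIN_SECONDS):
    """Return the set of bin indices touched by (start, duration) events.

    Times are integer seconds (assumed); bins are aligned to t = 0 (assumed).
    A call spanning several bins occupies every bin it covers.
    """
    bins = set()
    for start, duration in events:
        first = start // bin_seconds
        # Last second actually covered by the event (duration 0 -> start itself).
        last = (start + max(duration - 1, 0)) // bin_seconds
        bins.update(range(first, last + 1))
    return bins


def link_weights(calls, texts):
    """Compute call, text, and mixed weights for every alter of one ego.

    calls, texts: dict mapping alter -> list of (start, duration) events.
    The mixed weight counts bins with a call, a text, or both.
    """
    weights = {}
    for alter in set(calls) | set(texts):
        call_bins = occupied_bins(calls.get(alter, []))
        text_bins = occupied_bins(texts.get(alter, []))
        weights[alter] = {
            "call": len(call_bins),
            "text": len(text_bins),
            "mixed": len(call_bins | text_bins),
        }
    return weights
```

Because the mixed weight is the size of the union of occupied bins, an hour containing both a call and a text counts once, not twice.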
With the time-bin-based weights, social signatures are calculated as in [12]: for each egocentric network, alters are ranked according to their link weight, and the fraction of link weight out of the sum of all link weights is computed as a function of alter rank. The social signature of ego $i$ then reads

$$\sigma_i = \left\{ \frac{w_{i1}}{\sum_{j=1}^{k_i} w_{ij}},\; \frac{w_{i2}}{\sum_{j=1}^{k_i} w_{ij}},\; \ldots,\; \frac{w_{ik_i}}{\sum_{j=1}^{k_i} w_{ij}} \right\}, \tag{1}$$

where $w_{ij}$ is the weight of the link between ego $i$ and alter $j$, the alters $j$ are sorted by weight in decreasing order, and $k_i$ is the degree (number of alters) of $i$.
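The ranking and normalization step can be sketched in a few lines. The function name and the dictionary input are hypothetical choices for illustration; alters with zero weight are dropped here, which is an assumption of this sketch.

```python
def social_signature(weights):
    """Return the social signature for one ego.

    weights: dict mapping alter -> non-negative link weight.
    The result lists the fraction of total weight at each rank,
    with rank 1 (the largest weight) first.
    """
    ranked = sorted((w for w in weights.values() if w > 0), reverse=True)
    total = sum(ranked)
    if total == 0:
        return []  # an ego with no communication has an empty signature
    return [w / total for w in ranked]
```

For example, link weights of 5, 3, and 2 give the fractions 0.5, 0.3, and 0.2 at ranks 1 to 3.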

C. Comparing Social Signatures
In order to determine the persistence of social signatures in time, we need a way of comparing their shapes and measuring the similarity or difference between two given signatures. In Ref. [12], the Jensen-Shannon divergence (JSD) [25] was used for comparing pairs of social signatures, and we also use the JSD in our analysis. The JSD is defined as

$$\mathrm{JSD}(\sigma_1, \sigma_2) = H\!\left(\tfrac{1}{2}\sigma_1 + \tfrac{1}{2}\sigma_2\right) - \tfrac{1}{2}\left[ H(\sigma_1) + H(\sigma_2) \right], \tag{2}$$

where $\sigma_1$ and $\sigma_2$ are two social signatures, as defined in Eq. 1, and $H(\sigma)$ is the Shannon entropy of $\sigma$.
The Jensen-Shannon divergence is a symmetrized and smoothed version of the Kullback-Leibler divergence. The square root of the Jensen-Shannon divergence can be used as a distance function. Because the JSD can deal with zero probabilities, it allows us to compare social signatures of different lengths, that is, signatures computed from egocentric networks with different numbers of alters. To compare two signatures of lengths $k_1$ and $k_2$ where $k_2 > k_1$, we append zero entries ($w_{ij} = 0$) to the shorter signature for $k_1 < j \le k_2$ so that both signatures are of equal length.
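A short sketch of the JSD with zero-padding follows. The logarithm base is not stated in the text; base 2 is assumed here, which bounds the JSD between 0 and 1. The function names are illustrative.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits (base 2 is an assumption); zero entries contribute 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(sig1, sig2):
    """Jensen-Shannon divergence between two signatures of possibly different length.

    The shorter signature is padded with zeros so both have equal length.
    """
    n = max(len(sig1), len(sig2))
    p = list(sig1) + [0.0] * (n - len(sig1))
    q = list(sig2) + [0.0] * (n - len(sig2))
    mixture = [0.5 * (a + b) for a, b in zip(p, q)]
    return shannon_entropy(mixture) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))
```

Padding does not change the entropy of the shorter signature, since zero entries contribute nothing to the sum; it only affects the mixture term.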
The overlap of two sets of alters in a pair of egocentric networks can be measured by the Jaccard coefficient

$$J(\sigma_1, \sigma_2) = \frac{|\sigma_1 \cap \sigma_2|}{|\sigma_1 \cup \sigma_2|}, \tag{3}$$

where $\sigma_1$ and $\sigma_2$ are the social signatures corresponding to the networks, and the intersection and union are taken over their sets of alters. As an example, the Jaccard coefficient between the call signature and text signature of an ego is defined as the number of alters the ego has contacted by both call and text in the period, divided by the number of alters the ego has contacted by call or text. If there is complete overlap between the alters contacted by call and text, then J = 1. If there is no overlap between alters contacted by call and text, then J = 0.
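In code, the Jaccard coefficient of two alter sets reduces to a set intersection and a set union. The sketch below uses a hypothetical function name and returns 0 for two empty sets, a convention assumed here that the text does not address.

```python
def jaccard(alters_a, alters_b):
    """Jaccard coefficient between two collections of alters."""
    a, b = set(alters_a), set(alters_b)
    union = a | b
    if not union:
        return 0.0  # convention assumed for two empty networks
    return len(a & b) / len(union)
```

For instance, if an ego calls alters a, b, and c and texts alters b, c, and d, two of the four distinct alters appear in both networks, so J = 0.5.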

A. Single-Channel and Mixed Signatures Are Persistent
We begin our analysis by demonstrating that all three types of signatures (call, text, and mixed) are persistent at the level of individuals, as was shown for call-based signatures for DS2 in Ref. [12]. Here, we define persistence as the social signature retaining its shape over time, with individual-level variation in JSD that is smaller than the average JSD between signatures in the whole population.
To examine this persistence, we divide the data collection period of each data set into two intervals, of 3.5 months each for DS1 and 9 months each for DS2. We then calculate the weighted egocentric networks for each ego in each interval. As explained in detail above, we use the number of one-hour time bins with calls, texts, or either for determining the link weights between the ego and alters. We compute the social signatures for each egocentric network and each interval by ranking alters according to their weight and calculating the fraction of weight at each rank. Following Ref. [12], we then calculate self-distances by computing the JSD between an ego's own signatures in consecutive intervals. We also calculate reference distances by computing the average JSD between the signature of the ego and those of all other egos. We repeat these calculations for both channels (calls and texts) as well as for mixed networks (calls and texts combined).
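The self-distance and reference-distance calculations can be sketched as follows. The JSD is redefined here (with base-2 logarithms, an assumption) so that the example is self-contained. The function names are hypothetical, and which interval's signatures enter the reference distance is left to the caller, since the text does not specify it.

```python
import math


def _entropy(p):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(sig1, sig2):
    """Jensen-Shannon divergence with zero-padding of the shorter signature."""
    n = max(len(sig1), len(sig2))
    p = list(sig1) + [0.0] * (n - len(sig1))
    q = list(sig2) + [0.0] * (n - len(sig2))
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return _entropy(m) - 0.5 * (_entropy(p) + _entropy(q))


def self_distance(sig_first, sig_second):
    """JSD between one ego's signatures in two consecutive intervals."""
    return jsd(sig_first, sig_second)


def reference_distance(ego, signatures):
    """Average JSD between the ego's signature and those of all other egos.

    signatures: dict mapping ego id -> signature (list of fractions).
    """
    others = [s for e, s in signatures.items() if e != ego]
    if not others:
        raise ValueError("need at least one other ego for a reference distance")
    return sum(jsd(signatures[ego], s) for s in others) / len(others)
```

Persistence in the sense used above then corresponds to self-distances that are typically smaller than the reference distances.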
The distributions of self and reference distances of call, text, and mixed signatures are displayed in Fig. 2. For all three types of signatures and for both data sets, the bulk of the distribution of self-distances clearly lies below the reference distances. Self-distances are on average smaller than the distance between the signatures of two random egos, and there is less spread in their distribution. These differences in the distributions of Fig. 2 indicate that the changes of an ego's signature in time are smaller than the variation of signature distances in the population, whether calculated from calls or texts or both. This means that the individual differences in signature shapes are a real feature of the egocentric networks instead of random variation resulting in noisy, unstable signatures. The persistence of social signatures is therefore not only a feature of egocentric networks built from phone call data, but a more general phenomenon.

B. Single-Channel and Mixed Signatures Have Similar Shapes, Even at the Ego Level
We have now established that the three types of signatures are persistent characteristics of egocentric networks. Next, we compare the shapes of these signatures, first at the population level and then at the level of individuals. It was shown in Ref. [12] that call signatures in DS2 are rather skewed: a small number of top-ranking alters get a disproportionate share of communications. We find that all three types of signatures show this skewed shape at the level of individuals and at the population level. This is seen in Fig. 3, which shows the three types of signatures of one person (a) and the population-averaged signatures (b).
It also appears that the two types of single-channel signatures are more similar for each ego than they are between egos: the shapes of the call and text signatures of an ego look similar. This is confirmed by the JSD. We calculated the self-distances between an ego's call and text signatures as well as reference distances between all pairs of call and text signatures, aggregated over the entire period of observation. The resulting distributions for both data sets again indicate that self-distances are on average smaller than reference distances (Fig. 3). Even though the difference is slightly less pronounced than for distances of the same signature type between different intervals (Fig. 2), the shapes of call and text signatures of an ego appear to correlate.

C. Single-Channel Egocentric Networks Differ in Composition
The similarity in the shapes of the call and text signatures of each ego would be expected if their call and text networks were similar and included the same alters with similarly ranked weights. However, this is not the case: the call and text networks of an ego are typically different. Instead of the same alters appearing in both networks, many alters are only called or texted, and therefore included in one network only. This is in line with literature on network-level differences [16-18].
This can be seen for both datasets in the distributions of Jaccard indices in Fig. 4(a) and (b), computed for the sets of called and texted alters of each ego. The values of the Jaccard indices are mostly low. This means that while some alters are in both networks, most alters are not. Also, as seen in the lower panels (c) and (d) of Fig. 4, the ranks of those alters who are present in both call and text networks correlate only moderately: an alter who is among the most called alters may receive a far smaller share of text messages.
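A rank comparison of this kind could be sketched as below. The text does not state which correlation coefficient was used or how ties were handled; this sketch assumes a Pearson correlation of ranks, with ranks assigned within each full network and ties broken by alter name. All function names are hypothetical.

```python
import math


def ranks(weights):
    """Map alter -> rank (1 = largest weight); ties broken by alter name (assumed)."""
    ordered = sorted(weights, key=lambda a: (-weights[a], str(a)))
    return {alter: r for r, alter in enumerate(ordered, start=1)}


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def shared_rank_correlation(call_weights, text_weights):
    """Correlate call ranks and text ranks over alters present in both networks."""
    shared = sorted(set(call_weights) & set(text_weights), key=str)
    if len(shared) < 2:
        return float("nan")
    call_rank, text_rank = ranks(call_weights), ranks(text_weights)
    return pearson([call_rank[a] for a in shared], [text_rank[a] for a in shared])
```

Alters that appear in only one network still occupy a rank in that network but are excluded from the correlation itself.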

D. Channel Choice Does Not Depend on Alter Rank
Next, we investigated whether there are systematic differences in the call and text networks of egos; such differences might explain the signature shapes and their similarities, despite call and text networks being different. To this end, we take a look at mixed egocentric networks calculated using both calls and texts, and investigate their weight composition. We focus on the share of calls and texts for each rank; note that since we are counting time bins, there are bins with both channels present. One example mixed signature and its weight composition are shown in Fig. 5. It appears that there is no clear pattern, except perhaps a slightly increased focus on calls around ranks 11-16. A likely explanation is that the choice of communication channel depends on features of the relationship other than its emotional closeness, which correlates with rank. This is supported by Fig. 6, which shows the shares of text messages in all ego-alter relationships of DS2 (top) and a large sample of ego-alter relationships of DS1 (bottom). In DS2, the only systematic feature is that alters at top ranks typically receive both calls and text messages, and the fraction of text-only and call-only relationships increases towards the lower ranks. Top relationships appear more balanced regarding communication channels in DS1 too. Beyond that, there are no systematic changes that depend on alter rank.
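The weight composition of a mixed signature can be illustrated with a small sketch. It assumes that the occupied one-hour bins are already available as sets of bin indices per alter and channel; the function name and the output layout are hypothetical.

```python
def composition_by_rank(call_bins, text_bins):
    """Split each alter's mixed weight into call-only, text-only, and both.

    call_bins, text_bins: dict mapping alter -> set of occupied bin indices.
    Returns a list of rows ordered by mixed rank (rank 1 first);
    ties are broken by alter name (an assumption of this sketch).
    """
    rows = []
    for alter in set(call_bins) | set(text_bins):
        c = call_bins.get(alter, set())
        t = text_bins.get(alter, set())
        mixed = len(c | t)
        if mixed == 0:
            continue  # no communication with this alter
        rows.append({
            "alter": alter,
            "weight": mixed,
            "call_only": len(c - t) / mixed,
            "text_only": len(t - c) / mixed,
            "both": len(c & t) / mixed,
        })
    rows.sort(key=lambda r: (-r["weight"], str(r["alter"])))
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
    return rows
```

The three fractions sum to one for every alter, which corresponds to the stacked bars described for the example ego.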

IV. DISCUSSION
Social signatures quantify how people allocate their communication across the members of their personal networks. In this paper, we used two separate datasets to explore how social signatures based on calls, texts or both vary over time and across individuals. There were three key findings. First, individuals vary in how they allocate their time across their ego networks and this variation is persistent over time, despite a turnover of individual alters in the network. This finding was initially reported in Ref. [12] using a small sample of 24 students of a similar age going through the specific transition from school to University or work (DS2 in this paper); Ref. [13] confirmed the finding for a data set of N = 93. The current paper replicates this finding in a much larger, more demographically diverse sample of over 500,000 people. Second, this individual variation in social signatures was present across different channels of communication: phone calls, text messages and a combined network based on both calls and texts. This is despite the fact that there was little overlap in the individuals who were called and texted. Third, regardless of the channel, the top alters get a disproportionately large fraction of all communication. This is seen in the shapes of the three types of signatures (calls, texts, and mixed). Thus individual variation in social signatures does not appear to depend on the channel of communication or the specific alters in the network at a particular point in time, but instead reflects a stable characteristic of how individuals distribute their communication effort across their personal networks.
Why are the social signatures from different channels similar? One could envision an underlying complete egocentric network with tie strengths that measure the closeness of all the relationships an ego maintains with their alters. Within this network there is a distribution of tie strengths where there are few strong and many weak ties [26]. Then observations on one channel of communication would be incomplete samples of the underlying complete network (see, e.g., [16-18] for studies on network-level differences between calls and texts). Individuals differ in how they allocate their communication across their network, some individuals allocating a greater fraction of communication to a smaller number of alters and others allocating communication more evenly across their network. Thus constructing the social signatures based on different channels of communication would still pick up this individual variation, even if the specific alters detected by the different channels of communication vary. There may also be less fundamental reasons: communication habits, memory effects, or similar. However, it has been shown that for calls, alter ranks do correlate with emotional closeness [12], which supports the first explanation. To fully understand which sample of alters is captured by different channels of communication, further research is needed on how people use different channels of communication to maintain their set of relationships to family and friends, and how this communication is related to the underlying tie strength of the relationship [14,15,27].
The finding that individuals have social signatures that are stable over time and persist despite the turnover of individual alters has now been shown in a number of samples from different countries and across different channels of communication, including phone calls [12,13], text messages and combined call and text networks (this study), and email [28]. Given the robustness of this finding, further research is now needed on the causes and consequences of individual variation in social signatures. Whilst everyone is subject to similar fundamental time and cognitive constraints on sociality [26], the way people choose to allocate their communication effort across their networks shows stable individual variation. Some of this individual variation appears to be due to personality characteristics [13], which are also broadly stable over time [29]. Other characteristics that may be associated with individual variation in social signatures are age and gender, which are linked with variation in communication patterns [30] and friendship styles [31]. Further, given the importance of social relationships to health and well-being [1-5], individual variation in social signatures may have consequences for outcomes such as stress and loneliness. Whilst all people distribute their communication very unevenly across their network [12], some people focus an even greater proportion of their communication on a smaller number of alters. Further research could examine how these different patterns of time allocation across the network are linked to well-being, particularly during times of network change which put pressure on the time required to maintain relationships, such as the transition to University [23] or entering into romantic relationships [32]. Further, it would be important to see whether our results can be replicated with other data sets containing calls and texts; as usual for this kind of data, our data cannot be made public because of privacy reasons. To conclude, this study demonstrated using two separate samples that there is individual variation in the way people allocate their time across their social networks, and these social signatures are persistent over time and across different channels of communication.
V. DECLARATIONS
Availability of data and material. The source data used in this study cannot be made publicly available because of privacy restrictions. Ref. [12] has some Supplementary Data in relation to DS2.
Competing interests.The authors declare that they have no competing interests.
Funding.SH and JS acknowledge funding from the Academy of Finland, project n:o 297195.

FIG. 1: Constructing egocentric networks from calls and texts using time-binned weights. A) The timelines corresponding to each of the ego's alters are divided into bins; we use bins that span one hour. Then, the number of bins with at least one communication event is computed. These numbers are used as link weights for egocentric networks (panel B). For the mixed networks, the link weights represent the number of bins where either calls, texts, or both are taking place.

FIG. 2: Social signatures are persistent at the individual level. This holds for both channels (calls and texts) as well as mixed signatures combining both. Panels (a), (c), (e): Dataset 1 (the large dataset); panels (b), (d), (f): Dataset 2 (students). The distributions of distances between social signatures of each ego in two consecutive equal-sized intervals are shown in blue (self-distances). The reference distributions of distances between signatures of different egos are shown in red (reference distances). Comparing the distributions of self-distances with reference distances verifies the persistence of call, text and mixed signatures, as self-distances are on average smaller than reference distances.

FIG. 3: The similarities of social signatures of different types. Panel (a) shows the call, text and mixed signatures of one person in Dataset 1. The three signatures look similar. Panel (b) illustrates the average signatures over the population in Dataset 2. The population-level signatures are also fairly similar. Panels (c) and (d) compare the distance distributions of the call and text signatures of the same egos with the distributions of call and text signatures of different people as a reference. The call and text signatures of each ego are more similar than pairs of signatures of different people.

FIG. 4: Although the shapes of call and text signatures of an ego are relatively similar to each other, the egocentric networks formed through different channels differ in the membership and ranking of alters. The distributions of Jaccard indices between the sets of call and text alters are shown in (a) for DS1 and (b) for DS2. The distribution of correlation coefficients between call ranks and text ranks of those alters who are in both networks is shown in (c) for DS1 and in (d) for DS2. Alters who are only in one of the networks are not considered.

FIG. 5: Top panel: an example mixed signature of one of the egos in DS2. Bottom panel: fractions of calls and texts for the same ego, for each rank. The red and blue areas of the bars represent the fractions of texts and calls, respectively, of the link weight between the ego and the alter. The purple areas represent the fraction of time slots with both calls and texts.

FIG. 6: Top (a): Each dot shows the fraction of text slots in an ego-alter relationship as a function of rank, for the smaller DS2. The plot contains all ego-alter pairs. There appears to be no general pattern, except that the top ranks are mostly occupied by alters who are both called and texted, while in the tails of the signature (ranks >10 or so), there are more alters who are only texted or called. Bottom (b): A heat-map version of the top panel for DS1, with colors indicating the number of ego-alter pairs with a given fraction of texts at each rank in DS1. In this dataset, texts are used much less than in DS2.

TABLE I: The two data sets used in this study. NCPM = number of calls per user per month; NTPM = number of text messages per user per month.