Untangling the role of diverse social dimensions in the diffusion of microfinance

Ties between individuals on a social network can represent different dimensions of interaction, and the spreading of information and innovations on these networks could potentially be driven by some dimensions more than by others. In this paper we investigate this issue by studying the diffusion of microfinance within rural Indian villages, accounting for the whole multilayer structure of the underlying social networks. We define a new measure of node centrality, diffusion versatility, and show that it is a better predictor of the microfinance participation rate than previously introduced measures defined on aggregated single-layer social networks. Moreover, we untangle the role played by each social dimension and find that the most prominent role is played by the nodes that are central on layers concerned with trust, shedding new light on the key triggers of the diffusion of microfinance.


Introduction
Understanding the mechanisms driving the diffusion of information, behaviours, and innovations is a question of great interest for the social and economic sciences (Bond et al. 2012; Coleman et al. 1957; Rogers 1962). In his seminal book, Rogers identified four key elements for the diffusion of an innovation: the characteristics of the innovation itself, the communication channels, time, and the social system within which the diffusion occurs (Rogers 1962). The role played by the social structure of the system has since been widely investigated using the mathematical formalism of networks (Valente 1995; Watts 2002), and a fundamental question has been the identification of the most influential individuals therein (Freeman 1979; Kitsak et al. 2010). This is especially important in the context of network interventions, which are concerned with understanding how social networks influence behaviours and their diffusion (Valente 2012). In particular, induction interventions are designed to stimulate peer-to-peer interaction to trigger cascades in information or behavioural diffusion. Studies have shown that their success depends critically on the choice of influencers (Valente and Davis 1999), and also on their position in the network (Aral et al. 2013).
In this paper, by taking advantage of the framework of multilayer networks, we investigate how the choice of opinion leaders can be improved in the context of the diffusion of microfinance in rural villages. Building on the seminal study of Banerjee et al. (2013), we rely on a unique dataset on social network structure and participation in microfinance of 43 villages in Karnataka, a state of southern India (see Endnote 1). Between 2007 and 2011 a microfinance institution, Bharatha Swamukti Samsthe (BSS), entered these villages, which previously had almost no exposure to any microfinance institution or other types of formal credit. Before BSS's entrance in the villages, Banerjee and collaborators administered detailed surveys to households, covering a wide range of interactions, in order to reconstruct the structure of the social network. When entering a village, BSS selected a number of pre-defined individuals that they expected to be well connected within the village (teachers, shopkeepers, leaders of self-help groups, etc.), and held a private meeting with them to introduce the microfinance programme. These individuals, hereafter simply called leaders, then played a fundamental role in spreading the information about microcredit opportunities. Banerjee and collaborators investigated the correlation between the village level of participation in microfinance and the average centrality of its leaders in the social network.
Their goal was to find the centrality measure that best predicts participation, so that in future interventions the most central individuals in the network could be selected as leaders to potentially maximise participation. To the best of our knowledge, no study other than Banerjee et al. (2013) has applied the ideas of network interventions in the context of microfinance, the choice of opinion leaders being left to criteria set by the credit institution, such as those just mentioned.
Banerjee and collaborators define, for each village, a social network of households as an undirected, unweighted network linking two households if any of their members are in at least one of the relations covered by the survey. They then introduce a new measure, called diffusion centrality, to evaluate the importance of households within the network, with the ultimate goal of predicting the rate of village participation in microfinance on the basis of the centrality of the households that were first informed about it. Given the network adjacency matrix $A$, a passing probability $q$ and $T$ iterations, the diffusion centrality of node $i$ is the $i$-th entry of the vector

$$DC(A; q, T) = \left[\sum_{t=1}^{T} (qA)^{t}\right] \cdot \mathbf{1}, \tag{1}$$

where $\mathbf{1}$ is the vector of ones, $q$ is set as the inverse of the first eigenvalue of the adjacency matrix, and $T$ to the number of trimesters during which the village was exposed to BSS (6.6 on average). Essentially, diffusion centrality measures how effective a household would be as an injection point of a new piece of information. By means of multivariate linear regression (including 5 village-level controls, i.e. number of households, self-help group participation rate, savings participation rate, caste composition, and fraction of village households designated as leaders), they show that the average diffusion centrality of the pre-selected leaders outperforms other existing measures of centrality in predicting the eventual rate of participation in microfinance in the village.
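As an illustration of the definition above, the following is a minimal sketch of how diffusion centrality could be computed with NumPy. The function name `diffusion_centrality` and the toy network are our own assumptions, not part of the original study; the sketch also assumes that $T$ is an integer number of periods and that the network is undirected, so that the adjacency matrix is symmetric.

```python
import numpy as np

def diffusion_centrality(A, T, q=None):
    """Return the vector [sum_{t=1..T} (q A)^t] . 1 for an adjacency matrix A.

    Assumptions of this sketch: A is symmetric (undirected network) and
    T is an integer number of iterations.
    """
    A = np.asarray(A, dtype=float)
    if q is None:
        # q is the inverse of the largest eigenvalue of A, as in the text.
        q = 1.0 / np.max(np.linalg.eigvalsh(A))
    v = np.ones(A.shape[0])
    total = np.zeros(A.shape[0])
    for _ in range(T):
        v = q * (A @ v)   # v now holds (qA)^t . 1
        total += v
    return total

# Toy example (hypothetical data): a path of three households 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
dc = diffusion_centrality(A, T=2, q=0.5)
```

In this toy case the middle household obtains the largest value, as one would expect for the best injection point on a path.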
The administered surveys, used to reconstruct the social network, cover 8 different dimensions: those whose homes the respondent visits or from whom the respondent receives visits, kin in the village, nonrelatives with whom the respondent socialises, those from whom the respondent receives medical help, those from whom and to whom the respondent would borrow or lend money, those from whom and to whom the respondent would borrow or lend material goods (such as kerosene or rice), those from or to whom the respondent gets or gives advice, and those with whom the respondent goes to pray (at a temple, church, or mosque). In this paper, we show that taking into account the multilayer structure emerging from the different dimensions covered by the surveys leads to an improved prediction of microfinance participation. Moreover, we investigate the relative role played by the different kinds of tie. These results can be used in future network interventions in the context of microfinance, and beyond, to select opinion leaders as a function of their position in the multilayer network, so as to maximise participation in the programme. The study is motivated by the growing literature on multiplex networks, which shows that taking into account the multilayer structure of social networks (which consist of different kinds of ties, from kinship to friendship and professional relations (Wasserman and Faust 1994)) can shed new light on their topological and dynamical properties (Kivelä et al. 2014). Therefore, in this paper we reconsider the question of how innovations diffuse by asking: do all kinds of tie play the same role, or are some dimensions more influential than others in fostering the adoption of an innovation?

Data
In each village, about half of the households completed surveys in which each member was asked to list the names of people in the village with whom they had a certain relationship. Households were selected through random sampling and stratification by religion and geographic sub-regions. For further information about data collection we refer the reader to the original paper (Banerjee et al. 2013), and the publicly available dataset (2013). Individuals were asked the following questions:
1. Name the 4 non-relatives whom you speak to the most.
2. In your free time, whose house do you visit?
3. Who visits your house in his or her free time?
4. If you needed to borrow kerosene or rice, to whom would you go?
5. Who would come to you if he/she needed to borrow kerosene or rice?
6. If you suddenly needed to borrow Rs. 50 for a day, whom would you ask?
7. Who do you trust enough that if he/she needed to borrow Rs. 50 for a day you would lend it to him/her?
8. Who comes to you for advice?
9. If you had to make a difficult personal decision, whom would you ask for advice?
10. If you had a medical emergency and were alone at home whom would you ask for help in getting to a hospital?
11. Name any close relatives, aside those in this household, who also live in this village.
12. Do you visit temple/mosque/church? Do you go with anyone else? What are the names of these people?
We observe that some pairs of questions are symmetric, for instance "In your free time, whose house do you visit?" and "Who visits your house in his or her free time?". The two questions, jointly considered, allow us to reconstruct a network describing who visits whom within each village. The same holds for questions 4-5, 6-7 and 8-9, which allow us to reconstruct, respectively, the network of potential material good loans, of potential money loans, and of advice relationships. Therefore, from the 12 questions we identify 8 different dimensions: nonrelative socialisation (1), house visits (2-3), material good potential loans (4-5), money potential loans (6-7), advice exchange (8-9), help in a medical emergency (10), kinship (11), and praying company (12).
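The grouping of the 12 questions into 8 dimensions can be sketched in code as follows. This is only an illustrative sketch: the input format (a list of `(household_i, household_j, question)` tuples with zero-based household indices), the names `QUESTION_TO_DIM` and `build_layers`, and the choice to treat every nomination as an undirected tie between households are our assumptions, not a description of the original data files.

```python
import numpy as np

# The 8 dimensions, in the order listed in the text.
DIMENSIONS = [
    "nonrelative socialisation", "house visits", "material good loans",
    "money loans", "advice", "medical help", "kinship", "praying company",
]

# Question number (1-12) -> index of the dimension it contributes to.
QUESTION_TO_DIM = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3,
                   8: 4, 9: 4, 10: 5, 11: 6, 12: 7}

def build_layers(nominations, n_households):
    """Build one symmetric 0/1 adjacency matrix per dimension.

    nominations: iterable of (i, j, question) tuples (hypothetical format),
    meaning that a member of household i named a member of household j
    when answering the given question.
    """
    layers = np.zeros((len(DIMENSIONS), n_households, n_households), dtype=int)
    for i, j, question in nominations:
        if i == j:
            continue  # ignore nominations within the same household
        k = QUESTION_TO_DIM[question]
        layers[k, i, j] = 1
        layers[k, j, i] = 1  # ties are treated as undirected
    return layers

# Toy example: questions 2 and 3 both feed the "house visits" layer.
layers = build_layers([(0, 1, 2), (1, 0, 3), (2, 0, 6)], n_households=3)
```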

Methods
The social network defined by Banerjee and collaborators is the product of an aggregation over different types of social ties, from kinship to medical help. It was recently shown that accounting for the whole multilayer structure of networks that are intrinsically composed of different kinds of relations has important consequences for the definition of the most central nodes, and allows us to identify the most versatile ones (De Domenico et al. 2015). This extended notion of centrality is called versatility. Here, we are interested in understanding whether measuring leaders' versatility in a multilayer network that accounts for all dimensions separately can improve the prediction of microfinance participation. To this end, for each village we build a multilayer network composed of N nodes, corresponding to the households in the village, and L = 8 layers, each encoding one of the dimensions defined above. Moreover, each node on a given layer is connected to its replica on all the other layers. Figure 1 shows the visualisation of the multilayer social network for one of the villages. Following an existing mathematical framework for multilayer networks, we represent each multilayer network by means of a rank-4 adjacency tensor. This was shown to be a natural generalisation of the adjacency matrix, and allows for a simple mathematical definition of multilayer networks, as we will now describe.
Let us first consider a standard network, composed of $N$ nodes and of only one single type of edge. Such a graph can be represented by means of the adjacency matrix

$$W = \sum_{i,j=1}^{N} w_{ij}\, e_i \otimes e_j^{\dagger} = \sum_{i,j=1}^{N} w_{ij}\, E_{ij}, \tag{2}$$

where $w_{ij}$ indicates the intensity of the relationship between node $i$ and node $j$, $e_i$ is the canonical vector in the vector space $\mathbb{R}^N$ (that is, the $i$-th component of $e_i$ is 1, and all of its other components are 0), and $\dagger$ is the transposition operator, which transforms the column vector $e_j$ into a row vector. $E_{ij} = e_i \otimes e_j^{\dagger}$ is the 2nd-order (i.e. rank-2) canonical tensor defined as the tensor product $\otimes$ of the two canonical vectors.
Let us now introduce the language of tensors, which we need in order to generalise the notion of adjacency matrix to the more general notion of adjacency tensor needed to describe multilayer networks. We will use the covariant notation, in which a row vector $v \in \mathbb{R}^N$ is given by a covariant vector $v_\alpha$ (where $\alpha = 1, 2, \ldots, N$), and the corresponding column vector $v^{\dagger}$ is given by the contravariant vector $v^\alpha$. Moreover, we will use Latin letters to denote the $i$-th vector or the $(ij)$-th tensor, and Greek letters to indicate the components of a vector or a tensor. Using this notation, $e_\alpha(i)$ is the $\alpha$-th component of the $i$-th covariant canonical vector $e_i$ in $\mathbb{R}^N$, and $e^\alpha(j)$ is the $\alpha$-th component of the $j$-th contravariant canonical vector in $\mathbb{R}^N$. The adjacency matrix $W$ can now be represented as a rank-2 adjacency tensor $W^\alpha_\beta$ (1-covariant and 1-contravariant), written as a linear combination of tensors in the canonical basis:

$$W^\alpha_\beta = \sum_{i,j=1}^{N} w_{ij}\, e^\alpha(i)\, e_\beta(j) = \sum_{i,j=1}^{N} w_{ij}\, E^\alpha_\beta(ij), \tag{3}$$

where $E^\alpha_\beta(ij) \in \mathbb{R}^{N \times N}$ indicates the tensor in the canonical basis corresponding to the tensor product of the canonical vectors assigned to nodes $i$ and $j$, i.e. it is $E_{ij}$.
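In a multilayer network, each type of relation between nodes is embedded in a different layer $\tilde{k}$ (where $\tilde{k} = 1, 2, \ldots, L$, and we use the tilde symbol to denote indices that correspond to layers). For each of the layers, we construct the intra-layer adjacency tensor $W^\alpha_\beta(\tilde{k})$ encoding information about relations between nodes within the same layer $\tilde{k}$. Moreover, to encode information about connections between nodes in different layers, we construct the inter-layer adjacency tensors $C^\alpha_\beta(\tilde{h}\tilde{k})$. Note that, when $\tilde{h} = \tilde{k}$, we retrieve the intra-layer adjacency tensors, $C^\alpha_\beta(\tilde{k}\tilde{k}) = W^\alpha_\beta(\tilde{k})$. Following the same approach as above, we define the covariant and contravariant vectors $e_{\tilde\delta}(\tilde{k})$ and $e^{\tilde\gamma}(\tilde{h})$ (where $\tilde\delta, \tilde\gamma, \tilde{k}, \tilde{h}$ all range over $1, 2, \ldots, L$) of the canonical basis in the space $\mathbb{R}^L$. From these, we construct the 2nd-order tensors $E^{\tilde\gamma}_{\tilde\delta}(\tilde{h}\tilde{k}) = e^{\tilde\gamma}(\tilde{h})\, e_{\tilde\delta}(\tilde{k})$ that represent the canonical basis of the space $\mathbb{R}^{L \times L}$. Finally, we can now write the multilayer adjacency tensor as the tensor product between the adjacency tensors $C^\alpha_\beta(\tilde{h}\tilde{k})$ and the canonical tensors $E^{\tilde\gamma}_{\tilde\delta}(\tilde{h}\tilde{k})$:

$$A^{\alpha\tilde\gamma}_{\beta\tilde\delta} = \sum_{\tilde{h},\tilde{k}=1}^{L} C^\alpha_\beta(\tilde{h}\tilde{k})\, E^{\tilde\gamma}_{\tilde\delta}(\tilde{h}\tilde{k}) = \sum_{\tilde{h},\tilde{k}=1}^{L} \sum_{i,j=1}^{N} w_{ij}(\tilde{h}\tilde{k})\, E^{\alpha\tilde\gamma}_{\beta\tilde\delta}(ij\tilde{h}\tilde{k}), \tag{4}$$

where $w_{ij}(\tilde{h}\tilde{k})$ are scalars that indicate the existence or not of a relationship between nodes $i$ and $j$, and $E^{\alpha\tilde\gamma}_{\beta\tilde\delta}(ij\tilde{h}\tilde{k}) \equiv e^\alpha(i)\, e_\beta(j)\, e^{\tilde\gamma}(\tilde{h})\, e_{\tilde\delta}(\tilde{k})$ are the 4th-order (i.e. rank-4) tensors of the canonical basis in the space $\mathbb{R}^{N \times N \times L \times L}$.
In our particular case, we define $w_{ij}(\tilde{h}\tilde{k})$ as follows. We set $w_{ij}(\tilde{k}\tilde{k}) = 1$ if there exists at least one member of household $i$ that indicated a relationship of type $\tilde{k}$ with any member of household $j$, or vice versa, where $\tilde{k}$ refers to any of the socio-economic dimensions defined above. Moreover, to take into account the fact that the $L$ replicas of node $i$, one per layer, represent in fact the same household, we set $w_{ii}(\tilde{h}\tilde{k}) = 1$ for all $i = 1, 2, \ldots, N$ and all pairs of layers $(\tilde{h}\tilde{k})$. All other $w_{ij}(\tilde{h}\tilde{k})$ are set equal to 0.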
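For numerical work, a rank-4 multilayer adjacency tensor of this kind is commonly flattened into an $(NL) \times (NL)$ "supra-adjacency" matrix. The sketch below shows one possible way to do so; the function name `supra_adjacency` and the index convention (node $i$ on layer $k$ maps to row $kN + i$) are our assumptions. We also assume that the coupling between replicas of the same household applies only between different layers, so that no self-loops are added within a layer.

```python
import numpy as np

def supra_adjacency(layers):
    """Flatten L intra-layer adjacency matrices into an (N*L) x (N*L) matrix.

    layers: array of shape (L, N, N) with symmetric 0/1 entries.
    Node i on layer k is mapped to index k*N + i (assumed convention).
    """
    layers = np.asarray(layers, dtype=float)
    L, N, _ = layers.shape
    S = np.zeros((N * L, N * L))
    for h in range(L):
        for k in range(L):
            if h == k:
                block = layers[k]   # ties of type k between households
            else:
                block = np.eye(N)   # each household linked to its replica
            S[h * N:(h + 1) * N, k * N:(k + 1) * N] = block
    return S

# Toy example: 2 households, 2 layers, one tie on the first layer only.
toy_layers = np.array([[[0, 1], [1, 0]],
                       [[0, 0], [0, 0]]])
S = supra_adjacency(toy_layers)
```

With this convention, multiplying `S` by a vector of length $NL$ corresponds to contracting the rank-4 tensor with a rank-2 tensor over its node and layer indices.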
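We then generalise the definition of diffusion centrality by considering a diffusion process on the multilayer network, and introduce a new metric that we call diffusion versatility. We define the layer-dependent diffusion versatility of node $\alpha$ in layer $\tilde\gamma$ as the $(\alpha\tilde\gamma)$-th component of the rank-2 tensor

$$DV_{\alpha\tilde\gamma}(A^{\alpha\tilde\gamma}_{\beta\tilde\delta}; q, T) = \sum_{t=1}^{T} q^{t} \left[(A)^{t}\right]^{\beta\tilde\delta}_{\alpha\tilde\gamma}\, u_{\beta\tilde\delta}, \tag{5}$$

where repeated indices are summed over, $(A)^{t}$ is the $t$-th power of the rank-4 tensor, and $u_{\beta\tilde\delta} = \sum_{\tilde{h}=1}^{L} \sum_{i=1}^{N} e_\beta(i)\, e_{\tilde\delta}(\tilde{h})$ is the $N \times L$ rank-2 tensor with all components equal to 1. Since all ties are undirected, the tensor is symmetric under the exchange of $(\alpha\tilde\gamma)$ and $(\beta\tilde\delta)$, so the placement of the free indices does not affect the values. We then obtain the diffusion versatility of node $\alpha$ independently of the layer by contracting the layer index of the tensor with the contravariant vector $u^{\tilde\gamma}$, whose entries are all equal to 1, and normalising by dividing by $L$:

$$DV_{\alpha}(A^{\alpha\tilde\gamma}_{\beta\tilde\delta}; q, T) = \frac{1}{L}\, DV_{\alpha\tilde\gamma}(A^{\alpha\tilde\gamma}_{\beta\tilde\delta}; q, T)\, u^{\tilde\gamma}. \tag{6}$$

Let us note that the layer-dependent diffusion versatility $DV_{\alpha\tilde\gamma}(A^{\alpha\tilde\gamma}_{\beta\tilde\delta}; q, T)$ is not equivalent to computing diffusion centrality on a network composed only of layer $\tilde\gamma$, because here we are taking into account the whole multilayer network in its computation. Therefore, diffusion versatility $DV_{\alpha}(A^{\alpha\tilde\gamma}_{\beta\tilde\delta}; q, T)$ is not equivalent to computing diffusion centrality on the single layers separately and then taking their average for each node.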
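A minimal numerical sketch of diffusion versatility is given below. It flattens the multilayer tensor into a supra-adjacency matrix, iterates the diffusion for $T$ steps, and reshapes the result into an $N \times L$ array. The function name, the flattening convention (node $i$ on layer $k$ at index $kN + i$), the absence of within-layer self-loops, and the default choice of $q$ as the inverse of the largest eigenvalue of the supra-adjacency matrix (by analogy with diffusion centrality) are all assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np

def diffusion_versatility(layers, T, q=None):
    """Return (dv_layer, dv): layer-dependent and overall diffusion versatility.

    layers: array of shape (L, N, N), symmetric 0/1 intra-layer adjacency.
    dv_layer[i, k] is the versatility of node i on layer k;
    dv[i] is its average over the L layers.
    """
    layers = np.asarray(layers, dtype=float)
    L, N, _ = layers.shape
    # Off-diagonal blocks: identity coupling between replicas of a household.
    S = np.kron(np.ones((L, L)) - np.eye(L), np.eye(N))
    # Diagonal blocks: ties within each layer.
    for k in range(L):
        S[k * N:(k + 1) * N, k * N:(k + 1) * N] = layers[k]
    if q is None:
        # Assumed default: inverse of the leading eigenvalue of S.
        q = 1.0 / np.max(np.linalg.eigvalsh(S))
    v = np.ones(N * L)
    total = np.zeros(N * L)
    for _ in range(T):
        v = q * (S @ v)   # one more step of the diffusion
        total += v
    dv_layer = total.reshape(L, N).T   # rows: nodes, columns: layers
    dv = dv_layer.mean(axis=1)         # contraction over layers, divided by L
    return dv_layer, dv

# Toy example: 2 households, 2 layers, one tie on the first layer only.
toy_layers = np.array([[[0, 1], [1, 0]],
                       [[0, 0], [0, 0]]])
dv_layer, dv = diffusion_versatility(toy_layers, T=1, q=0.5)
```

Note that the second layer of the toy example has no ties of its own, yet its layer-dependent values are non-zero: the diffusion reaches it through the replica coupling, which is why the result differs from computing diffusion centrality on each layer in isolation.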
Conceptually, the diffusion versatility of a node measures how far a diffusion process starting on the node can spread on the multilayer network in a given amount of time T (in our case, the number of trimesters during which the village was exposed to the microfinance institution). Accounting for the whole multilayer structure allows us to capture along which kinds of tie the diffusion is more likely to take place, and to assess whether the importance of nodes in the network as seeds of a diffusion process depends more on one dimension than on another. For instance, a household that is very central in the aggregated network because it has several kinship ties with other households in the village might have lower diffusion versatility in the multilayer network than another household that has the same centrality in the aggregated network but whose ties span different dimensions, because a very trusted person lives there to whom people go to ask for advice, money and material goods.

Comparing centrality and versatility node rankings
First, we show that ranking nodes according to their diffusion versatility is significantly different from ranking them according to their diffusion centrality in the aggregated network. Figure 2 shows a density map of the two rankings for the 100 top-ranked nodes in each village (i.e. about half of the nodes, on average). We selected the top 100 because pairs or groups of less central (or versatile) nodes might have the same value of centrality (or versatility), and therefore the same rank, which would bias the comparison between the two rankings. We observe that the two rankings are positively correlated, as expected (using the multilayer network structure should capture some different aspects but not drastically change the whole ranking), but also that the ranking is indeed significantly different for several nodes. More specifically, 96% of the nodes do not occupy the same position in the two rankings, and 28% of them present a rank difference greater than or equal to 10. This result suggests that diffusion versatility provides different information with respect to diffusion centrality, and in the following sections we explore whether this information can lead to a better prediction of microfinance participation and, more importantly, to the detection of which kinds of tie play the most important role.
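The kind of rank comparison described above can be sketched as follows. The function names are hypothetical, and two choices are our own assumptions: ties are broken by node index, and the "top" nodes are selected according to their centrality rank.

```python
import numpy as np

def rank_positions(scores):
    """Rank 1 = highest score; ties are broken by node index (assumption)."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

def rank_shift_summary(centrality, versatility, top=100, threshold=10):
    """Fraction of top-ranked nodes whose rank changes, and changes by >= threshold."""
    rc = rank_positions(centrality)
    rv = rank_positions(versatility)
    keep = rc <= top                      # top nodes by centrality (assumption)
    diff = np.abs(rc - rv)[keep]
    return {
        "moved": float(np.mean(diff > 0)),
        "moved_far": float(np.mean(diff >= threshold)),
    }

# Toy example with hypothetical scores for five households.
summary = rank_shift_summary([5, 4, 3, 2, 1], [5, 3, 4, 2, 1], top=5, threshold=2)
```

In the toy example two of the five households swap positions, so `moved` is 0.4, while no household shifts by two or more positions.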

Improving microfinance participation prediction
We investigate the correlation between the average diffusion versatility of leaders (as defined in Eq. 6) and the rate of microfinance participation in the village, and compare the results with those obtained using diffusion centrality. As shown in Table 1, we find that diffusion versatility is more strongly correlated with the microfinance participation rate than diffusion centrality ($R^2 = 0.470$ for versatility, versus $R^2 = 0.442$ for centrality).
Table 1. Values shown are coefficients from ordinary least-squares regression. Each column represents a different regression. The dependent variable is the microfinance participation rate of nonleader households in a village. The covariates are diffusion centrality (regression 1) and diffusion versatility (regression 2), averaged over the set of leaders, as well as 5 control variables: number of households, self-help group participation rate, savings participation rate, caste composition, and fraction of village households designated as leaders. Standard errors (in parentheses) are robust to heteroskedasticity.
To test the significance of the difference between the two models, we generate 1000 bootstrapped samples of the data, perform the linear regressions on them, and then compare the two resulting distributions of the coefficient of determination using the paired samples t-test. We find that we can accept, with a 99% confidence level, the alternative hypothesis that the average $R^2$ of the model that uses versatility is higher than the average $R^2$ of the model that uses centrality. These results show that accounting for the whole multilayer structure of the different dimensions provides a better framework to identify the pre-defined set of leaders that microfinance agencies should initially inform in order to maximise participation. However, given that the improvement in prediction is significant but relatively small, we are interested in understanding whether some kinds of tie play a more fundamental role than others in the diffusion, in which case leaders should be chosen according to their layer-dependent versatility in some particular layers.
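The bootstrap comparison described above can be sketched as follows. This is a minimal illustration under our own assumptions: villages are resampled with replacement, both models are fitted by ordinary least squares on the same resample, and the paired t statistic is computed directly from the differences in $R^2$. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination of an OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def bootstrap_r2_comparison(X_a, X_b, y, n_boot=1000, seed=0):
    """Return bootstrapped R^2 of models a and b, and the paired t statistic."""
    rng = np.random.default_rng(seed)
    n = len(y)
    r2_a = np.empty(n_boot)
    r2_b = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample villages with replacement
        r2_a[b] = r_squared(X_a[idx], y[idx])
        r2_b[b] = r_squared(X_b[idx], y[idx])
    d = r2_b - r2_a                        # paired differences
    t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(n_boot))
    return r2_a, r2_b, t_stat

# Synthetic example (not the real data): 43 "villages", one informative
# covariate and one uninformative covariate.
rng = np.random.default_rng(1)
x_good = rng.normal(size=43)
x_bad = rng.normal(size=43)
y = 2.0 * x_good + 0.1 * rng.normal(size=43)
r2_a, r2_b, t_stat = bootstrap_r2_comparison(x_bad[:, None], x_good[:, None],
                                             y, n_boot=200)
```

A large positive `t_stat` would support the one-sided alternative that model b has the higher average $R^2$; in an actual analysis the covariate matrices would also contain the control variables.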

Untangling the importance of the different dimensions
We investigate whether the diverse dimensions contribute evenly, or rather play different roles, by considering the layer-dependent components of the diffusion versatility tensor, i.e. $DV_{\alpha\tilde\gamma}(A^{\alpha\tilde\gamma}_{\beta\tilde\delta}; q, T)$. For each dimension, we compute the average leaders' versatility taking into account only the components of the corresponding layer $\tilde\gamma$, thus obtaining 8 different average leaders' versatilities, each corresponding to a given dimension. Let us note that this is not the same as computing diffusion centrality on each layer separately, because in this case each versatility value is computed taking into account the whole multilayer structure. We perform 8 linear regressions, each using as covariate one of the 8 versatility measures (as well as the same control variables as above), and microfinance participation as the dependent variable. The results are reported in Table 2, ordered from the least to the most predictive, as indicated by the $R^2$ values.
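The per-layer regressions described above can be sketched as follows: one regression per layer, each using that layer's average leader versatility together with the controls, with layers then sorted by $R^2$. The function names, the array layout (one row per village, one column per layer) and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination of an OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def layer_r2_ranking(layer_means, controls, y, names):
    """Regress y on each layer's average leader versatility plus controls.

    layer_means: array (n_villages, n_layers), hypothetical layout.
    controls:    array (n_villages, n_controls).
    Returns (name, R^2) pairs sorted from least to most predictive.
    """
    scores = {}
    for k, name in enumerate(names):
        X = np.column_stack([layer_means[:, k], controls])
        scores[name] = r_squared(X, y)
    return sorted(scores.items(), key=lambda item: item[1])

# Synthetic example (not the real data): participation driven by layer "b".
rng = np.random.default_rng(0)
layer_means = rng.normal(size=(43, 2))
controls = rng.normal(size=(43, 1))
y = 3.0 * layer_means[:, 1] + 0.1 * rng.normal(size=43)
ranking = layer_r2_ranking(layer_means, controls, y, names=["a", "b"])
```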
To assess the statistical significance of the difference between each of these models and the model based on diffusion centrality, we use the paired samples t-test on 1000 bootstrapped samples of the data, as described in the previous section. We find that we can accept, with a 99% confidence level, the alternative hypothesis that the average $R^2$ of the models that use layer-dependent versatility based on the layers material good, kinship, praying company, advice, money and medical help is higher than the average $R^2$ of the model that uses centrality. For the model based on nonrelative socialisation the confidence level is 90%. In contrast, the average $R^2$ of the model based on the visits layer is smaller than the average $R^2$ of the model that uses centrality (99% confidence level). Moreover, we find that we can accept, with a 99% confidence level, the alternative hypothesis that the average $R^2$ of the models that use layer-dependent versatility based on the layers money and medical help is also higher than the average $R^2$ of the model that uses overall versatility. The same holds for the advice layer, but with a confidence level of 90%. These results indicate that the most predictive dimensions are all related to trust: asking for help in a medical emergency, asking for money if in need, and asking for advice. In other words, the versatility of leaders in these layers is what best correlates with the final rate of participation in microfinance in the village. This could guide microfinance institutions in leader selection, which could be done on the basis of diffusion versatility, with a particular focus on individuals belonging to households that are particularly versatile on these specific layers.

Conclusions
In this paper we have shown that taking into account the multilayer structure of the social networks of rural Indian villages allows for a better identification of the individuals who are more likely to help the spreading of microfinance in the community. Firstly, we have introduced a new measure, diffusion versatility, as an extension of diffusion centrality to multilayer networks. We have shown that the diffusion versatility of leaders is a better predictor of the microfinance participation rate in the village than diffusion centrality. Secondly, we have used the layer-dependent components of diffusion versatility to untangle the role played by each dimension in the diffusion of microfinance. We have found that the most predictive dimensions are related to trust: asking for help in a medical emergency or for a money loan if in need. These results show that diffusion versatility could be used by microfinance institutions to identify opinion leaders so as to maximise participation, focusing in particular on those with high versatility in specific layers. Further field research could validate these results, for instance by means of randomised field experiments. Leaders in one set of villages could be chosen according to their layer-dependent diffusion versatility ranking relative to a given dimension, and in another set of villages according to a different dimension, and participation could then be compared. Moreover, future work should involve sociologists and anthropologists in order to combine methods of multilayer network analysis with detailed investigations of the sociological meaning of the different dimensions in the context of rural India, to gain a deeper understanding of these social systems and of how innovations diffuse therein.

Endnote
1 http://web.stanford.edu/~jacksonm/Data.html