From: CDLIB: a python library to extract, compare and evaluate communities from complex networks
Name | Description | Formula | Reference |
---|---|---|---|
average_internal_degree | The average internal degree of the community set. | \(\frac {2e_{C}}{n_{C}}\) | |
avg_distance | The average path length across all possible pairs of nodes composing the community set. | ||
avg_embeddedness | The average embeddedness of nodes within the community. | \( \frac {1}{n_{C}} \sum _{i \in C} \frac {k^{int}_{iC}}{\Gamma (i)}\) | |
avg_odf | Average fraction of edges of a node of a community that point outside the community itself. | \(\frac {1}{n_{C}} \sum _{u \in C} \frac {|\{(u,v)\in E: v \not \in C\}|}{\Gamma (u)}\) | |
avg_transitivity | The average clustering coefficient of the community's nodes w.r.t. their connections within the community itself. | ||
conductance | Fraction of total edge volume that points outside the community. | \(\frac {n_{C}}{2 e_{C}+n_{C}}\) | |
cut_ratio | Fraction of existing edges (out of all possible edges) leaving the community. | \(\frac {n_{C}}{b_{C} (n - b_{C})}\) | |
edges_inside | Number of edges internal to the community. | \(e_{C}\) | |
expansion | Number of edges per community node that point outside the cluster. | \(\frac {n_{C}}{b_{C}}\) | |
flake_odf | Fraction of nodes of the clustering that have fewer edges pointing inside than to the outside of their communities. | \(\frac {|\{ u:u \in C,|\{(u,v) \in E: v \in C \}| < \Gamma (u)/2 \}|}{n_{C}} \) | |
fraction_over_median_degree | Fraction of community nodes having internal degree higher than the median degree value. | \(\frac {|\{u: u \in C,|\{(u,v): v \in C\}| > d_{m}\}| }{n_{C}}\) | |
hub_dominance | The ratio of the degree of its most connected node w.r.t. the theoretically maximal degree within the community. | \(\frac {\max _{i \in C}(k^{int}_{iC})}{n_{C}-1}\) | |
internal_edge_density | The internal density of the community set. | \(\frac {e_{C}}{n_{C}(n_{C}-1)/2}\) | |
max_odf | Maximum fraction of edges of a node of a community that point outside the community itself. | \(\max _{u \in C} \frac {|\{(u,v)\in E: v \not \in C\}|}{\Gamma (u)}\) | |
normalized_cut | Normalized variant of the Cut-Ratio. | \(\frac {n_{C}}{2e_{C}+n_{C}} + \frac {n_{C}}{2(|E|-e_{C})+n_{C}}\) | |
scaled_density | The ratio of the community density w.r.t. the complete graph density. | \(\frac {2|E|}{n-1}\) | |
significance | Estimate the likelihood that the identified partition appears in a random graph. | \(\sum _{c}\binom {n_{C}}{2}D(p_{C} || p)\) | |
size | Number of community nodes. | ||
surprise | Statistical approach that assumes that edges emerge randomly according to a hyper-geometric distribution: the higher the surprise, the less likely it is that the clustering results from a random realization. | \( - \log \sum _{i=m_{C}}^{\min (|E|,M_{int})}\binom {|E|}{i} \left \langle q \right \rangle ^{i} (1-\left \langle q \right \rangle)^{|E|-i}\) | |
triangle_participation_ratio | Fraction of community nodes that belong to a triad. | \(\frac { | \{ u: u \in C,\{(v,w):v, w \in C,(u,v) \in E,(u,w) \in E,(v,w) \in E\} \not = \emptyset \} |}{n_{C}}\) | |
erdos_renyi_modularity | Variation of the Newman-Girvan modularity that assumes that nodes in a network are connected randomly with a constant probability p. | \(\frac {1}{|E|}\sum _{C \in \{C_{1},\dots C_{k}\}} (e_{C} - \frac {|E|n_{C}(n_{C} -1)}{n(n-1)})\) | |
link_modularity | Variation of the Girvan-Newman modularity for directed graphs with overlapping communities. | \(\frac {1}{|E|}\sum _{i,j \in C}[A_{ij}\delta (c_{i}, c_{j})-\frac {k_{i}C^{out}k_{j}C^{in}}{m}\delta (c_{i}, c_{j})]\) | |
modularity_density | Variation of the Erdos-Renyi modularity that includes information about community sizes in the expected density coefficient, so as to avoid neglecting small and dense communities. | \(\sum _{C \in \{C_{1},\dots C_{k}\}} \frac {1}{n_{C}} (\sum _{i \in C} k^{int}_{iC} - \sum _{i \in C} k^{out}_{iC}) \) | |
newman_girvan_modularity | Difference between the fraction of intra-community edges of a clustering and the expected number of such edges if they were distributed according to a null model. | \(\frac {1}{|E|}\sum _{C \in \{C_{1},\dots C_{k}\}}(e_{C} - \frac {(2 e_{C} + l_{C})^{2}}{4|E|})\) | |
z_modularity | Variant of the standard modularity proposed to avoid the resolution limit. | \(\frac {\sum _{C \in \{C_{1},\dots C_{k}\}}\frac {m_{C}}{|E|}-\sum _{C \in \{C_{1},\dots C_{k}\}}\left (\frac {D_{C}}{2|E|}\right)^{2}}{\sqrt {\sum _{C \in \{C_{1},\dots C_{k}\}}\left (\frac {D_{C}}{2|E|}\right)^{2} \left (1- \sum _{C \in \{C_{1},\dots C_{k}\}} \left (\frac {D_{C}}{2|E|}\right)^{2}\right)^{2}}}\) |
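
To make a few of the simpler formulas in the table concrete, the following is a minimal pure-Python sketch that evaluates them for one community of an undirected, unweighted graph. It is not CDlib's implementation or API: the function name `community_stats` and the edge-list input are hypothetical choices made for illustration. It assumes that \(e_{C}\) is the number of edges with both endpoints in the community, \(n_{C}\) is the number of community nodes, \(k^{int}_{iC}\) is the number of neighbours of node \(i\) inside the community, and \(\Gamma (u)\) is the total degree of node \(u\); the table itself does not define these symbols.

```python
def community_stats(edges, community):
    """Evaluate a few table formulas for one community.

    Hypothetical helper, not part of CDlib. `edges` is an iterable of
    undirected (u, v) pairs without duplicates or self-loops; `community`
    is an iterable of node ids. Assumes the community has at least 2 nodes.
    """
    members = set(community)
    n_c = len(members)

    degree = {}                                # assumed meaning of Gamma(u)
    internal_degree = {u: 0 for u in members}  # assumed meaning of k^int
    e_c = 0                                    # edges with both ends in C

    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if u in members and v in members:
            e_c += 1
            internal_degree[u] += 1
            internal_degree[v] += 1

    # Out-degree fraction per member; isolated members are skipped here,
    # so they contribute 0 to the average (an assumption of this sketch).
    odf = [
        (degree[u] - internal_degree[u]) / degree[u]
        for u in members
        if degree.get(u, 0) > 0
    ]

    return {
        "edges_inside": e_c,
        "average_internal_degree": 2 * e_c / n_c,
        "internal_edge_density": e_c / (n_c * (n_c - 1) / 2),
        "hub_dominance": max(internal_degree.values()) / (n_c - 1),
        "avg_odf": sum(odf) / n_c,
        "max_odf": max(odf, default=0.0),
    }


# A triangle {0, 1, 2} with a short tail 2-3-4 attached to it.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
stats = community_stats(example_edges, {0, 1, 2})
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this example the community is a complete triangle, so the internal edge density and hub dominance both reach their maximum of 1, while the single edge leaving node 2 is the only contribution to the out-degree fractions.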