Table 3 Fitness functions implemented in CDLIB

From: CDLIB: a python library to extract, compare and evaluate communities from complex networks

| Name | Description | Formula | Reference |
| --- | --- | --- | --- |
| average_internal_degree | The average internal degree of the community set. | \(\frac{2e_{C}}{n_{C}}\) | (Radicchi et al. 2004) |
| avg_distance | The average path length across all possible pairs of nodes composing the community set. | | (Orman et al. 2012) |
| avg_embeddedness | The average embeddedness of nodes within the community. | \(\frac{1}{n_{C}} \sum_{i \in C} \frac{k^{int}_{iC}}{\Gamma(i)}\) | (Orman et al. 2012) |
| avg_odf | Average fraction of edges of a node of a community that point outside the community itself. | \(\frac{1}{n_{C}} \sum_{u \in C} \frac{\lvert \{(u,v) \in E : v \notin C\} \rvert}{\Gamma(u)}\) | (Flake et al. 2000) |
| avg_transitivity | The average clustering coefficient of the community's nodes w.r.t. their connections within the community itself. | | (Orman et al. 2012) |
| conductance | Fraction of total edge volume that points outside the community. | \(\frac{n_{C}}{2e_{C}+n_{C}}\) | (Shi and Malik 2000) |
| cut_ratio | Fraction of existing edges (out of all possible edges) leaving the community. | \(\frac{n_{C}}{b_{C}(n - b_{C})}\) | (Fortunato 2010) |
| edges_inside | Number of edges internal to the community. | \(e_{C}\) | (Radicchi et al. 2004) |
| expansion | Number of edges per community node that point outside the cluster. | \(\frac{n_{C}}{b_{C}}\) | (Radicchi et al. 2004) |
| flake_odf | Fraction of nodes of the clustering that have fewer edges pointing inside than to the outside of their communities. | \(\frac{\lvert \{u : u \in C, \lvert \{(u,v) \in E : v \in C\} \rvert < \Gamma(u)/2\} \rvert}{n_{C}}\) | (Flake et al. 2000) |
| fraction_over_median_degree | Fraction of community nodes having internal degree higher than the median degree value. | \(\frac{\lvert \{u : u \in C, \lvert \{(u,v) : v \in C\} \rvert > d_{m}\} \rvert}{n_{C}}\) | (Yang and Leskovec 2015) |
| hub_dominance | The ratio of the degree of the community's most connected node w.r.t. the theoretically maximal degree within the community. | \(\frac{\max_{i \in C}(k^{int}_{iC})}{n_{C}-1}\) | (Orman et al. 2012) |
| internal_edge_density | The internal density of the community set. | \(\frac{e_{C}}{n_{C}(n_{C}-1)/2}\) | (Radicchi et al. 2004) |
| max_odf | Maximum fraction of edges of a node of a community that point outside the community itself. | \(\max_{u \in C} \frac{\lvert \{(u,v) \in E : v \notin C\} \rvert}{\Gamma(u)}\) | (Flake et al. 2000) |
| normalized_cut | Normalized variant of the Cut-Ratio. | \(\frac{n_{C}}{2e_{C}+n_{C}} + \frac{n_{C}}{2(\lvert E \rvert - e_{C})+n_{C}}\) | (Shi and Malik 2000) |
| scaled_density | The ratio of the community density w.r.t. the complete graph density. | \(\frac{2\lvert E \rvert}{n-1}\) | (Orman et al. 2012) |
| significance | Estimates the likelihood that the identified partition appears in a random graph. | \(\sum_{C} \binom{n_{C}}{2} D(p_{C} \parallel p)\) | (Traag et al. 2015) |
| size | Number of community nodes. | | |
| surprise | Statistical approach that assumes that edges emerge randomly according to a hyper-geometric distribution: the higher the surprise, the less likely the clustering resulted from a random realization. | \(-\log \sum_{i=m_{C}}^{\min(\lvert E \rvert, M_{int})} \binom{\lvert E \rvert}{i} \langle q \rangle^{i} (1 - \langle q \rangle)^{\lvert E \rvert - i}\) | (Traag et al. 2015) |
| triangle_participation_ratio | Fraction of community nodes that belong to a triad. | \(\frac{\lvert \{u : u \in C, \{(v,w) : v, w \in C, (u,v) \in E, (u,w) \in E, (v,w) \in E\} \neq \emptyset\} \rvert}{n_{C}}\) | (Yang and Leskovec 2015) |
| erdos_renyi_modularity | Variation of the Newman-Girvan modularity that assumes that nodes in a network are connected randomly with a constant probability p. | \(\frac{1}{\lvert E \rvert} \sum_{C \in \{C_{1},\dots C_{k}\}} \left(e_{C} - \frac{\lvert E \rvert n_{C}(n_{C}-1)}{n(n-1)}\right)\) | (Erdős and Rényi 1959) |
| link_modularity | Variation of the Girvan-Newman modularity for directed graphs with overlapping communities. | \(\frac{1}{\lvert E \rvert} \sum_{i,j \in C} \left[A_{ij}\delta(c_{i}, c_{j}) - \frac{k_{i}C^{out}k_{j}C^{in}}{m}\delta(c_{i}, c_{j})\right]\) | (Nicosia et al. 2009) |
| modularity_density | Variation of the Erdős-Rényi modularity that includes information about community sizes in the expected density coefficient, so as to avoid neglecting small and dense communities. | \(\sum_{C \in \{C_{1},\dots C_{k}\}} \frac{1}{n_{C}} \left(\sum_{i \in C} k^{int}_{iC} - \sum_{i \in C} k^{out}_{iC}\right)\) | (Li et al. 2016) |
| newman_girvan_modularity | Difference between the fraction of intra-community edges of a clustering and the expected number of such edges if distributed according to a null model. | \(\frac{1}{\lvert E \rvert} \sum_{C \in \{C_{1},\dots C_{k}\}} \left(e_{C} - \frac{(2e_{C} + l_{C})^{2}}{4\lvert E \rvert}\right)\) | (Newman and Girvan 2004) |
| z_modularity | Variant of the standard modularity proposed to avoid the resolution limit. | \(\frac{\sum_{C \in \{C_{1},\dots C_{k}\}} \frac{m_{C}}{\lvert E \rvert} - \sum_{C \in \{C_{1},\dots C_{k}\}} \left(\frac{D_{C}}{2\lvert E \rvert}\right)^{2}}{\sqrt{\sum_{C \in \{C_{1},\dots C_{k}\}} \left(\frac{D_{C}}{2\lvert E \rvert}\right)^{2} \left(1 - \sum_{C \in \{C_{1},\dots C_{k}\}} \left(\frac{D_{C}}{2\lvert E \rvert}\right)^{2}\right)^{2}}}\) | (Miyauchi and Kawase 2016) |
  1. The upper part of the table (average_internal_degree through triangle_participation_ratio) groups the community-wise fitness scores, for which the library can compute the overall distribution as well as standard statistical indexes. The lower part (erdos_renyi_modularity through z_modularity) groups the implemented modularity-based quality scores.
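
As a rough illustration of how a few of the formulas in the table translate into code, the following pure-Python sketch computes edges_inside, average_internal_degree and internal_edge_density per community, summarizes them with basic statistics, and evaluates the Erdős-Rényi modularity of a partition. It is not CDLIB's implementation. The function names mirror the table, but the signatures (an undirected edge list plus node sets), the `summarize` helper, the use of the population standard deviation, and the toy graph are all assumptions made for this example.

```python
from statistics import mean, pstdev


def edges_inside(edges, community):
    """e_C: number of edges with both endpoints inside the community."""
    members = set(community)
    return sum(1 for u, v in edges if u in members and v in members)


def average_internal_degree(edges, community):
    """2 * e_C / n_C, following the table's formula."""
    return 2 * edges_inside(edges, community) / len(community)


def internal_edge_density(edges, community):
    """e_C / (n_C * (n_C - 1) / 2).

    Returning 0.0 for communities with fewer than two nodes is an
    assumption of this sketch; the table does not cover that case.
    """
    n_c = len(community)
    if n_c < 2:
        return 0.0
    return edges_inside(edges, community) / (n_c * (n_c - 1) / 2)


def summarize(score, edges, communities):
    """Hypothetical helper: distribution statistics of a community-wise score."""
    values = [score(edges, c) for c in communities]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation (assumption)
    }


def erdos_renyi_modularity(edges, communities):
    """(1/|E|) * sum_C (e_C - |E| * n_C * (n_C - 1) / (n * (n - 1)))."""
    # n is inferred from the edge list, so isolated nodes are ignored here.
    nodes = {node for edge in edges for node in edge}
    n, m = len(nodes), len(edges)
    total = 0.0
    for community in communities:
        n_c = len(community)
        expected = m * n_c * (n_c - 1) / (n * (n - 1))
        total += edges_inside(edges, community) - expected
    return total / m


# Toy undirected graph: a triangle {0, 1, 2} bridged to a triangle {3, 4, 5}
# that has one pendant node 6 attached.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5), (5, 6)]
communities = [{0, 1, 2}, {3, 4, 5, 6}]

print(summarize(average_internal_degree, edges, communities))
print(summarize(internal_edge_density, edges, communities))
print(erdos_renyi_modularity(edges, communities))
```

On this toy graph both communities have the same average internal degree, while their internal edge densities differ. That is the kind of contrast the per-community distribution is meant to expose, whereas the modularity score returns a single value for the whole partition.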