Table 4 Clustering comparison functions implemented in CDLIB

From: CDLIB: a python library to extract, compare and evaluate communities from complex networks

Name | Description | Formula | Reference
ami | Adjusted Mutual Information is an adjustment of the Mutual Information (MI) score to account for chance. | \(\frac{MI(X, Y) - E(MI(X, Y))}{\max(H(X), H(Y)) - E(MI(X, Y))}\) | (Vinh et al. 2010)
ari | The Adjusted Rand Index corrects the Rand Index (RI) for chance. The Rand Index measures the similarity between two clusterings by considering all pairs of samples and counting the pairs assigned to the same or to different clusters in the predicted and true clusterings. | \(\frac{RI - Expected_{RI}}{\max(RI) - Expected_{RI}}\) | (Hubert and Arabie 1985)
closeness | Closeness of community size distributions. | \(\frac{1}{2} \sum_{i=1}^{r} \sum_{j=1}^{s} \min \left\{ \frac{x_{a}\left(n^{i}_{a}\right)}{N_{a}}, \frac{x_{b}\left(n^{j}_{b}\right)}{N_{b}} \right\} \delta_{1}\left(n^{i}_{a}, n^{j}_{b}\right)\) | (Dao et al. 2018)
f1 | Average F1 score (harmonic mean of precision and recall) of the optimal matches among the input partitions. | \(2 \times \frac{precision \times recall}{precision + recall}\) | (Rossetti et al. 2016)
nf1 | Normalized version of F1 that corrects the resemblance score by taking into account the degree of node overlap and the clustering coverage. | \(\frac{F1 \times Coverage}{Redundancy}\) | (Rossetti 2017), (Rossetti et al. 2016)
nmi | Normalized Mutual Information (NMI) is a normalization of the Mutual Information (MI) score that scales the result between 0 (no mutual information) and 1 (perfect correlation). | \(\frac{H(X) + H(Y) - H(X, Y)}{(H(X) + H(Y))/2}\) | (Lancichinetti et al. 2009)
onmi-LFK | Original extension of the Normalized Mutual Information (NMI) score to cope with overlapping partitions. | \(1 - \frac{1}{2}\left(\frac{H(X \mid Y)}{H(X)} + \frac{H(Y \mid X)}{H(Y)}\right)\) | (Lancichinetti et al. 2009)
onmi-MGH | Extension of the Normalized Mutual Information (NMI) score to cope with overlapping partitions, based on max normalization. | \(\frac{I(X:Y)}{\max(H(X), H(Y))}\) | (McDaid et al. 2011)
omega | Resemblance index defined for overlapping, complete-coverage clusterings. | \(\frac{Obs(s1, s2) - Exp(s1, s2)}{1 - Exp(s1, s2)}\) | (Murray et al. 2012)
vi | Variation of Information between two node partitions. | \(H(X) + H(Y) - 2MI(X, Y)\) | (Meilă 2007)
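
The nmi and vi formulas in the table can be checked by hand on small partitions. The following is a minimal sketch in plain Python, not CDLIB's implementation: the function names are hypothetical, each partition is assumed to be non-overlapping and given as a flat list with one community label per node, and entropies use the natural logarithm.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(X) of a label assignment, in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def joint_entropy(x, y):
    """Joint entropy H(X, Y) of two assignments over the same nodes."""
    return entropy(list(zip(x, y)))


def mutual_information(x, y):
    """MI(X, Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - joint_entropy(x, y)


def nmi(x, y):
    """NMI with arithmetic-mean normalization, as in the nmi row."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both partitions put every node in one community (assumed convention).
        return 1.0
    return mutual_information(x, y) / ((hx + hy) / 2)


def variation_of_information(x, y):
    """VI = H(X) + H(Y) - 2 MI(X, Y), as in the vi row."""
    return entropy(x) + entropy(y) - 2 * mutual_information(x, y)


# Same grouping of four nodes under different label names.
same_a = [0, 0, 1, 1]
same_b = [1, 1, 0, 0]
# A grouping that is statistically independent of same_a.
crossed = [0, 1, 0, 1]

print(nmi(same_a, same_b), variation_of_information(same_a, same_b))
print(nmi(same_a, crossed), variation_of_information(same_a, crossed))
```

Two partitions that differ only in label names give an NMI of 1 and a VI of 0, while the independent pair gives an NMI of (approximately) 0 and a VI equal to H(X) + H(Y). This illustrates that NMI is a similarity score and VI is a distance.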