
Table 2 Benchmarking of Markov Stability clusters versus LDA topics at different levels of resolution

From: From free text to clusters of content in health records: an unsupervised graph partitioning approach

 

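| No. of topics/clusters | Similarity to hand-coded categories (NMI): LDA | Similarity to hand-coded categories (NMI): MS | Topic Coherence (\(\widehat{PMI}\)): LDA | Topic Coherence (\(\widehat{PMI}\)): MS |
|---|---|---|---|---|
| 3 | 0.311 | 0.267 | 2.991 | 3.033 |
| 7 | 0.409 | 0.393 | 3.218 | 3.303 |
| 12 | 0.361 | 0.398 | 3.270 | 3.517 |
| 17 | 0.390 | 0.401 | 3.419 | 3.457 |
| 44 | 0.395 | 0.388 | 3.549 | 3.716 |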

  1. Scores for similarity to hand-coded categories (NMI) and topic coherence (\(\widehat {PMI}\)) for the five MS resolutions highlighted in the main text and their corresponding LDA models. Boldface identifies the best computational result.
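
The NMI scores compare an automatic partition (LDA topics or MS clusters) with the hand-coded categories. As a rough illustration of how such a score can be computed, the sketch below implements normalised mutual information in plain Python. It assumes normalisation by the geometric mean of the two entropies; the table does not state which normalisation was used, so this choice is an assumption. The function names (`entropy`, `mutual_information`, `nmi`) are hypothetical and not taken from the article.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI with geometric-mean normalisation (an assumed convention)."""
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same items")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        # A single-group labeling carries no information; treat two
        # single-group labelings as identical and anything else as unrelated.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(a, b) / sqrt(h_a * h_b)
```

Under this definition, two partitions that agree up to a relabelling of the groups score 1, and two statistically independent partitions score 0, so a higher NMI in the table means closer agreement with the hand-coded categories.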