Fig. 4
From: From free text to clusters of content in health records: an unsupervised graph partitioning approach

Summary of the 44 communities found by the MS algorithm in an unsupervised manner, directly from the text of the incident reports, as seen in Fig. 3. To interpret the 44 content communities, we compared them a posteriori to the 15 external, hand-coded categories (indicated by names and colours). This comparison is presented in two equivalent ways: as a Sankey diagram showing the correspondence between categories and communities (left), and as a normalised contingency table based on z-scores (right). Each community has been assigned a content label based on its word cloud, presented in Additional file 1 in the SI.
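
The z-score contingency table mentioned in the caption can be illustrated with a minimal sketch. The caption does not state the exact normalisation, so the code below assumes one common choice, Pearson residuals under independence of categories and communities, which may differ from what was used for the figure. The function names, category names and labels are hypothetical toy values, not data from the article.

```python
from collections import Counter
from math import sqrt


def contingency_table(categories, communities):
    """Count how many records fall in each (category, community) pair."""
    rows = sorted(set(categories))
    cols = sorted(set(communities))
    counts = Counter(zip(categories, communities))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) for every cell.

    ASSUMPTION: 'expected' is the count predicted if categories and
    communities were independent; the figure may use another z-score.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            out_row.append((observed - expected) / sqrt(expected))
        residuals.append(out_row)
    return residuals


# Hypothetical toy labels: one hand-coded category and one community per record.
categories = ["Medication", "Medication", "Falls", "Falls", "Falls", "Medication"]
communities = [0, 0, 1, 1, 0, 0]

rows, cols, table = contingency_table(categories, communities)
z = pearson_residuals(table)

for name, z_row in zip(rows, z):
    print(name, [round(value, 2) for value in z_row])
```

A large positive entry marks a category that is over-represented in a community relative to chance, and a negative entry marks one that is under-represented.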