Fig. 1 From: Explaining classification performance and bias via network structure and sampling technique

Classification on networks. This example describes the collective classification workflow. (1) A network \(G=(V,E,C)\) is given, defined by a set of nodes V, edges E, and class labels C. Nodes are known to belong to a binary class \(color\in \{white,black\}\), and no additional attributes are given; the goal of collective classification is therefore to infer the correct class label of each node. To achieve this, (2) a set of seed nodes is sampled, together with their class labels, to create a subgraph \(G_\text {seeds}\subset G\). Then, (3) the local model learns the probability of each class label, e.g., \(P(x = black) = 0.5\), while the relational model learns the probability that a neighbor \(v_j\) has a particular class label conditioned on the class label of a node \(v_i\), e.g., \(P(x_{j}=white|x_{i}=black)=0.66\). Finally, in (4) the collective inference process assigns posterior probabilities to each unlabeled node using its 1-hop neighbors. For example, the probability that node A is black (or white) is conditioned on the colors of its neighbors B and D. Notice that the inference is performed collectively, i.e., at the same time for all nodes.
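The four-step workflow in the caption can be sketched in code. This is a minimal, illustrative sketch only: the toy graph, the seed labels, and the homophilic relational table (reusing the caption's example probabilities 0.66/0.34) are assumptions, and the collective inference step uses a simple relaxation-labeling-style update rather than the article's specific method.

```python
# (1) Network G=(V,E,C): nodes, undirected edges, partially known binary labels.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "D"), ("B", "C"), ("C", "D")]
neighbors = {v: set() for v in nodes}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

# (2) Sampled seed nodes with known class labels (subgraph G_seeds).
seeds = {"B": "black", "D": "white"}

# (3) Local model: prior P(x = c) estimated from the seeds.
classes = ["white", "black"]
prior = {c: sum(1 for s in seeds.values() if s == c) / len(seeds) for c in classes}

# (3) Relational model: P(x_j = c_j | x_i = c_i); hand-set homophily table
# reusing the caption's illustrative values.
relational = {
    ("black", "black"): 0.66, ("black", "white"): 0.34,
    ("white", "white"): 0.66, ("white", "black"): 0.34,
}

# (4) Collective inference: initialize unlabeled nodes with the prior, then
# repeatedly update the posteriors of ALL unlabeled nodes at the same time
# from the current beliefs of their 1-hop neighbors.
belief = {
    v: ({c: 1.0 if c == seeds[v] else 0.0 for c in classes} if v in seeds
        else dict(prior))
    for v in nodes
}
for _ in range(10):
    new_belief = {}
    for v in nodes:
        if v in seeds:
            new_belief[v] = belief[v]
            continue
        scores = {}
        for c in classes:
            s = prior[c]
            for u in neighbors[v]:
                # Expected compatibility of label c with neighbor u's belief.
                s *= sum(relational[(c, cu)] * belief[u][cu] for cu in classes)
            scores[c] = s
        z = sum(scores.values())
        new_belief[v] = {c: scores[c] / z for c in classes}
    belief = new_belief

print({v: round(belief[v]["black"], 2) for v in nodes})
# → {'A': 0.5, 'B': 1.0, 'C': 0.5, 'D': 0.0}
```

With one black and one white seed and a symmetric compatibility table, node A's posterior stays at 0.5: exactly the situation in the caption, where A's label is conditioned on the opposing colors of its neighbors B and D.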