The risk of node re-identification in labeled social graphs

Applied Network Science

Table 2 Basic statistics of generated ERGM networks, and the population of node pairs

Network	ERGM	d	C	r	κ	|S| (millions)
polblogs	dc	0.02	0.03	0.08	2.52	5.5
	cc	0.02	0.33	-0.02	2.69	13.1
	apl	0.02	0.10	-0.06	2.49	11.5
fb-caltech	dc	0.06	0.08	0.11	2.13	1.2
	cc	0.06	0.42	-0.06	2.73	4.1
	apl	0.06	0.07	0.11	1.97	1.2
fb-dartmouth	dc	0.01	0.17	0.07	2.66	14.5
	cc	0.01	0.24	0.04	2.77	13.2
	apl	0.01	0.20	0.04	2.70	14.2
fb-michigan	dc	0.003	0.02	0.12	3.28	38.4
	cc	0.002	0.20	0.12	3.52	39.9
	apl	0.002	0.20	0.12	3.64	38.2
pokec-1	dc	2.02E-5	0.06	-0.04	5.60	29.5
	cc	2.05E-5	0.07	-0.04	5.84	29.3
	apl	2.04E-5	0.06	-0.04	5.63	27.3
amazon-products	dc	1.82E-5	0.37	-0.06	11.86	43.7
	cc	1.82E-5	0.40	-0.06	13.52	72.5
	apl	1.82E-5	0.39	-0.06	13.47	74.3

Note that dc, cc, and apl denote the parameter sets used to generate ERGM graphs based on assortativity (degree correlation), clustering coefficient, and average path length, respectively. We generated a total of ≈ 500 million identical and non-identical node pairs over the three ERGM graph spaces of the six real social network datasets. S is the population of node pairs generated for a given graph topology.
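As an illustration of how statistics of this kind can be computed, the following minimal sketch evaluates three of them on a toy undirected graph. It assumes that d denotes edge density, C the average local clustering coefficient, and r the degree assortativity coefficient; the excerpt does not define these symbols, so these readings are assumptions. The function names and the toy graph are hypothetical, not taken from the paper. The sketch does not cover κ or the ERGM generation step.

```python
from itertools import combinations


def density(n_nodes, edges):
    """Fraction of possible undirected edges that are present (assumed meaning of d)."""
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


def average_clustering(adj):
    """Mean local clustering coefficient over all nodes (assumed meaning of C)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count edges that exist among the neighbours of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def degree_assortativity(adj, edges):
    """Pearson correlation of the degrees at the two ends of an edge (assumed meaning of r)."""
    xs, ys = [], []
    for u, v in edges:
        du, dv = len(adj[u]), len(adj[v])
        # Each undirected edge is counted in both directions.
        xs += [du, dv]
        ys += [dv, du]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean by symmetry
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
toy_adj = {node: set() for node in range(4)}
for u, v in toy_edges:
    toy_adj[u].add(v)
    toy_adj[v].add(u)

d = density(len(toy_adj), toy_edges)
C = average_clustering(toy_adj)
r = degree_assortativity(toy_adj, toy_edges)
print(f"d={d:.3f}  C={C:.3f}  r={r:.3f}")
```

The pure-Python form is chosen only to keep the sketch dependency-free; for graphs of the size listed in Table 2, a graph library would be the more practical choice.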