Network embedding aided vaccine skepticism detection

Applied Network Science

Table 4 Model performance (AUC) for both vax-skeptic and pro-vaxxer content prediction with different feature sets (Text, User history, Network statistics, Raw network)

Text	User history	Network statistics	Raw network	AUC score	AUC gain (%)	t-test t	t-test p
(a) Vax-skeptic content prediction
\(\checkmark\)				0.810	–	19.983	9.2e−22
\(\checkmark\)	\(\checkmark\)			0.840	3.7	9.645	9.2e−12
\(\checkmark\)		\(\checkmark\)		0.832	2.7	16.330	9.1e−19
\(\checkmark\)			\(\checkmark\)	0.886	9.3	–	–
\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	0.887	9.5	−0.825	0.414
(b) Pro-vaxxer content prediction
\(\checkmark\)				0.753	–	14.038	1.2e−16
\(\checkmark\)	\(\checkmark\)			0.786	4.4	6.313	2.1e−07
\(\checkmark\)		\(\checkmark\)		0.761	1.1	10.986	2.3e−13
\(\checkmark\)			\(\checkmark\)	0.812	7.8	–	–
\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	0.816	8.3	−0.618	0.540

The performance gain is reported relative to classification based on text only. Here, raw network denotes Walklets (Perozzi et al. 2016), the best-performing node embedding model (see Fig. 11). The rightmost columns report t-tests against the text + raw network model, based on 20 independent training-testing samples.
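As a rough illustration of how the gain and t columns of Table 4 relate to the AUC scores, the following minimal sketch recomputes the gain for panel (a) and defines a two-sample t statistic. The dictionary keys and function names are hypothetical labels introduced here. The equal-variance (Student) form of the t-test is an assumption, since the table note does not state which variant was used.

```python
from math import sqrt
from statistics import mean, variance

# AUC scores copied from Table 4(a); the keys are illustrative labels only.
AUC_VAX_SKEPTIC = {
    "text": 0.810,
    "text+user_history": 0.840,
    "text+network_statistics": 0.832,
    "text+raw_network": 0.886,
    "all_features": 0.887,
}


def relative_gain(auc, baseline):
    """Percentage improvement of `auc` over the text-only `baseline`."""
    return 100.0 * (auc - baseline) / baseline


def pooled_t_statistic(reference, other):
    """Two-sample t statistic with pooled variance (assumed variant).

    Positive values mean `reference` has the higher mean, matching the
    sign convention that appears to be used in the table, where the
    reference is the text + raw network model.
    """
    n1, n2 = len(reference), len(other)
    pooled_var = (
        (n1 - 1) * variance(reference) + (n2 - 1) * variance(other)
    ) / (n1 + n2 - 2)
    return (mean(reference) - mean(other)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


baseline = AUC_VAX_SKEPTIC["text"]
gains = {
    name: round(relative_gain(auc, baseline), 1)
    for name, auc in AUC_VAX_SKEPTIC.items()
    if name != "text"
}
for name, gain in gains.items():
    print(f"{name}: {gain}%")
```

Because the inputs are the rounded three-decimal scores, a recomputed gain can differ from the published one in the last digit; the published gains were presumably derived from unrounded AUC values. To obtain a t statistic, `pooled_t_statistic` would be called with two lists of per-sample AUC values (20 each in the paper's setup), which are not reproduced in the table.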