Table 4 Model performance (AUC) for both vax-skeptic and pro-vaxxer content prediction with different feature sets (Text, User history, Network statistics, Raw network)

From: Network embedding aided vaccine skepticism detection

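(a) Vax-skeptic content prediction

| Text | User history | Network statistics | Raw network | AUC score | AUC gain (%) | T-test t | T-test p |
|---|---|---|---|---|---|---|---|
| \(\checkmark\) | | | | 0.810 | | 19.983 | 9.2e−22 |
| \(\checkmark\) | \(\checkmark\) | | | 0.840 | 3.7 | 9.645 | 9.2e−12 |
| \(\checkmark\) | | \(\checkmark\) | | 0.832 | 2.7 | 16.330 | 9.1e−19 |
| \(\checkmark\) | | | \(\checkmark\) | 0.886 | 9.3 | | |
| \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 0.887 | 9.5 | −0.825 | 0.414 |

(b) Pro-vaxxer content prediction

| Text | User history | Network statistics | Raw network | AUC score | AUC gain (%) | T-test t | T-test p |
|---|---|---|---|---|---|---|---|
| \(\checkmark\) | | | | 0.753 | | 14.038 | 1.2e−16 |
| \(\checkmark\) | \(\checkmark\) | | | 0.786 | 4.4 | 6.313 | 2.1e−07 |
| \(\checkmark\) | | \(\checkmark\) | | 0.761 | 1.1 | 10.986 | 2.3e−13 |
| \(\checkmark\) | | | \(\checkmark\) | 0.812 | 7.8 | | |
| \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 0.816 | 8.3 | −0.618 | 0.540 |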

  1. The performance gain is shown over classification based on text only. Here, raw network represents Walklets (Perozzi et al. 2016), the best-performing node embedding model as seen in Fig. 11. The columns on the right report the results of t-tests against text + raw network, using 20 independent training-testing samples.
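
The gain column can be checked against the AUC scores with a short calculation. The sketch below assumes that "gain" means the percentage relative improvement over the text-only score; the source does not spell out the formula, and the helper name `relative_gain` is hypothetical. It uses the rounded panel (a) scores from the table, so a result can differ from the published column in the last digit: text + raw network comes out as 9.4 here against the reported 9.3, presumably because the published gains were computed from unrounded scores.

```python
def relative_gain(score: float, baseline: float) -> float:
    """Percentage improvement of `score` over `baseline` (assumed definition)."""
    return (score / baseline - 1.0) * 100.0


# Rounded AUC scores from panel (a); text only is the baseline.
text_only = 0.810
scores = {
    "text + user history": 0.840,
    "text + network statistics": 0.832,
    "text + raw network": 0.886,
    "all feature sets": 0.887,
}

# Gain of each feature combination over text only, to one decimal place.
gains = {
    name: round(relative_gain(auc, text_only), 1)
    for name, auc in scores.items()
}

for name, gain in gains.items():
    print(f"{name}: {gain}%")
```

The same function applies to panel (b) by swapping in 0.753 as the baseline.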