Table 3 Description of features, grouped by class, indicating whether each was used in the size prediction regression model, the node placement classifier, or both

From: On the challenges of predicting microscopic dynamics of online conversations

| Class | Feature | Task | Description |
| --- | --- | --- | --- |
| Global | Current size | Node | Number of nodes and its log |
| Global | Depth | Both | Conversation depth and its log |
| Global | Root edges | Size | Percentage of edges connected to the root |
| Global | Recent edges | Size | Percentage of edges connected to the most recent node (the last one) at each insertion |
| Global | Parent delay | Size | Mean and median parent delay of nodes (cf. “Structural features” section) |
| Node | Node index | Node | Time-ordered index of given node |
| Node | Node depth | Node | Depth of given node |
| Node | Node parent delay | Node | Parent delay of given node (cf. “Structural features” section) |
| Node | Subtree size | Node | Size of subtree of given node |
| Node | #Children | Node | Number of children of given node |
| Node | #Grandchildren | Node | Number of grandchildren of given node |
| Node | #Siblings | Node | Number of siblings of given node |
| Node | #Cousins | Node | Number of cousins of given node |
| Node | #Uncles | Node | Number of uncles of given node |
| User | Max breadth | Node | Maximum breadth |
| User | Depth | Node | Conversation depth |
| User | 1st breadth | Node | Number of comments at the first level (depth = 1) |
| User | Lifetime | Node | Conversation lifetime in seconds |
| User | Size | Node | Number of nodes in a conversation tree |
| User | Root edges | Node | Percentage of edges connecting to the root |
| User | Recent edges | Node | Percentage of edges connecting to the most recent node (the last one) at each insertion |
| User | #Posts | Node | Total number of posts by user |
| Content | Root text | Both | Title of the Reddit post or text of the tweet, embedded in a 25-component vector |
| Temporal | Response time | Size | Mean and median time between replies |
| Temporal | 1st response time | Size | Time between root post and first reply |
| Temporal | Root time of day | Both | Time of day of root post |
| Temporal | Root day of week | Both | Day of week of root post |

User features are the mean and median over conversations initiated by the user.
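
To make the structural features in the table concrete, the following Python sketch computes several of them for a conversation tree stored as a time-ordered parent list. It is an illustration only, not the authors' implementation: the function names (`global_features`, `node_features`), the parent-list representation, and the exact readings of "recent edges" (a reply attached to the node inserted immediately before it), "cousins" (children of the parent's siblings), and "uncles" (siblings of the parent) are assumptions made here. Parent delay is omitted because its definition lives in the "Structural features" section, which is not reproduced in this table.

```python
from collections import Counter, defaultdict


def _build(parents):
    """Return (children, depth) for a tree given as a time-ordered parent list.

    Assumption: parents[0] is None (the root post) and parents[i] < i,
    so every reply arrives after the node it replies to.
    """
    children = defaultdict(list)
    depth = [0] * len(parents)
    for i in range(1, len(parents)):
        children[parents[i]].append(i)
        depth[i] = depth[parents[i]] + 1
    return children, depth


def global_features(parents):
    """Conversation-level features (hypothetical helper)."""
    n = len(parents)
    children, depth = _build(parents)
    edges = n - 1
    # Assumed reading of "recent edges": node i replies to node i - 1.
    recent = sum(1 for i in range(1, n) if parents[i] == i - 1)
    return {
        "size": n,
        "depth": max(depth),
        "first_breadth": len(children[0]),
        "max_breadth": max(Counter(depth).values()),
        "root_edges_pct": 100 * len(children[0]) / edges if edges else 0.0,
        "recent_edges_pct": 100 * recent / edges if edges else 0.0,
    }


def node_features(parents, i):
    """Node-level features of node i (hypothetical helper)."""
    children, depth = _build(parents)
    # Subtree size counts the node itself plus all of its descendants;
    # walking from the newest node back lets each child update its parent.
    subtree = [1] * len(parents)
    for j in range(len(parents) - 1, 0, -1):
        subtree[parents[j]] += subtree[j]
    p = parents[i]
    gp = parents[p] if p is not None else None
    return {
        "index": i,
        "depth": depth[i],
        "subtree_size": subtree[i],
        "children": len(children[i]),
        "grandchildren": sum(len(children[c]) for c in children[i]),
        "siblings": len(children[p]) - 1 if p is not None else 0,
        # Assumed: uncles are the parent's siblings, cousins their children.
        "uncles": len(children[gp]) - 1 if gp is not None else 0,
        "cousins": sum(len(children[u]) for u in children[gp] if u != p)
        if gp is not None else 0,
    }


if __name__ == "__main__":
    # Toy conversation: 1 and 2 reply to the root, 3 and 5 reply to 1, 4 replies to 3.
    toy = [None, 0, 0, 1, 3, 1]
    print(global_features(toy))
    print(node_features(toy, 3))
```

The parent-list form mirrors the "time-ordered index" in the table, so the same structure can be truncated to its first k entries to obtain the features of the partially grown conversation at insertion step k.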