Table 3. Description of features, grouped by class, indicating whether each is used in the size prediction regression model (Size), the node placement classifier (Node), or both (Both)

From: On the challenges of predicting microscopic dynamics of online conversations

Feature | Task | Description

Global
Current size | Node | Number of nodes and its log
Depth | Both | Conversation depth and its log
Root edges | Size | Percentage of edges connected to the root
Recent edges | Size | Percentage of edges connected to the most recent node (the last one) at each insertion
Parent delay | Size | Mean and median parent delay of nodes (cf. “Structural features” section)

Node
Node index | Node | Time-ordered index of given node
Node depth | Node | Depth of given node
Node parent delay | Node | Parent delay of given node (cf. “Structural features” section)
Subtree size | Node | Size of subtree of given node
#Children | Node | Number of children of given node
#Grandchildren | Node | Number of grandchildren of given node
#Siblings | Node | Number of siblings of given node
#Cousins | Node | Number of cousins of given node
#Uncles | Node | Number of uncles of given node

User (mean and median over conversations initiated by the user)
Max breadth | Node | Maximum breadth
Depth | Node | Conversation depth
1st breadth | Node | Number of comments at the first level (depth = 1)
Lifetime | Node | Conversation lifetime in seconds
Size | Node | Number of nodes in a conversation tree
Root edges | Node | Percentage of edges connecting to the root
Recent edges | Node | Percentage of edges connecting to the most recent node (the last one) at each insertion
#Posts | Node | Total number of posts by user

Content
Root text | Both | Title of the Reddit post or Twitter text embedded in a 25-component vector

Temporal
Response time | Size | Mean and median time between replies
1st response time | Size | Time between root post and first reply
Root time of day | Both | Time of day of root post
Root day of week | Both | Day of week of root post
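
As an illustration of how the structural features in the table could be derived, the following is a minimal sketch, not the authors' implementation. It assumes a conversation tree stored as a time-ordered parent list (node 0 is the root, and every other node's parent has a smaller index). The function names `tree_features` and `root_edges_pct` are hypothetical. The sketch also assumes that a node's subtree size counts the node itself, that "uncles" are the siblings of a node's parent, and that "cousins" are the children of those uncles; the table itself does not spell out these definitions.

```python
from typing import Dict, List, Optional


def tree_features(parents: List[Optional[int]]) -> Dict[str, List[int]]:
    """Per-node structural features for a time-ordered conversation tree.

    parents[i] is the index of node i's parent; parents[0] is None (root).
    Assumption: parents[i] < i for every non-root node.
    """
    n = len(parents)
    depth = [0] * n
    children = [0] * n
    subtree = [1] * n  # assumption: a subtree includes the node itself
    grandchildren = [0] * n

    # Forward pass: parents always precede their children in time order.
    for i in range(1, n):
        p = parents[i]
        depth[i] = depth[p] + 1
        children[p] += 1
        g = parents[p]
        if g is not None:
            grandchildren[g] += 1

    # Backward pass: accumulate subtree sizes from the leaves upwards.
    for i in range(n - 1, 0, -1):
        subtree[parents[i]] += subtree[i]

    siblings = [0] * n
    uncles = [0] * n
    cousins = [0] * n
    for i in range(1, n):
        p = parents[i]
        siblings[i] = children[p] - 1
        g = parents[p]
        if g is not None:
            # Assumed definitions: uncles are the parent's siblings,
            # cousins are the grandparent's grandchildren outside
            # the node's own sibling group.
            uncles[i] = children[g] - 1
            cousins[i] = grandchildren[g] - children[p]

    return {
        "depth": depth,
        "subtree_size": subtree,
        "children": children,
        "grandchildren": grandchildren,
        "siblings": siblings,
        "cousins": cousins,
        "uncles": uncles,
    }


def root_edges_pct(parents: List[Optional[int]]) -> float:
    """Percentage of edges that attach directly to the root (node 0)."""
    n_edges = len(parents) - 1
    if n_edges <= 0:
        return 0.0
    root_edges = sum(1 for p in parents[1:] if p == 0)
    return 100 * root_edges / n_edges


# Example tree: 0 is the root; 1 and 2 reply to 0; 3 and 4 reply to 1; 5 replies to 2.
example = [None, 0, 0, 1, 1, 2]
features = tree_features(example)
```

For the example tree, `features["depth"]` gives each node's depth and `max(features["depth"])` gives the conversation depth listed under the Global class. The temporal and user-level features would additionally need timestamps and author identifiers, which this sketch leaves out.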