Table 1 Features extracted from Pushshift API together with processed features

From: Dank or not? Analyzing and predicting the popularity of memes on Reddit

| # | Feature | Type | Description |
|---|---------|------|-------------|
| 1 | created_utc | UTC timestamp | Time of post submission |
| 2 | ups | Integer | Number of upvotes received |
| 3 | is_nsfw | Boolean | Indicates whether the post is suitable only for viewers aged 18+ |
| 4 | subreddit | String | Subreddit of the submission |
| 5 | subscribers | Integer | Number of subscribers to the subreddit |
| 6 | thumbnail.height | Floating point value | Height of the thumbnail |
| 7 | thumbnail.thumbnail | String | Thumbnail media |
| 8 | thumbnail.width | Floating point value | Width of the thumbnail |
| 9 | title | String | Title of the submission |
| 10 | media | String | Link to associated meme media |
| 11 | ups_normed | Floating point value | ups normalized by subscribers |
| 12 | dankornot | Integer | Label derived from ups_normed for binary classification |
| 13 | processed_words | List of strings | Filtered and stemmed words from title and image |
| 14 | word_count | Integer | Number of words in title and image |
| 15 | TextLength | Integer | Number of characters in title |
| 16 | Sentiment | Floating point value | Text valence score |
| 17 | avg_H | Floating point value | Average HSV hue of meme |
| 18 | avg_S | Floating point value | Average HSV saturation of meme |
| 19 | avg_V | Floating point value | Average HSV value (brightness) of meme |
| 20 | 30 colors | Floating point value | Normalized pixel count of each color in image |
| 21 | VGG_features | List of strings | VGG-16's first three guesses about image content |
| 22 | VGG_probs | List of floating point values | Probabilities of VGG-16's first three guesses |
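
The derived features 11, 12, and 17 to 19 in the table can be illustrated with a minimal sketch. This is not the authors' code: the function names, the plain division used for ups_normed, the `threshold` parameter for dankornot, and the arithmetic mean over per-pixel HSV values are all assumptions made for illustration, since the table does not state the exact formulas or cutoff.

```python
import colorsys


def ups_normed(ups, subscribers):
    """Upvotes divided by subscriber count (assumed form of the normalization)."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return ups / subscribers


def dank_label(score, threshold):
    """Binary label: 1 if the normalized score reaches the threshold, else 0.

    The threshold is a hypothetical parameter; the table does not give a value.
    """
    return 1 if score >= threshold else 0


def average_hsv(pixels):
    """Mean hue, saturation, and value over (r, g, b) pixels in 0..255.

    Each returned component lies in [0, 1]. Hue is averaged arithmetically,
    which ignores that hue is circular; this is a simplifying assumption.
    """
    total_h = total_s = total_v = 0.0
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_h += h
        total_s += s
        total_v += v
        count += 1
    if count == 0:
        raise ValueError("pixels must not be empty")
    return total_h / count, total_s / count, total_v / count
```

For a real image, the pixel list could come from any image library that yields RGB tuples; the sketch deliberately stays within the standard library.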