Reddit comment analysis: sentiment prediction and topic modeling using VADER and BERTopic

DOI:

https://doi.org/10.51359/2965-4661.2024.265074

Keywords:

Sentiment analysis, text mining, exploratory data analysis, Reddit, topic modeling

Abstract

This work explores data analysis techniques applied to the social media platform Reddit, centering on an Exploratory Data Analysis (EDA) that identifies trends and patterns of interaction among users. Sentiment analysis of the comments uses the VADER model ("Valence Aware Dictionary and sEntiment Reasoner"), and topic modeling is performed with BERTopic ("Bidirectional Encoder Representations from Transformers for Topic Modeling"). The goal is to compare the accuracy and effectiveness of the models in classifying the emotions and themes expressed in the comments. The comparison identifies which approach yields the most accurate results in the context of Reddit discussions, providing insight into user behavior and preferences.
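
As an illustration of the sentiment step, the sketch below shows how a VADER-style compound score is commonly turned into a class label. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not report the cutoffs used, so the values of 0.05 and -0.05 are the conventional ones suggested by the VADER authors, and the function names normalize and label_from_compound are hypothetical.

```python
import math

# Conventional cutoffs for the compound score. Assumption: the abstract does
# not state which thresholds the study used.
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def normalize(raw_score: float, alpha: float = 15.0) -> float:
    """Squash an unbounded sum of word valences into the open interval (-1, 1).

    This mirrors the normalization VADER applies to obtain its compound score.
    """
    return raw_score / math.sqrt(raw_score * raw_score + alpha)


def label_from_compound(compound: float) -> str:
    """Map a compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= POSITIVE_CUTOFF:
        return "positive"
    if compound <= NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical raw valence sums, for illustration only (not data from the study).
    for raw in (3.2, 0.0, -1.7):
        compound = normalize(raw)
        print(f"{raw:+.1f} -> {compound:+.3f} -> {label_from_compound(compound)}")
```

In practice the compound score for a comment would come from the vaderSentiment package, for example SentimentIntensityAnalyzer().polarity_scores(text)["compound"], and only the labeling function would be applied on top of it.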

Published

2024-12-20

Section

Research Articles
