Leveraging textual information for social media news categorization and sentiment analysis

利用文本信息进行社交媒体新闻分类和情感分析

阅读:1

Abstract

The rise of social media has changed how people view connections. Machine Learning (ML)-based sentiment analysis and news categorization help understand emotions and access news. However, most studies focus on complex models requiring heavy resources and slowing inference times, making deployment difficult in resource-limited environments. In this paper, we process both structured and unstructured data, determining the polarity of text using the TextBlob scheme to determine the sentiment of news headlines. We propose a Stochastic Gradient Descent (SGD)-based Ridge classifier (RC) for blending SGDR with an advanced string processing technique to effectively classify news articles. Additionally, we explore existing supervised and unsupervised ML algorithms to gauge the effectiveness of our SGDR classifier. The scalability and generalization capability of SGD and L2 regularization techniques in RCs to handle overfitting and balance bias and variance provide the proposed SGDR with better classification capability. Experimental results highlight that our string processing pipeline significantly boosts the performance of all ML models. Notably, our ensemble SGDR classifier surpasses all state-of-the-art ML algorithms, achieving an impressive 98.12% accuracy. McNemar's significance tests reveal that our SGDR classifier achieves a 1% significance level improvement over K-Nearest Neighbor, Decision Tree, and AdaBoost and a 5% significance level improvement over other algorithms. These findings underscore the superior proficiency of linear models in news categorization compared to tree-based and nonlinear counterparts. This study contributes valuable insights into the efficacy of the proposed methodology, elucidating its potential for news categorization and sentiment analysis.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。