Transformer and Pre-Transformer Model-Based Sentiment Prediction with Various Embeddings: A Case Study on Amazon Reviews

基于Transformer和Pre-Transformer模型的多种词嵌入的情感预测:以亚马逊评论为例

阅读:1

Abstract

Sentiment analysis is essential for understanding consumer opinions, yet selecting the optimal models and embedding methods remains challenging, especially when handling ambiguous expressions, slang, or mismatched sentiment-rating pairs. This study provides a comprehensive comparative evaluation of sentiment classification models across three paradigms: traditional machine learning, pre-transformer deep learning, and transformer-based models. Using the Amazon Magazine Subscriptions 2023 dataset, we evaluate a range of embedding techniques, including static embeddings (GloVe, FastText) and contextual transformer embeddings (BERT, DistilBERT, etc.). To capture predictive confidence and model uncertainty, we include categorical cross-entropy as a key evaluation metric alongside accuracy, precision, recall, and F1-score. In addition to detailed quantitative comparisons, we conduct a systematic qualitative analysis of misclassified samples to reveal model-specific patterns of uncertainty. Our findings show that FastText consistently outperforms GloVe in both traditional and LSTM-based models, particularly in recall, due to its subword-level semantic richness. Transformer-based models demonstrate superior contextual understanding and achieve the highest accuracy (92%) and lowest cross-entropy loss (0.25) with DistilBERT, indicating well-calibrated predictions. To validate the generalisability of our results, we replicated our experiments on the Amazon Gift Card Reviews dataset, where similar trends were observed. We also adopt a resource-aware approach by reducing the dataset size from 25 K to 20 K to reflect real-world hardware constraints. This study contributes to both sentiment analysis and sustainable AI by offering a scalable, entropy-aware evaluation framework that supports informed, context-sensitive model selection for practical applications.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。