Enhanced graph attention network by integrating Long Short-Term Memory for artificial emotion representation in multi-modality datasets


Abstract

Emotion representation is a critical aspect of artificial intelligence, particularly in human-computer interaction and affective computing. Emotion recognition from multi-modal data remains challenging due to the complex semantic relationships among textual, audio, and visual features. This study proposes a hybrid model that combines an Enhanced Graph Attention Network (E-GAT) with a Bidirectional Long Short-Term Memory network (Bi-LSTM) to address this challenge. First, the E-GAT captures structural dependencies between emotional features by constructing a semantic graph from text embeddings. Second, the Bi-LSTM models the temporal dynamics of sequential data, enabling effective integration of contextual information. We evaluated the model on three benchmark datasets: SemEval-2018 (text-only), RAVDESS (audio-visual), and CMU-MOSEI (multi-modal). Experimental results show that the proposed model achieves state-of-the-art performance: 58.5% accuracy and a 68.7% F1-score on SemEval-2018, outperforming baseline models. On the multi-modal datasets, it achieves 78.9% accuracy (RAVDESS) and 82.3% accuracy (CMU-MOSEI), demonstrating robust cross-modal generalization. This work advances emotion recognition by providing a unified framework for both text-only and multi-modal scenarios, with applications in human-computer interaction and mental health monitoring.
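
To make the described architecture concrete, the following is a minimal PyTorch sketch of how a semantic-graph attention layer could feed a Bi-LSTM, as the abstract outlines. Everything here is an illustrative assumption rather than the authors' reported configuration: the single-head GAT layer, the cosine-similarity threshold used to build the semantic graph, and all dimensions and names (emb_dim, gat_dim, lstm_dim, n_classes, semantic_adjacency) are placeholders.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GATLayer(nn.Module):
        """Single-head graph attention over node (token) embeddings."""
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.W = nn.Linear(in_dim, out_dim, bias=False)   # feature transform
            self.a = nn.Linear(2 * out_dim, 1, bias=False)    # attention scorer

        def forward(self, x, adj):
            # x: (N, in_dim) node features; adj: (N, N) binary adjacency
            h = self.W(x)                                     # (N, out_dim)
            N = h.size(0)
            hi = h.unsqueeze(1).expand(N, N, -1)              # source node copies
            hj = h.unsqueeze(0).expand(N, N, -1)              # target node copies
            e = F.leaky_relu(self.a(torch.cat([hi, hj], dim=-1)).squeeze(-1))
            e = e.masked_fill(adj == 0, float("-inf"))        # attend to neighbours only
            alpha = torch.softmax(e, dim=-1)                  # attention coefficients
            return F.elu(alpha @ h)                           # (N, out_dim)

    class EGATBiLSTM(nn.Module):
        """Hybrid sketch: graph attention for structural dependencies,
        then a Bi-LSTM over the node sequence for temporal context."""
        def __init__(self, emb_dim=300, gat_dim=128, lstm_dim=64, n_classes=7):
            super().__init__()
            self.gat = GATLayer(emb_dim, gat_dim)
            self.bilstm = nn.LSTM(gat_dim, lstm_dim,
                                  batch_first=True, bidirectional=True)
            self.fc = nn.Linear(2 * lstm_dim, n_classes)

        def forward(self, emb, adj):
            h = self.gat(emb, adj)                            # structural features
            out, _ = self.bilstm(h.unsqueeze(0))              # temporal features
            return self.fc(out.mean(dim=1))                   # (1, n_classes) logits

    def semantic_adjacency(emb, threshold=0.5):
        """Hypothetical graph construction: connect embeddings whose
        cosine similarity exceeds a threshold; keep self-loops."""
        sim = F.cosine_similarity(emb.unsqueeze(1), emb.unsqueeze(0), dim=-1)
        adj = (sim > threshold).float()
        adj.fill_diagonal_(1.0)
        return adj

    emb = torch.randn(12, 300)                                # 12 token embeddings
    model = EGATBiLSTM()
    logits = model(emb, semantic_adjacency(emb))
    print(logits.shape)                                       # torch.Size([1, 7])

A full implementation would presumably add multi-head attention, pretrained text embeddings, and modality-specific encoders for the audio and visual streams, but the structural idea is the one the abstract describes: graph attention for semantic dependencies, followed by a Bi-LSTM for temporal context.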
