Interactive graph emotion recognition based on multi-modal data enhancement


Abstract

In multi-modal emotion recognition research, the validity of the source data, the interaction between modalities, and the dependence between different tasks are three key elements for task completion. However, few models account for all of these factors within a single framework. To fill this gap, we propose a multimodal data-enhanced Interaction Graph (IGEM). For speech data enhancement, Distil-DCCRN helps the model learn more robust acoustic features. For text data enhancement, unsupervised data is introduced, which increases the diversity of the dataset while keeping the original meaning unchanged. For video data enhancement, a small densely connected network combines the characteristics of images and time series, making the enhancement strategy more complex and diversified. For deep fusion of the different modalities, we introduce a cross-modal data-encoding interaction graph: the data of each modality are regarded as nodes in the graph, connected through the interaction relationships between modalities. Finally, accurate emotion classification is carried out on the deep fused representation produced by this graph. Experiments on the IEMOCAP and MELD benchmark datasets achieved accuracy rates of 72% and 47.5%, respectively, fully demonstrating the superiority of the model.
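The abstract's central idea of treating each modality's features as a node in a fully connected interaction graph can be sketched as a simple message-passing step. The following is an illustrative toy sketch only, not the paper's actual IGEM implementation; the function name, feature dimensions, uniform edge weights, and mean-based update rule are all assumptions.

```python
import numpy as np

def cross_modal_fusion(nodes: dict, steps: int = 2) -> np.ndarray:
    """Toy message passing over modality nodes (illustrative, not IGEM itself).

    Each modality's feature vector is one node; nodes are fully connected
    so information flows between modalities, mimicking the cross-modal
    data-encoding interaction graph described in the abstract.
    """
    names = sorted(nodes)
    h = np.stack([nodes[n] for n in names])          # (num_modalities, dim)
    m = len(names)
    # Fully connected interaction graph without self-loops; uniform
    # edge weights normalized by the number of neighbours (assumption).
    adj = (np.ones((m, m)) - np.eye(m)) / (m - 1)
    for _ in range(steps):
        # Each node blends its own state with the mean of its neighbours.
        h = 0.5 * h + 0.5 * (adj @ h)
    # Deep fused representation: pool over modality nodes; a classifier
    # head would then predict the emotion label from this vector.
    return h.mean(axis=0)

rng = np.random.default_rng(0)
feats = {"audio": rng.normal(size=8),
         "text":  rng.normal(size=8),
         "video": rng.normal(size=8)}
fused = cross_modal_fusion(feats)
print(fused.shape)  # (8,)
```

In a real system the uniform adjacency would be replaced by learned interaction weights, and the mean update by a trainable graph layer; the sketch only shows how modality-level nodes exchange information before classification.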
