Virtual reality-induced emotion recognition with deep learning-based multimodal physiological feature fusion


Abstract

OBJECTIVE: Recognizing emotions objectively and accurately remains challenging because of the limited ecological validity, incomplete information, and constrained model performance of conventional approaches. This study addresses these limitations holistically by investigating a novel framework that integrates ecologically valid virtual reality (VR) for emotion elicitation with deep learning-based multimodal physiological signal fusion. METHODS: An immersive VR environment was developed to effectively elicit three target emotional states: positive, neutral, and negative. Synchronized physiological signals, namely electroencephalography (EEG), electrocardiography (ECG), and galvanic skin response (GSR), were recorded from 20 healthy participants alongside subjective self-assessment data. After preprocessing and feature extraction, a nested cross-validation procedure was employed to prevent data leakage: within each of the five folds, feature selection (one-way repeated-measures ANOVA, α = 0.05) was performed solely on the training data. A hybrid network architecture combining principal component analysis (PCA) with long short-term memory (LSTM) was employed for dimensionality reduction and temporal modeling. PCA retained the components explaining 90% cumulative variance, while the LSTM layer contained 96 hidden units, followed by three fully connected layers with integrated dropout regularization. Model performance was evaluated within this cross-validation framework and compared against baseline models including support vector machine (SVM), random forest (RF), k-nearest neighbors (k-NN), and extreme gradient boosting (XGBoost). RESULTS: Subjective evaluation results confirmed the effectiveness of VR-induced emotion elicitation.
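The 90%-cumulative-variance criterion for PCA described in METHODS can be sketched in plain NumPy. This is a minimal illustration of the criterion only, not the study's implementation; the function name, the feature scales, and the synthetic 100 × 8 feature matrix are illustrative assumptions.

```python
import numpy as np

def pca_to_variance(X, threshold=0.90):
    """Project X onto the leading principal components whose cumulative
    explained variance first reaches `threshold` (90% in this study)."""
    Xc = X - X.mean(axis=0)                       # center each feature
    # SVD of the centered data: rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = (S ** 2) / np.sum(S ** 2)         # explained-variance ratios
    # smallest k such that the first k ratios sum to >= threshold
    k = int(np.searchsorted(np.cumsum(var_ratio), threshold)) + 1
    return Xc @ Vt[:k].T, var_ratio[:k]           # scores and kept ratios

# illustrative data: 100 trials x 8 physiological features with
# strongly unequal scales, so a few components dominate the variance
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8)) * np.array([5, 3, 2, 1, 0.5, 0.3, 0.2, 0.1])
scores, ratios = pca_to_variance(X, threshold=0.90)
```

By construction, the kept components jointly explain at least 90% of the variance, so the downstream LSTM receives a lower-dimensional input than the raw feature set.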
At the group level, one-way repeated-measures analysis of variance revealed significant main effects of emotional state (p < 0.05) on multiple physiological features: EEG frontal alpha asymmetry indices (AI_F4/F3, AI_F8/F7), ECG indices (SDNN, RMSSD, LF/HF ratio, sample entropy), and GSR measures (SCL, NS.SCRs). Under the nested five-fold cross-validation framework, the PCA-LSTM model achieved a mean accuracy of 87.18% ± 2.28%, significantly outperforming SVM (75.83% ± 4.25%), RF (78.89% ± 6.85%), k-NN (72.78% ± 5.21%), and XGBoost (81.67% ± 5.83%). CONCLUSION: This study validates that integrating an ecologically valid VR emotion elicitation paradigm with a multimodal PCA-LSTM fusion model effectively enhances the objectivity and accuracy of emotion recognition. The proposed framework provides an effective solution to the bottlenecks of ecological validity and quantification precision in traditional methods, demonstrating preliminary application potential in intelligent human-computer interaction and mental-health monitoring.
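The leakage-prevention step (selecting features inside each training fold only) can also be sketched. Note the hedges: the study used a one-way repeated-measures ANOVA, whereas this sketch computes a plain one-way F statistic per feature as a simpler stand-in, and the data, fold layout, and top-3 cutoff are illustrative assumptions rather than the study's settings.

```python
import numpy as np

def oneway_f(X, y):
    """Per-feature one-way ANOVA F statistic across emotion classes
    (a simplified stand-in for the repeated-measures ANOVA in the paper)."""
    groups = [X[y == c] for c in np.unique(y)]
    grand = X.mean(axis=0)
    k, n = len(groups), len(X)
    ssb = sum(len(g) * (g.mean(axis=0) - grand) ** 2 for g in groups)
    ssw = sum(((g - g.mean(axis=0)) ** 2).sum(axis=0) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))      # F = MSB / MSW

# illustrative data: 60 trials x 10 features, 3 emotion classes;
# feature 0 is made class-dependent so selection should find it
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 10))
y = np.repeat([0, 1, 2], 20)
X[:, 0] += y

folds = np.array_split(rng.permutation(60), 5)
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(60), test_idx)
    # key point: statistics are computed on the training split only,
    # so the held-out fold never influences feature selection
    F = oneway_f(X[train_idx], y[train_idx])
    selected = np.argsort(F)[::-1][:3]            # keep the top-3 features
```

Because the test fold never enters the F-statistic computation, the reported cross-validation accuracy is not inflated by selection bias, which is the rationale for the nested procedure in the abstract.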
