Abstract
Multimodal emotion recognition has attracted growing interest in affective computing, since combining electroencephalogram (EEG) and eye movement (EM) signals enables the capture of complex emotional processes. However, EEG and EM signals exhibit joint distribution differences across days and recording sessions, which degrades recognition performance. Domain adaptation has been developed to address such distribution differences. Unfortunately, existing domain adaptation solutions still yield suboptimal classification results, because ambiguous and non-discriminative decision boundaries are learned during distribution matching. This paper presents Joint Distribution Alignment with Refined and Separable Decision Boundaries (JDA-RSDB), a multimodal domain adaptation method for cross-session emotion recognition from EEG and EM signals. The proposed method assumes that a more discriminative feature representation must be preserved on new sessions during joint distribution matching. To this end, JDA-RSDB produces similar marginal and conditional distributions between domains, first aligning feature statistics at the modality and domain levels, and then encouraging consistent similarity between fused samples from different domains that produce the same class prediction. Simultaneously, this similarity is enhanced by learning a separable feature space on target data, placing decision boundaries in low-density regions. More importantly, decision boundaries are refined by enforcing agreement between target predictions from a principal classifier and those from an auxiliary classifier. Experiments were conducted on three public datasets, SEED-GER, SEED-IV, and SEED-V, in a cross-session setting. The proposed framework achieves average accuracies of 83.33%, 80.89%, and 75.17% across the three available sessions of SEED-GER, SEED-IV, and SEED-V, respectively, outperforming state-of-the-art solutions.
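To make the two target-side objectives named above concrete, the following is a minimal sketch, assuming a PyTorch formulation: entropy minimization on unlabeled target data (placing decision boundaries in low-density regions) and an agreement term between a principal and an auxiliary classifier. The network shapes, feature dimensions, and names (encoder, principal, auxiliary) are illustrative assumptions, not the paper's implementation, and the marginal/conditional alignment losses are omitted.

```python
# Minimal sketch (not the authors' code) of two target-side losses described
# in the abstract, under assumed PyTorch modules and feature dimensions.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(343, 128), nn.ReLU())  # fused EEG+EM features (dim assumed)
principal = nn.Linear(128, 3)   # principal classifier (3 emotion classes assumed)
auxiliary = nn.Linear(128, 3)   # auxiliary classifier used to refine boundaries

def target_losses(x_target):
    """Entropy minimization encourages a separable target feature space with
    boundaries in low-density regions; the agreement term refines boundaries
    by making the two classifiers' target predictions consistent."""
    z = encoder(x_target)
    p = F.softmax(principal(z), dim=1)
    q = F.softmax(auxiliary(z), dim=1)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1).mean()  # separability on target
    agreement = (p - q).abs().sum(dim=1).mean()             # L1 prediction discrepancy
    return entropy, agreement

x_t = torch.randn(32, 343)      # a batch of unlabeled target-session samples
h, a = target_losses(x_t)
loss = h + a                    # in practice, combined with the alignment losses
loss.backward()
```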