Abstract
Electroencephalogram (EEG)-based emotion recognition has emerged as a compelling direction in affective computing, driven by its ability to provide objective, neural-level insight into emotional states. However, the high dimensionality and complex spatial and functional structure of EEG data pose substantial challenges for accurate modeling. To address this, we propose Multilayer-GTCN (Multilayer Graph Transformer Convolutional Network), which combines the strengths of Graph Convolutional Networks (GCNs) and Graph Transformer layers to capture both local and global dependencies in EEG signals. The framework employs a dual-graph design over feature nodes: a physical proximity graph instantiated as a complete topology to stabilize information flow, and a functional connectivity graph whose edge weights are correlations derived from inter-feature relationships. Within this representation, GCN layers consolidate stable relational patterns, while transformer-based graph convolutions capture long-range dependencies and transient interactions across the feature space. Fusing the two encoded views yields representations that jointly capture localized structure and global context, providing a robust basis for affective decoding. Extensive experiments on benchmark datasets confirm the effectiveness of our approach, which achieves 98.24 ± 1.74% accuracy on SEED, 95.82 ± 1.89% on SEED-IV, and 93.35 ± 4.08% (valence) / 94.11 ± 2.98% (arousal) on DEAP. These results highlight the robustness and flexibility of Multilayer-GTCN across varied datasets. By merging a physical proximity graph with correlation-based functional connectivity in a multilayer architecture, this study lays a foundation for scalable affective-computing systems and offers a framework to guide future advances in neural signal analysis.