Abstract
BACKGROUND/OBJECTIVES: Sleep stage classification is crucial for assessing sleep quality and diagnosing related disorders. Electroencephalography (EEG) is currently recognized as the primary modality for sleep stage classification. High-performance automatic sleep staging methods based on EEG leverage the powerful contextual modeling capability of Transformer encoder architectures. However, the global self-attention mechanism in Transformers incurs substantial computational overhead, significantly hindering the training and inference efficiency of automatic sleep staging algorithms. METHODS: To address these issues, we introduce SleepMFormer, an end-to-end framework for automatic sleep stage classification from single-channel EEG. At the algorithmic level, SleepMFormer adopts a task-driven simplification of the Transformer encoder that improves attention efficiency while preserving sequence modeling capability. At the training level, supervised contrastive learning is incorporated as an auxiliary strategy to enhance representation robustness. From an engineering perspective, these design choices enable efficient training and inference in resource-constrained settings. RESULTS: When integrated with the SleePyCo backbone, the proposed framework achieves competitive performance on three widely used public datasets: Sleep-EDF, PhysioNet, and SHHS. Notably, SleepMFormer reduces training and inference time by up to 33% compared with conventional self-attention-based models. To further validate its generalizability, we conduct additional experiments using DeepSleepNet and TinySleepNet as alternative feature extractors; the results demonstrate that SleepMFormer consistently maintains performance across different model architectures. CONCLUSIONS: Overall, SleepMFormer provides an efficient and practical framework for automatic sleep staging, demonstrating strong potential for related clinical applications.