Abstract
The rise of online physical education in higher education has improved accessibility but presents challenges in recognizing complex movements and delivering individualized feedback. Existing action recognition models are often computationally intensive and struggle to generalize across diverse skeletal patterns. To address this, we propose a lightweight graph convolutional network (GCN) that integrates an improved Ghost module with multi-attention mechanisms, including a global attention mechanism (GAM) and a channel attention mechanism (CAM), to enhance spatial and temporal feature extraction. The model is trained end-to-end on 3D skeleton sequences and optimized for real-time efficiency. Computational cost is evaluated in giga floating-point operations (GFLOPs); the proposed model requires only 6.2 GFLOPs per inference, over 60% less than the baseline ST-GCN. Experimental results on the NTU RGB+D 60 dataset demonstrate that the model achieves 90.8% accuracy in the cross-subject setting and 96.8% in the cross-view setting. These findings highlight the model's effectiveness in balancing accuracy and efficiency, with promising applications in online physical education, rehabilitation monitoring, elderly movement analysis, and VR-based interfaces.
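The ">60% fewer GFLOPs" claim can be sanity-checked with simple arithmetic; a minimal sketch, assuming the commonly reported ~16.3 GFLOPs per inference for the ST-GCN baseline (this baseline figure is an assumption, not stated in the abstract):

```python
# Sanity check of the reported >60% GFLOPs reduction.
# Assumption: the ST-GCN baseline costs ~16.3 GFLOPs per inference
# (a commonly reported figure; not stated in this abstract).
baseline_gflops = 16.3   # assumed ST-GCN cost per inference
proposed_gflops = 6.2    # reported cost of the proposed model

# Relative reduction versus the baseline
reduction = (baseline_gflops - proposed_gflops) / baseline_gflops
print(f"Reduction: {reduction:.1%}")  # → Reduction: 62.0%
```

Under this assumed baseline, the reduction comes out to roughly 62%, consistent with the stated "over 60%" figure.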