Abstract
INTRODUCTION: In real-world sports scenarios, Human Action Recognition (HAR) is often hindered by data complexity, limited dynamic adaptability, and fragmented integration of physiological and kinematic information. To address these challenges, this study proposes a multimodal HAR framework for personalized sports health promotion by integrating wearable sensor streams with deep learning architectures.

METHODS: The proposed system employs a robust sensing layer to capture 12-dimensional multimodal data and synchronize physiological indicators with behavioral signals in real time. A novel Transformer-GCN hybrid model was developed to extract complex spatiotemporal dependencies for accurate action recognition and dynamic state analysis. In addition, a reinforcement learning module was incorporated to generate adaptive exercise prescriptions based on user progress. The framework was deployed through a responsive interface for real-time intervention and evaluated in a 12-week randomized controlled trial.

RESULTS: The results demonstrated that the proposed framework achieved effective multimodal fusion and reliable action recognition in sports scenarios. After the 12-week intervention, participants in the intervention group showed a 20.1% increase in cardiorespiratory fitness (VO₂max), a 99.3% improvement in muscular endurance, and a sports injury rate maintained below 15%. These findings indicate that the framework can support accurate motion analysis and safe, personalized intervention.

DISCUSSION: The proposed multimodal fusion architecture effectively bridges the gap between action recognition and personalized sports health intervention. By combining wearable sensing, hybrid deep learning, and reinforcement learning, the framework provides a practical solution for AI-driven motion analysis and adaptive health promotion in land sports scenarios.
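
The synchronization of physiological indicators with behavioral signals mentioned in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe how the streams are aligned. The sketch assumes nearest-timestamp matching between a fast motion stream and a slower heart-rate stream. The function name `align_nearest`, the sampling rates, the 500 ms tolerance, and all sample values are hypothetical.

```python
from bisect import bisect_left


def align_nearest(ref_times_ms, other_times_ms, other_values, max_gap_ms):
    """For each reference timestamp, pick the other-stream sample closest in time.

    Both timestamp lists must be sorted in ascending order (milliseconds).
    Returns None where the nearest sample is farther away than max_gap_ms.
    """
    aligned = []
    for t in ref_times_ms:
        i = bisect_left(other_times_ms, t)
        # Candidates: the sample just before t and the sample at or after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_times_ms)]
        if not candidates:
            aligned.append(None)
            continue
        best = min(candidates, key=lambda j: abs(other_times_ms[j] - t))
        if abs(other_times_ms[best] - t) <= max_gap_ms:
            aligned.append(other_values[best])
        else:
            # Treat the physiological reading as missing rather than stale.
            aligned.append(None)
    return aligned


# Hypothetical data: motion frames (reference stream) and heart rate at 1 Hz.
motion_times_ms = [0, 100, 200, 900, 1000, 1100, 3000]
hr_times_ms = [0, 1000]
hr_values = [72, 75]

aligned_hr = align_nearest(motion_times_ms, hr_times_ms, hr_values, max_gap_ms=500)
print(aligned_hr)  # [72, 72, 72, 75, 75, 75, None]
```

The last motion frame receives `None` because no heart-rate sample lies within the tolerance. Marking such gaps explicitly, instead of carrying the last value forward, is one possible design choice for keeping stale physiological readings out of the fused feature vector.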