Abstract
Traditional police combat training relies heavily on subjective evaluation by human instructors, which lacks consistency and fails to cover the complex movement patterns encountered in real-world scenarios. This paper presents an enhanced deep spatio-temporal graph convolutional network (ST-GCN) framework designed for automated police combat action recognition and quality assessment. The proposed method incorporates three key components: an adaptive graph topology learning mechanism that dynamically adjusts spatial connectivity based on action-specific joint relationships; a multi-modal fusion strategy that combines skeletal and RGB video data for robust recognition under diverse environmental conditions; and a comprehensive quality assessment algorithm that provides objective evaluation of technique execution. The enhanced ST-GCN architecture features attention-guided feature extraction, curriculum learning-based training, and real-time processing suitable for practical deployment in training facilities. Experimental validation on a comprehensive police combat dataset demonstrates superior performance, with 96.7% recognition accuracy across twelve action categories and real-time processing at 42.8 frames per second. The multi-dimensional evaluation framework assesses action completion, standardization compliance, and movement fluency, providing immediate feedback for skill development. The proposed system offers significant improvements over conventional approaches, enabling standardized evaluation criteria, data-driven curriculum development, and more effective training for law enforcement personnel.