Abstract
The automatic interpretation of dance motion sequences from inertial and pressure-based sensors requires models that maintain strict temporal fidelity, preserve orientation invariance, and produce verifiable reasoning suitable for real-time deployment on constrained hardware. Existing approaches frequently lose accuracy under heterogeneous choreography, variable sensor placement, or performer-specific kinematics, and they expose little of the decision-relevant evidence behind their classifications. This study introduces an explainable, sensor-driven dance recognition framework built from four components: an Adaptive Sensor Normalisation module that performs quaternion-based orientation correction with drift-aware Kalman refinement; a Multi-Scale Motion Feature Extractor that applies tempo-conditioned dilation schedules to capture both micro-step transitions and phrase-level rhythmic structure; a Spatio-Temporal Graph Attention Core that combines edge-weighted graph convolutions with dual spatial-temporal attention to quantify sensor saliency and temporal concentration; and an Explainable Decision and Feedback Layer that links prototype-anchored latent representations with gradient-resolved saliency vectors to expose class-specific motion determinants. The system is optimised for edge-class execution through kernel-level compression and causal attention windows over a 512-sample sliding segment. Experiments on three inertial datasets show classification accuracy of up to 94.2 percent, movement-quality estimation accuracy of 92.8 percent, and per-frame latency below 8.5 milliseconds, with stable behaviour under tempo variation, sensor drift, and partial channel loss.
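The abstract does not give the normalisation equations, so the sketch below substitutes a Mahony-style complementary filter for the drift-aware Kalman refinement it describes; quaternions are ordered (w, x, y, z), and every function name is hypothetical.

```python
# Minimal sketch of quaternion orientation correction with gravity-referenced
# drift compensation (a complementary-filter stand-in for the paper's
# Kalman refinement). All names and constants are illustrative.
import numpy as np

def quat_mul(q, r):
    """Hamilton product of two quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def step(q, omega, accel, dt, kp=1.0):
    """One filter step: integrate the gyroscope rate (rad/s) and nudge the
    estimate toward the gravity direction measured by the accelerometer."""
    g_meas = accel / np.linalg.norm(accel)
    w, x, y, z = q
    # gravity direction predicted by the current orientation, in body frame
    g_pred = np.array([2*(x*z - w*y), 2*(y*z + w*x), w*w - x*x - y*y + z*z])
    err = np.cross(g_meas, g_pred)          # body-frame error (Mahony-style)
    omega_c = omega + kp * err              # drift-corrected angular rate
    q = q + 0.5 * quat_mul(q, np.array([0.0, *omega_c])) * dt
    return q / np.linalg.norm(q)            # renormalise to a unit quaternion
```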
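How the tempo-conditioned dilation schedule is parameterised is likewise unspecified; one plausible reading is a bank of parallel dilated temporal convolutions whose dilation rates are rescaled by an estimated tempo factor, as in this hypothetical PyTorch sketch.

```python
# Illustrative multi-scale temporal extractor with a tempo-conditioned
# dilation schedule. Class and argument names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleTemporal(nn.Module):
    def __init__(self, channels, base_dilations=(1, 2, 4, 8)):
        super().__init__()
        self.base_dilations = base_dilations
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in base_dilations
        ])

    def forward(self, x, tempo_scale=1.0):
        # x: (batch, channels, time). tempo_scale > 1 widens receptive fields
        # for slower phrases; < 1 narrows them for faster ones.
        outs = []
        for d, conv in zip(self.base_dilations, self.branches):
            eff = max(1, round(d * tempo_scale))
            outs.append(F.conv1d(x, conv.weight, conv.bias,
                                 padding=eff, dilation=eff))
        return F.relu(torch.stack(outs).sum(dim=0))
```

With kernel size 3 and padding equal to the dilation, each branch preserves the sequence length, so small-dilation branches track micro-step transitions while large-dilation branches cover phrase-level structure.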
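For the Spatio-Temporal Graph Attention Core, the following assumes single-head dot-product attention over sensor nodes with a learnable edge-weight bias standing in for the edge-weighted graph convolution; the class and its shapes are illustrative only.

```python
# Hypothetical edge-weighted attention over sensor nodes at one time step.
import torch
import torch.nn as nn

class SensorGraphAttention(nn.Module):
    def __init__(self, dim, n_sensors):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # learnable edge weights over the sensor graph (body-topology prior)
        self.edge_bias = nn.Parameter(torch.zeros(n_sensors, n_sensors))

    def forward(self, x):
        # x: (batch, n_sensors, dim)
        scores = self.q(x) @ self.k(x).transpose(-1, -2) / x.shape[-1] ** 0.5
        attn = torch.softmax(scores + self.edge_bias, dim=-1)
        # the attention rows double as the per-sensor saliency the
        # abstract attributes to the spatial attention branch
        return attn @ self.v(x), attn
```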
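The Explainable Decision and Feedback Layer could pair prototype distances with input gradients roughly as follows; `encoder` and `prototypes` are placeholders, and defining saliency as the absolute input gradient of the winning prototype score is an assumption, not the paper's stated method.

```python
# Sketch of prototype-anchored classification with gradient-resolved saliency.
import torch

def classify_with_saliency(encoder, prototypes, segment):
    """segment: (1, channels, time) window; prototypes: (classes, dim)."""
    segment = segment.clone().requires_grad_(True)
    z = encoder(segment)                      # (1, dim) latent embedding
    logits = -torch.cdist(z, prototypes)      # nearer prototype -> higher score
    pred = int(logits.argmax(dim=-1))
    logits[0, pred].backward()                # resolve gradients to the input
    saliency = segment.grad.abs().squeeze(0)  # (channels, time) evidence map
    return pred, saliency
```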
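Finally, the 512-sample causal window suggests an inference loop of this shape, in which each prediction sees only past samples; the buffer handling and the `model` callable are illustrative.

```python
# Sketch of the causal sliding-window loop implied by the 512-sample segment.
from collections import deque
import numpy as np

WINDOW = 512                        # samples per causal segment (from the abstract)
buf = deque(maxlen=WINDOW)

def on_sample(frame, model):
    """Append one multi-channel sensor frame; classify once the window is full."""
    buf.append(frame)
    if len(buf) < WINDOW:
        return None                 # warm-up: not enough history yet
    segment = np.stack(buf)         # (512, n_channels), strictly past samples
    return model(segment)           # causal per-frame prediction
```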