Abstract
BACKGROUND: Older adult activity recognition is a critical task in long-term care monitoring; yet, it remains challenging due to postural deformities and health-related variability. These factors can cause different activities to appear visually similar, or the same activity to appear dissimilar, undermining the effectiveness of traditional human activity recognition models developed for the general population.

OBJECTIVE: This study aims to develop an improved older adult activity recognition method that integrates care assessment information with motion data to capture and understand movement variability arising from different health conditions.

METHODS: To achieve our objective, we propose a care-assessment-aware spatiotemporal transformer (CSTT) model that integrates body key points, heatmaps, and care level data for personalized and context-aware activity recognition. The model dynamically adjusts its attention mechanism based on care level context to improve recognition accuracy. CSTT was trained and validated on real-world older adult motion data. A total of 51 older adult participants (30 men and 21 women; age range 64-95 years) were included in the study. Among them, 7 (13.7%) required high care assistance, 26 (51.0%) required medium care assistance, and 18 (35.3%) required low care assistance.

RESULTS: Despite data imbalance and considerable intraclass variation due to differing care needs, the proposed CSTT model achieved an F1-score of 0.96, an accuracy of 0.96, and an area under the curve of 0.98. Analysis revealed that movement patterns differ significantly across care levels and that similar motions occur in distinct activities, highlighting the importance of care-aware modeling.

CONCLUSIONS: Incorporating care level information into activity recognition models significantly enhances performance in older adult care settings.
The proposed CSTT framework demonstrates the value of personalized, context-sensitive approaches for accurate and ethical monitoring in long-term care environments.
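The abstract states that CSTT "dynamically adjusts its attention mechanism based on care level context." One simple way such conditioning could work is to modulate the sharpness of the attention distribution with a per-care-level temperature. The sketch below illustrates that idea only; the function name, the temperature mechanism, and all parameters are assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def care_aware_attention(q, k, v, care_level, temps):
    """Scaled dot-product attention whose sharpness is modulated by a
    per-care-level temperature (hypothetical mechanism, for illustration).

    q, k, v: (T, d) arrays of query, key, and value vectors over T frames.
    care_level: one of the keys of `temps`, e.g. "low", "medium", "high".
    temps: mapping from care level to a temperature; higher temperature
           yields flatter attention, which could suit the noisier, more
           variable motion of participants needing more assistance.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)              # (T, T) similarity scores
    weights = softmax(logits / temps[care_level])
    return weights @ v                          # (T, d) attended values
```

A flatter (higher-temperature) attention spreads weight over more frames, one plausible way a model might compensate for the greater intraclass movement variation observed at higher care levels.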