Toward Sensor-to-Text Generation: Leveraging LLM-Based Video Annotations for Stroke Therapy Monitoring


Abstract

Stroke-related impairment remains a leading cause of long-term disability, limiting individuals' ability to perform daily activities. While wearable sensors offer scalable monitoring solutions during rehabilitation, they struggle to distinguish functional from non-functional movements, and manual annotation of sensor data is labor-intensive and prone to inconsistency. In this paper, we propose a novel framework that uses large language models (LLMs) to generate activity descriptions from video frames of therapy sessions. These descriptions are aligned with concurrently recorded accelerometer signals to create labeled training data. Through exploratory analysis, we demonstrate that accelerometer signals exhibit distinct temporal and statistical patterns corresponding to specific activities, supporting the feasibility of generating natural language narratives directly from sensor data. Our findings lay the foundation for future development of sensor-to-text models that can enable automated, non-intrusive, and scalable stroke rehabilitation monitoring without the need for manual or video-based annotation.
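The alignment idea in the abstract — pairing windows of accelerometer data with timestamped, LLM-generated activity descriptions — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 50 Hz sampling rate, 2-second windows, feature set, and all function names are assumptions made for the example.

```python
import numpy as np

def window_features(acc, fs=50, win_s=2.0):
    """Split a tri-axial accelerometer stream into fixed windows and
    compute simple statistical features (mean, std, range per axis).
    acc: array of shape (n_samples, 3). Returns (n_windows, 9)."""
    win = int(fs * win_s)
    n_windows = acc.shape[0] // win
    feats = []
    for i in range(n_windows):
        seg = acc[i * win:(i + 1) * win]
        feats.append(np.concatenate([seg.mean(0), seg.std(0),
                                     seg.max(0) - seg.min(0)]))
    return np.array(feats)

def align_labels(descriptions, win_s=2.0):
    """Map timestamped activity descriptions (e.g. produced by an LLM
    from video frames) onto the window grid, so each feature vector
    receives a text label. descriptions: list of (start_s, end_s, text)."""
    labels = {}
    for start, end, text in descriptions:
        for w in range(int(start // win_s), int(np.ceil(end / win_s))):
            labels[w] = text
    return labels

# Toy example: 10 s of synthetic tri-axial signal at 50 Hz.
rng = np.random.default_rng(0)
acc = rng.normal(size=(500, 3))
X = window_features(acc)                      # 5 windows x 9 features
y = align_labels([(0.0, 4.0, "reaching for a cup"),
                  (4.0, 10.0, "resting hand on table")])
print(X.shape, y[0], y[4])
```

The resulting (feature vector, description) pairs are the kind of labeled training data the abstract describes as the basis for a future sensor-to-text model.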
