Pose2Met: a unified spatiotemporal framework for 3D human pose estimation and energy expenditure estimation


Abstract

PURPOSE: This study addresses key challenges in 3D human pose estimation (HPE) and energy expenditure estimation (EEE), focusing on handling complex activities, improving generalization, and enhancing both tasks jointly within a unified framework.

METHODS: We propose Pose2Met, a unified end-to-end framework that jointly addresses 3D HPE and EEE. At its core is STAPFormer, a Transformer model with a SpatioTemporal Aggregated Pose (STAP) representation for efficient and accurate motion modeling. Building on this representation, Pose2Met introduces a unified pose-metabolism learning strategy that jointly optimizes pose dynamics and metabolic patterns within a single learning paradigm. This enables the model to predict both 3D pose and energy expenditure directly from 2D pose inputs, achieving accuracy comparable to the traditional 2D-to-3D-to-expenditure pipeline while significantly improving computational efficiency and robustness in practical applications.

RESULTS: Experiments show that STAPFormer achieves an MPJPE of 38.2 mm on Human3.6M, outperforming MixSTE and STCFormer. For EEE on Vid2Burn-ADL, it achieves a 22.1 kcal MAE with pose-based input, comparable to video-based methods. Under the unified learning framework, 2D-pose-based EEE further approaches the accuracy of 3D-pose-based prediction, demonstrating enhanced robustness and generalization.

CONCLUSION: These results highlight the importance of high-quality motion representations for both HPE and EEE. Pose2Met shows strong potential for intelligent fitness and healthcare applications and offers a promising direction for bridging the gap between pose and expenditure estimation.
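The unified pose-metabolism design described above can be pictured as a single temporal encoder over 2D pose sequences with two output heads, one regressing per-frame 3D poses and one regressing a sequence-level energy estimate, trained under a joint objective. The sketch below is a minimal illustration of that idea only; all module names, layer sizes, the mean-pooling choice, and the loss weight `alpha` are assumptions for exposition and do not reproduce the actual STAPFormer architecture or STAP representation.

```python
import torch
import torch.nn as nn

class JointPoseEnergyModel(nn.Module):
    """Hypothetical sketch of a unified 2D-pose -> (3D pose, energy) model.

    Illustrates the abstract's single-paradigm idea: one encoder, two
    task heads. Not the paper's actual architecture.
    """

    def __init__(self, num_joints=17, d_model=256, num_layers=4, num_heads=8):
        super().__init__()
        # Embed each frame's 2D joint coordinates (num_joints * 2) into a token.
        self.embed = nn.Linear(num_joints * 2, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True
        )
        # Temporal Transformer over the frame sequence.
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Head 1: per-frame 3D pose regression.
        self.pose_head = nn.Linear(d_model, num_joints * 3)
        # Head 2: sequence-level energy expenditure (e.g. kcal), pooled over time.
        self.energy_head = nn.Linear(d_model, 1)

    def forward(self, pose2d):
        # pose2d: (batch, frames, num_joints * 2)
        tokens = self.embed(pose2d)
        features = self.encoder(tokens)
        pose3d = self.pose_head(features)                # (batch, frames, num_joints * 3)
        energy = self.energy_head(features.mean(dim=1))  # (batch, 1)
        return pose3d, energy


def joint_loss(pose3d_pred, pose3d_gt, energy_pred, energy_gt, alpha=1.0):
    # Joint objective: weighted sum of a pose term and an energy term,
    # mirroring the "single learning paradigm" described in the abstract.
    pose_loss = nn.functional.mse_loss(pose3d_pred, pose3d_gt)
    energy_loss = nn.functional.l1_loss(energy_pred, energy_gt)
    return pose_loss + alpha * energy_loss
```

Under this kind of joint objective, the energy head shares the encoder's motion features with the pose head, which is one plausible reading of why 2D-pose-based EEE can approach 3D-pose-based accuracy without an explicit 2D-to-3D-to-expenditure pipeline.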
