A human pose estimation network based on YOLOv8 framework with efficient multi-scale receptive field and expanded feature pyramid network

Abstract

Deep learning-based human pose estimation uses deep neural networks to detect, estimate, and predict human body poses in images or videos. However, traditional multi-person pose estimation methods struggle with partial occlusion and overlap among multiple human bodies and body parts. To address these issues, we propose EE-YOLOv8, a human pose estimation network based on the YOLOv8 framework that integrates an Efficient Multi-scale Receptive Field (EMRF) module and an Expanded Feature Pyramid Network (EFPN). First, the EMRF module enhances the model's feature representation capability. Second, the EFPN optimizes cross-level information exchange and improves multi-scale data integration. Finally, Wise-IoU replaces the traditional Intersection over Union (IoU) to improve detection accuracy through a more precise measurement of the overlap between predicted and ground-truth bounding boxes. We evaluate EE-YOLOv8 on the MS COCO 2017 dataset. Compared to YOLOv8-Pose, EE-YOLOv8 achieves an AP of 89.0% at an IoU threshold of 0.5 (an improvement of 3.3%) and an AP of 65.6% over the IoU range of 0.5-0.95 (an improvement of 5.8%). Among all analyzed algorithms, EE-YOLOv8 achieves the highest accuracy while maintaining the lowest parameter count and computational complexity. These results indicate that EE-YOLOv8 is highly competitive with other mainstream methods.
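For readers unfamiliar with the overlap measure mentioned above, the following is a minimal sketch of the plain IoU between two axis-aligned boxes. It is not the Wise-IoU formulation used in EE-YOLOv8, which the abstract does not specify. The function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration only.

```python
from typing import Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Plain Intersection over Union of two axis-aligned boxes."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
print(box_iou(predicted, ground_truth))
```

With a measure of this kind, a prediction counts as a match at a threshold of 0.5 only when the returned value is at least 0.5.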
