Visual intelligence for efficient human action recognition in human-computer interaction applications


Abstract

Human Action Recognition (HAR) is a pivotal area in computer vision, video surveillance, and human-computer interaction (HCI), driven by the need for efficient and accurate models to enhance HCI experiences. Traditional HAR methods often rely on hand-crafted features and shallow learning techniques, which limits their ability to capture complex patterns. In contrast, this study proposes an efficient HAR model that leverages deep neural networks, specifically a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), to support AI-powered action understanding in HCI. The model employs a pre-trained EfficientNetB7 network to extract rich spatial features from video frames, followed by a Long Short-Term Memory (LSTM) network to capture long-range temporal dependencies. This architecture improves recognition accuracy while reducing computational complexity, making it well suited to HCI applications. Experimental results demonstrate the superior performance of the model, which achieves a classification accuracy of 97.8% on the UCF101 dataset and 80.1% on the HMDB51 dataset, outperforming state-of-the-art HAR models. The proposed model also eliminates the need for auxiliary techniques such as data augmentation, highlighting its efficiency and its strong potential for real-world HCI applications that depend on accurate and efficient recognition of human actions.
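The CNN+LSTM pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' exact configuration: the frame count, LSTM width, and input resolution are assumptions, and `weights=None` is used here only to avoid downloading pretrained weights (the paper's model uses a pre-trained backbone, i.e. `weights="imagenet"`).

```python
import tensorflow as tf

# Illustrative hyperparameters (assumptions, not from the paper):
NUM_FRAMES, H, W = 16, 224, 224
NUM_CLASSES = 101  # UCF101 has 101 action classes

# Per-frame spatial feature extractor: EfficientNetB7 without its
# classification head, global-average-pooled to one vector per frame.
backbone = tf.keras.applications.EfficientNetB7(
    include_top=False, weights=None, pooling="avg", input_shape=(H, W, 3)
)
backbone.trainable = False  # frozen feature extractor

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, H, W, 3)),
    # Apply the CNN to every frame: (batch, frames, 2560)
    tf.keras.layers.TimeDistributed(backbone),
    # LSTM aggregates the per-frame features over time.
    tf.keras.layers.LSTM(256),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
```

Freezing the backbone keeps training cost low, consistent with the abstract's emphasis on reduced computational complexity; in practice one might fine-tune the top EfficientNet blocks once the LSTM head has converged.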
