Micro-Expression Recognition via LoRA-Enhanced DinoV2 and Interactive Spatio-Temporal Modeling



Abstract

Micro-expression recognition (MER) is difficult because micro-expressions are brief, low in intensity, and exhibit heterogeneous spatial frequency patterns. This study introduces a MER architecture that reduces computational cost by fine-tuning a large feature extraction model (DinoV2) with LoRA, while integrating frequency-domain transformation and graph-based temporal modeling to minimize preprocessing requirements. A Spatial Frequency Adaptive (SFA) module decomposes the input into high- and low-frequency information and weights the two dynamically to enhance sensitivity to subtle facial texture variations. A Dynamic Graph Attention Temporal (DGAT) network models video frames as a graph, combining Graph Attention Networks and an LSTM with frequency-guided attention for temporal feature fusion. Experiments on the SAMM, CASME II, and SMIC datasets show that the approach outperforms existing methods. On the SAMM 5-class setting, it achieves an unweighted F1 score (UF1) of 81.16% and an unweighted average recall (UAR) of 85.37%, outperforming the next best method by 0.96% and 2.27%, respectively.
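The two metrics reported above can be illustrated with a minimal sketch. It assumes the usual definitions: UF1 is the per-class F1 score averaged with equal weight per class, and UAR is the per-class recall averaged the same way, so rare classes count as much as frequent ones. The function name `uf1_uar` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def uf1_uar(y_true, y_pred):
    """Return (UF1, UAR) for two equal-length label sequences.

    Assumed definitions: per-class F1 and per-class recall, each
    averaged over the classes present in y_true with equal weight.
    """
    classes = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fn[t] += 1   # class t was missed
            fp[p] += 1   # class p was predicted wrongly

    f1_scores, recalls = [], []
    for c in classes:
        # Every class in `classes` has at least one true sample,
        # so both denominators below are positive.
        f1_scores.append(2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]))
        recalls.append(tp[c] / (tp[c] + fn[c]))

    n = len(classes)
    return sum(f1_scores) / n, sum(recalls) / n


# Toy example with three imbalanced classes (illustrative labels only).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
uf1, uar = uf1_uar(y_true, y_pred)
print(f"UF1={uf1:.4f}  UAR={uar:.4f}")
```

Because each class contributes equally to the average, a model that ignores a minority expression class is penalized even when overall accuracy stays high, which is presumably why class-balanced metrics are reported for imbalanced datasets such as SAMM.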
