TE-TransReID: Towards Efficient Person Re-Identification via Local Feature Embedding and Lightweight Transformer

TE-TransReID:基于局部特征嵌入和轻量级Transformer的高效行人重识别

阅读:1

Abstract

Person re-identification aims to match images of the same individual across non-overlapping cameras by analyzing personal characteristics. Recently, Transformer-based models have demonstrated excellent capabilities and achieved breakthrough progress in this task. However, their high computational costs and inadequate capacity to capture fine-grained local features impose significant constraints on re-identification performance. To address these challenges, this paper proposes a novel Toward Efficient Transformer-based Person Re-identification (TE-TransReID) framework. Specifically, the proposed framework retains only the former L-th layer layers of a pretrained Vision Transformer (ViT) for global feature extraction while combining local features extracted from a pretrained CNN, thus achieving the trade-off between high accuracy and lightweight networks. Additionally, we propose a dual efficient feature-fusion strategy to integrate global and local features for accurate person re-identification. The Efficient Token-based Feature-Fusion Module (ETFFM) employs the gate-based network to learn fused token-wise features, while the Efficient Patch-based Feature-Fusion Module (EPFFM) utilizes a lightweight Transformer to aggregate patch-level features. Finally, TE-TransReID achieves a rank-1 of 94.8%, 88.3%, and 85.7% on Market1501, DukeMTMC, and MSMT17 with a parameter of 27.5 M, respectively. Compared to existing CNN-Transformer hybrid models, TE-TransReID maintains comparable recognition accuracy while drastically reducing model parameters, establishing an optimal equilibrium between recognition accuracy and computational efficiency.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。