Adaptive diffusion models for overcoming data scarcity in long-distance face recognition

Abstract

Long-distance Face Recognition (FR) is challenging because of image degradation and limited training data, particularly in surveillance and security applications where faces are captured at substantial distances with reduced resolution and quality. This work introduces Face-Aware Diffusion (FADiff), a novel Adaptive Diffusion Model (ADM) designed to overcome data scarcity and improve FR performance in long-distance scenarios. The proposed model integrates three core components: a Face Condition Embedding Module (FCEM), based on an ArcFace-trained ResNet101 with an MLP-Mixer, for identity-preserving conditioning; a Face-Aware Initial Estimator (FAIE), using a modified SwinIR with hierarchical attention, for structural initialization; and an ADM with Feature-wise Linear Modulation (FiLM) for high-fidelity, identity-consistent facial reconstruction. FADiff addresses the key challenge of preserving face detection performance while enhancing image quality through a multi-stage training scheme that enables stable convergence and outperforms end-to-end alternatives. Comprehensive evaluation on the WIDER-FACE dataset shows substantial improvements over state-of-the-art methods: on the challenging Hard subset, FADiff achieves 27.84 dB PSNR, 0.821 SSIM, 0.743 ArcFace similarity, and 0.612 detection AP@0.5, improvements of 6.3%, 6.9%, 7.1%, and 11.9%, respectively, over the best baseline method, DiffBIR. Statistical significance testing across 1000 test images confirms highly significant improvements (p < 0.001) with large effect sizes, while ablation studies validate the necessity of each model component. The model scales well across multiple resolutions, achieving higher performance in extreme 4× upscaling scenarios (32 × 32 → 128 × 128) with a PSNR of 25.71 dB, compared to 23.84 dB for DiffBIR. Computational efficiency analysis reports practical training requirements (24.7 h, 16.3 GB peak memory) and practical inference performance (189 ms at 128 × 128 resolution), making FADiff suitable for real-world deployment in surveillance and security applications where both quality and computational constraints are critical.
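The abstract names Feature-wise Linear Modulation (FiLM) as the conditioning mechanism but gives no implementation details. The following is a minimal sketch of the generic FiLM operation only, in which each feature channel is scaled and shifted by values predicted from a conditioning vector. It is not FADiff's actual layer: the helper names (`linear`, `film`), the toy dimensions, and the choice of a single affine map to produce the scale and shift are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[float]]  # one flattened list of spatial values per channel


def linear(weights: List[Vector], bias: Vector, x: Vector) -> Vector:
    """Affine map y = W x + b, written out with plain lists."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def film(features: FeatureMap, gamma: Vector, beta: Vector) -> FeatureMap:
    """Generic FiLM: out[c] = gamma[c] * features[c] + beta[c] for each channel c."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need exactly one gamma and one beta per channel")
    return [[g * v + b for v in channel] for channel, g, b in zip(features, gamma, beta)]


# Hypothetical toy setup: a 3-dimensional identity embedding conditions
# a feature map with 2 channels and 2 spatial positions per channel.
identity_embedding = [1.0, 0.0, -1.0]

# Assumed (illustrative) projection weights that turn the embedding into
# a per-channel scale (gamma) and a per-channel shift (beta).
w_gamma = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]
b_gamma = [1.0, 2.0]
w_beta = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
b_beta = [0.0, 0.0]

gamma = linear(w_gamma, b_gamma, identity_embedding)
beta = linear(w_beta, b_beta, identity_embedding)

features = [[1.0, 2.0], [3.0, 4.0]]
modulated = film(features, gamma, beta)
print(modulated)
```

In this sketch the conditioning vector never mixes with the features directly; it only determines a scale and shift per channel, which is what makes FiLM a lightweight way to inject identity information into an intermediate feature map.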
