Abstract
Occluded person re-identification (Re-ID) remains challenging, mainly because of interference from occlusion noise and the scarcity of realistic occluded training data. Although data augmentation is a common remedy, existing occlusion augmentation methods suffer from dual inconsistencies. Intra-sample inconsistency arises from misaligned synthetic occluders (an augmentation that simulates real occlusions): randomly pasted occluders ignore spatial priors and style differences, producing unrealistic artifacts that mislead feature learning. Inter-sample inconsistency stems from information loss during random cropping (an augmentation that simulates occlusion-induced information loss): single-scale cropping discards discriminative regions and weakens model robustness. To resolve these dual inconsistencies, this study proposes the unified Multi-Aligned and Multi-Scale Augmentation (MA-MSA) framework, built on the core principle that "synthetic data should resemble real-world data". First, the Frequency-Style-Position Data Augmentation (FSPDA) module is designed: it enforces consistency in three aspects (frequency, style, and position) by constructing an occluder library that follows real-world distributions, aligning styles via adaptive instance normalization, and placing occluders according to hierarchical position rules. Second, the Multi-Scale Crop Data Augmentation (MSCDA) strategy is proposed: it eliminates information loss through multi-scale cropping with non-overlapping ratios and dynamic view fusion. In addition, unlike traditional serial augmentation pipelines, MA-MSA integrates FSPDA and MSCDA in parallel so that the two inconsistencies are resolved jointly. Extensive experiments on Occluded-Duke and Occluded-REID show that MA-MSA achieves state-of-the-art performance: 73.3% Rank-1 (+1.5%) and 62.9% mAP on Occluded-Duke, and 87.3% Rank-1 (+2.0%) and 82.1% mAP on Occluded-REID, demonstrating superior robustness without auxiliary models.
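To make the style-alignment step of FSPDA concrete, the following is a minimal sketch of adaptive instance normalization (AdaIN) applied to an occluder patch before it is pasted onto a pedestrian image. Applying AdaIN directly to RGB statistics (rather than to deep features) and the function name itself are illustrative assumptions, not the authors' implementation.

```python
import torch

def adain(occluder: torch.Tensor, pedestrian: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Re-normalize the occluder's per-channel statistics to match the
    pedestrian image's statistics, reducing the style gap between the
    pasted occluder and its background.

    occluder, pedestrian: float tensors of shape (C, H, W).
    """
    # Per-channel mean and standard deviation over spatial dimensions.
    occ_mean = occluder.mean(dim=(1, 2), keepdim=True)
    occ_std = occluder.std(dim=(1, 2), keepdim=True) + eps
    ped_mean = pedestrian.mean(dim=(1, 2), keepdim=True)
    ped_std = pedestrian.std(dim=(1, 2), keepdim=True) + eps
    # Whiten the occluder, then re-color it with the pedestrian's statistics.
    return (occluder - occ_mean) / occ_std * ped_std + ped_mean
```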
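Similarly, the sketch below illustrates one possible reading of MSCDA's multi-scale cropping with non-overlapping ratios: each view is cropped from a distinct scale band and resized back to the input resolution. The specific ratio bands and the averaging used for "dynamic view fusion" are hypothetical placeholders; the abstract only specifies that the scales do not overlap.

```python
import random
import torch
import torch.nn.functional as F

def multi_scale_crops(img: torch.Tensor,
                      ratio_bands=((0.45, 0.55), (0.65, 0.75), (0.85, 0.95))):
    """Sample one random crop per non-overlapping scale band and resize each
    crop back to the original resolution, yielding multiple views per image.

    img: float tensor of shape (C, H, W).
    """
    _, h, w = img.shape
    views = []
    for lo, hi in ratio_bands:
        r = random.uniform(lo, hi)            # crop-side ratio within this band
        ch, cw = int(h * r), int(w * r)
        top = random.randint(0, h - ch)
        left = random.randint(0, w - cw)
        crop = img[:, top:top + ch, left:left + cw]
        # Resize back to full resolution so the views can be fused or batched.
        views.append(F.interpolate(crop.unsqueeze(0), size=(h, w),
                                   mode='bilinear', align_corners=False).squeeze(0))
    return views

def fuse_views(views):
    # A simple average fusion; the actual fusion rule in MSCDA may differ.
    return torch.stack(views).mean(dim=0)
```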