Abstract
Industrial anomaly detection algorithms based on convolutional neural networks (CNNs) often struggle to identify small anomalous regions and to remain robust in noisy industrial environments. To address these limitations, this paper proposes the Swin Transformer-Based Hybrid Reconstruction Discriminative Network (SRDAD), which combines the global context modeling of the Swin Transformer with complementary reconstruction and discrimination approaches. The method makes three contributions: (1) a natural anomaly image generation module that produces diverse simulated anomalies resembling real-world defects; (2) a Swin-Unet-based reconstruction subnetwork that uses hierarchical window attention, together with enhanced residual and pooling modules, to reconstruct normal images accurately; and (3) an anomaly contrast discrimination subnetwork, based on a convolutional U-Net, that enables end-to-end detection and localization through contrastive learning. Experimental results on the industrial dataset MVTec AD show that SRDAD achieves competitive performance, with improvements of 0.6% in detection accuracy and 0.7% in localization precision. The results also indicate better detection of small anomalies and greater robustness to noise, highlighting the method's potential for industrial applications.
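The data flow summarized above (simulate an anomaly, reconstruct the normal image, then discriminate between the two) can be illustrated with a minimal, dependency-free sketch. It is not the SRDAD implementation: the helper names `simulate_anomaly`, `anomaly_map`, and `image_score` are hypothetical, the blending rule and the `beta` parameter are assumptions, the reconstruction subnetwork is replaced by an idealized output equal to the normal image, and the learned discriminative subnetwork is replaced by a per-pixel absolute difference.

```python
from typing import List

Image = List[List[float]]  # grayscale image with values in [0, 1]


def simulate_anomaly(normal: Image, texture: Image, mask: Image,
                     beta: float = 0.5) -> Image:
    """Blend `texture` into `normal` where `mask` is 1 (assumed blending rule)."""
    return [
        [(1 - m) * n + m * ((1 - beta) * n + beta * t)
         for n, t, m in zip(n_row, t_row, m_row)]
        for n_row, t_row, m_row in zip(normal, texture, mask)
    ]


def anomaly_map(image: Image, reconstruction: Image) -> Image:
    """Stand-in for the discriminative subnetwork: per-pixel absolute difference."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image, reconstruction)]


def image_score(amap: Image) -> float:
    """Image-level score taken as the maximum pixel score (an assumption)."""
    return max(max(row) for row in amap)


normal = [[0.2, 0.2], [0.2, 0.2]]
texture = [[1.0, 1.0], [1.0, 1.0]]
mask = [[0.0, 1.0], [0.0, 0.0]]  # a single simulated defect pixel

defective = simulate_anomaly(normal, texture, mask, beta=0.5)
# Idealized reconstruction: a perfect subnetwork would recover `normal`.
amap = anomaly_map(defective, normal)
score = image_score(amap)
```

In this toy setting only the masked pixel differs from its reconstruction, so the anomaly map localizes the defect and its maximum serves as the image-level detection score.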