Abstract
Magnetic flux leakage (MFL) inspection of large-bore pipelines faces two core challenges: weak defect signatures and the difficulty of identifying unknown defects. To address them, this study proposes an intelligent detection method that integrates a Zero-Shot Structure-Preserving Diffusion Model (ZSSPDM) with cross-modal attention fusion. First, a Structure-Preserving Diffusion Model (SPDM) explicitly preserves defect edges and geometric structures during denoising through three constraints (gradient consistency loss, morphological similarity loss, and frequency-domain regularization), raising the signal-to-noise ratio (SNR) of the original MFL signals from 12.3 dB to 24.1 dB and providing high-quality feature input. Second, a gated multi-head cross-modal attention network takes MFL signals as queries to dynamically integrate ultrasonic testing (UT) and infrared (IR) features, mitigating inter-modal redundancy and conflicts; it achieves a macro F1-score of 0.93 on known defect classes, outperforming early and late fusion strategies. Third, a zero-shot recognizer based on visual-semantic dual-stream embedding establishes a semantic attribute space (geometry, depth level, causal type, directionality) and uses contrastive learning to transfer knowledge from known to unknown classes. On a test set containing four unseen defect categories, the method achieves a zero-shot learning (ZSL) accuracy of 0.84 and an H-Mean of 0.88, surpassing mainstream models such as GLEE and TransMIL. Cross-material and cross-pipeline tests yield an average ZSL accuracy of 0.81, indicating strong generalization and engineering applicability. This work provides a high-precision, robust solution for intelligent pipeline inspection, with advantages in signal enhancement, multimodal fusion, and zero-shot generalization.
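To make the reported figures easier to interpret, the following minimal sketch shows how the SNR gain and the H-Mean relate to the numbers quoted above. It assumes (this is not stated in the abstract) that the H-Mean is the usual harmonic mean of seen-class and unseen-class accuracy and that the ZSL accuracy of 0.84 plays the role of the unseen-class accuracy. The function names are hypothetical and chosen only for illustration.

```python
def snr_gain_db(snr_before_db, snr_after_db):
    """Return the SNR improvement in dB (difference of two dB values)."""
    return snr_after_db - snr_before_db


def db_to_power_ratio(gain_db):
    """Convert a gain in dB to a linear power ratio: 10^(dB / 10)."""
    return 10.0 ** (gain_db / 10.0)


def harmonic_mean(acc_seen, acc_unseen):
    """H-Mean as commonly defined in generalized zero-shot learning.

    Assumption: H = 2 * s * u / (s + u), where s and u are the
    seen-class and unseen-class accuracies.
    """
    total = acc_seen + acc_unseen
    if total == 0:
        return 0.0
    return 2.0 * acc_seen * acc_unseen / total


def implied_seen_accuracy(h_mean, acc_unseen):
    """Solve H = 2*s*u / (s + u) for s, giving s = H*u / (2*u - H)."""
    denom = 2.0 * acc_unseen - h_mean
    if denom <= 0:
        raise ValueError("no positive seen accuracy yields this H-Mean")
    return h_mean * acc_unseen / denom


if __name__ == "__main__":
    gain = snr_gain_db(12.3, 24.1)
    print(f"SNR gain: {gain:.1f} dB, power ratio: {db_to_power_ratio(gain):.1f}x")
    s = implied_seen_accuracy(0.88, 0.84)
    print(f"Implied seen-class accuracy (under the stated assumption): {s:.3f}")
```

Under these assumptions, the improvement from 12.3 dB to 24.1 dB is a gain of about 11.8 dB, or roughly a 15-fold increase in the signal-to-noise power ratio, and an H-Mean of 0.88 with an unseen-class accuracy of 0.84 would correspond to a seen-class accuracy of about 0.92. The latter value is a back-calculation, not a result reported by the study.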