Abstract
Surface defects in steel severely impair product reliability and service life. Detecting them well remains difficult for three reasons: practical applications must balance detection accuracy against efficiency, defect sizes vary widely, and production imagery contains complex backgrounds and minute defect targets. In this paper, we propose the Multi-stage Calibration Fusion Network, named RMCF-Net. First, we design an Efficient Feature Extraction Module (E-C2f) that uses heterogeneous processing to enrich features, capturing defect details efficiently without losing critical structural information. Second, we introduce a Lightweight Feature Aggregation Module (LFAM) that pre-filters the multi-level feature maps from the neck network, mitigating feature conflicts and optimising their integration so that less information about minute defects is lost. Third, we introduce the Multi-stage U-shaped Fusion Network (MUF-Net), which re-evaluates cross-layer feature interactions by applying an adaptive feature re-weighting strategy within its U-shaped bidirectional propagation path, enabling deep interaction between adjacent and cross-level features. Finally, we propose QCIoU, an adaptive adjustment function that integrates perimeter and aspect ratio and guides the model to further reduce the spatial deviation between predicted and ground-truth bounding boxes, thereby improving accuracy. In comparative experiments on the NEU-DET, GC10-DET, and Track Components datasets, RMCF-Net achieves mAP@0.5 of 78.9%, 67.1%, and 71.9%, respectively, while maintaining 119 FPS. These results represent state-of-the-art performance and provide reliable technical support for quality control in intelligent manufacturing.
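The abstract does not give the form of QCIoU. As a point of reference only, the sketch below implements the standard CIoU measure, which already combines overlap, centre distance, and aspect ratio. Treating CIoU as the baseline that QCIoU builds on is an assumption drawn from the name, and the perimeter term is deliberately not reproduced here. The function names `ciou` and `ciou_loss` are illustrative and are not taken from the paper.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Standard CIoU between two boxes given as (x1, y1, x2, y2).

    This is the well-known CIoU baseline, NOT the paper's QCIoU:
    the perimeter term mentioned in the abstract is not modelled.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Squared centre distance, normalised by the squared diagonal
    # of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred, target):
    """Regression loss: near 0 for a perfect match, larger for worse boxes."""
    return 1.0 - ciou(pred, target)


if __name__ == "__main__":
    target = (0.0, 0.0, 2.0, 2.0)
    for pred in [(0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0), (4.0, 4.0, 5.0, 5.0)]:
        print(pred, round(ciou_loss(pred, target), 4))
```

Unlike plain IoU, the centre-distance term keeps the loss informative when the boxes do not overlap at all, which matters for the minute defect targets the abstract highlights.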