Abstract
Domain generalization (DG) aims to develop models that perform robustly on unseen target domains, a critical yet challenging objective for real-world fault diagnosis. The challenge is further complicated in compound fault diagnosis, where the rigidity of hard labels and the simplicity of label smoothing under-represent inter-class relations and compositional structure, degrading cross-domain robustness. Existing DG methods can alleviate these issues, but they typically rely on multi-source domain data; in industrial practice, constraints on equipment operating conditions and data acquisition costs mean that only one or two independently distributed source datasets are usually available. In this work, an adaptive label refinement network (ALRN) is designed for learning with imperfect labels under such source-scarce conditions. Compared with hard labels and label smoothing, ALRN learns richer, more robust soft labels that encode the semantic similarities between fault classes. The model first trains a convolutional neural network (CNN) to obtain initial class probabilities, then iteratively refines the training labels by computing a weighted average of the predictions within each class, using the sample-wise cross-entropy loss as an adaptive weighting factor. Furthermore, a label refinement stability coefficient, based on the max-min ratio of the Kullback-Leibler (KL) divergence across classes, is proposed to evaluate label quality and determine when to terminate the refinement iterations. With only one or two source domains for training, ALRN achieves accuracy gains exceeding 22% under unseen operating conditions compared with a conventional CNN baseline. These results confirm that the proposed label refinement algorithm effectively enhances cross-domain diagnostic performance, offering a novel and practical solution for learning with imperfect supervision in cross-domain compound fault diagnosis.
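The two algorithmic ingredients named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the abstract does not specify the exact weighting function, so the choice of `exp(-loss)` (lower cross-entropy loss, higher weight), and the function names `refine_labels` and `stability_coefficient`, are assumptions for illustration only.

```python
import numpy as np

EPS = 1e-12


def refine_labels(probs, labels, n_classes):
    """One refinement pass: per-class weighted average of CNN predictions.

    probs  : (N, C) softmax outputs of the trained CNN.
    labels : (N, C) current soft labels.
    The exp(-loss) weighting is an assumption; the abstract only states
    that the sample-wise cross-entropy loss acts as an adaptive weight.
    """
    # Sample-wise cross-entropy between current soft labels and predictions.
    losses = -np.sum(labels * np.log(probs + EPS), axis=1)
    weights = np.exp(-losses)          # assumed: lower loss -> higher weight
    hard = labels.argmax(axis=1)       # class assignment of each sample
    new_labels = labels.copy()
    for c in range(n_classes):
        idx = hard == c
        if idx.any():
            w = weights[idx][:, None]
            avg = (w * probs[idx]).sum(axis=0) / w.sum()
            new_labels[idx] = avg / avg.sum()  # renormalize to a distribution
    return new_labels


def stability_coefficient(old, new, n_classes):
    """Max-min ratio of the per-class mean KL(new || old) divergence.

    A sketch of the stopping criterion: when the ratio stabilizes, the
    refinement iterations can be terminated.
    """
    hard = old.argmax(axis=1)
    per_class_kl = []
    for c in range(n_classes):
        idx = hard == c
        if idx.any():
            kl = np.sum(new[idx] * np.log((new[idx] + EPS) / (old[idx] + EPS)),
                        axis=1)
            per_class_kl.append(kl.mean())
    return max(per_class_kl) / (min(per_class_kl) + EPS)
```

In this sketch the soft labels drift from one-hot targets toward the weighted class-mean of the network's predictions, which is how inter-class similarity can leak into the supervision signal; the ratio-based coefficient then compares how unevenly the classes are still changing.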