Abstract
Recent progress in deep learning has improved fault diagnosis, but such models typically demand large labeled datasets and tend to overfit when data are limited or imbalanced. Collecting data under faulty machine conditions is difficult because artificially inducing faults can damage other components. In addition, feature extraction pipelines often produce high-dimensional, redundant representations that hinder interpretability and efficiency. To overcome these challenges, this paper introduces a sensor-fused, data-efficient framework for centrifugal pump fault diagnosis under varying pressure conditions. The proposed method uses an autoregressive (AR) observer to model normal-class signals across multiple sensors and to extract residuals indicative of faults. Statistical and spectral descriptors, such as root mean square (RMS) and band power, are then computed from these residuals. To remove irrelevant features and reduce dimensionality, an Auto-Permutation Feature Importance (Auto-PFI) mechanism is applied, yielding a compact and discriminative feature set. A Gaussian mixture model (GMM) is then trained for class-wise density estimation and fault classification. The framework is validated on datasets acquired at 3, 3.5, and 4 bar, achieving accuracies above 99% at every pressure level, and its performance is benchmarked against single-sensor setups and a state-of-the-art method. t-SNE visualizations, ROC curves, and confusion matrices corroborate the robustness and generalization of the approach. The results demonstrate that integrating AR-based residual modeling with Auto-PFI and GMM classification offers a reliable, interpretable, and data-efficient solution for centrifugal pump fault diagnosis.
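As a rough illustration of the residual-generation idea summarized above, the following sketch fits an AR model to a healthy signal by least squares and then computes RMS and band power on the prediction residuals. It is not the authors' implementation: the function names (`fit_ar`, `ar_residual`, `rms`, `band_power`), the AR order of 8, the sampling rate, the synthetic 50 Hz signal, and the impulsive "fault" are all assumptions made for demonstration, and the Auto-PFI and GMM stages are omitted.

```python
import numpy as np

def lagged_matrix(x, order):
    # Column k holds x[t-1-k] for every t in [order, len(x)).
    return np.column_stack(
        [x[order - 1 - k : len(x) - 1 - k] for k in range(order)]
    )

def fit_ar(x, order):
    # Least-squares AR coefficients a with x[t] ~ sum_k a[k] * x[t-1-k].
    a, *_ = np.linalg.lstsq(lagged_matrix(x, order), x[order:], rcond=None)
    return a

def ar_residual(x, a):
    # One-step-ahead prediction error of the fitted AR model on signal x.
    return x[len(a):] - lagged_matrix(x, len(a)) @ a

def rms(r):
    return float(np.sqrt(np.mean(r ** 2)))

def band_power(r, fs, f_lo, f_hi):
    # Sum of periodogram bins whose frequency lies in [f_lo, f_hi).
    spectrum = np.abs(np.fft.rfft(r)) ** 2 / len(r)
    freqs = np.fft.rfftfreq(len(r), d=1.0 / fs)
    return float(spectrum[(freqs >= f_lo) & (freqs < f_hi)].sum())

# Synthetic data (assumed): a 50 Hz tone in noise stands in for a healthy sensor.
rng = np.random.default_rng(0)
fs = 1000.0
t = np.arange(4000) / fs
normal_train = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)
normal_test = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)
faulty = normal_test.copy()
faulty[::100] += 2.0  # hypothetical fault: periodic impulses

coeffs = fit_ar(normal_train, order=8)       # observer built from normal data only
rms_normal = rms(ar_residual(normal_test, coeffs))
rms_fault = rms(ar_residual(faulty, coeffs))
bp_fault = band_power(ar_residual(faulty, coeffs), fs, 200.0, 500.0)

print("residual RMS (normal):", rms_normal)
print("residual RMS (faulty):", rms_fault)
print("residual band power 200-500 Hz (faulty):", bp_fault)
```

Because the observer is fitted only on healthy data, the impulses are poorly predicted and raise the residual energy. Residual features of this kind, computed per sensor and concatenated, are what a feature-selection step and a class-wise density model would then consume.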