Abstract
Achieving high-precision anomaly detection with incomplete sensor data is a critical challenge in industrial automation and intelligent manufacturing. This incompleteness often results from sensor failures, environmental interference, occlusions, or acquisition cost constraints. This study explicitly targets both types of incompleteness commonly encountered in industrial multimodal inspection: (i) incomplete sensor data within a given modality, such as partial point cloud loss or image degradation, and (ii) incomplete modalities, where one sensing channel (RGB or 3D) is entirely unavailable. By jointly addressing intra-modal incompleteness and cross-modal absence within a unified cross-distillation framework, our approach enhances anomaly detection robustness under both conditions. First, a teacher-student cross-modal distillation mechanism enables robust feature learning from both RGB and 3D modalities, allowing the student network to accurately detect anomalies even when a modality is missing during inference. Second, a dynamic voxel resolution adjustment with edge-retention strategy alleviates the computational burden of 3D point cloud processing while preserving crucial geometric features. By jointly enhancing robustness to missing modalities and improving computational efficiency, our method offers a resilient and practical solution for anomaly detection in real-world manufacturing scenarios. Extensive experiments demonstrate that the proposed method achieves both high robustness and efficiency across multiple industrial scenarios, establishing new state-of-the-art performance that surpasses existing approaches in both accuracy and speed. This method provides a robust solution for high-precision perception under complex detection conditions, significantly enhancing the feasibility of deploying anomaly detection systems in real industrial environments.