Abstract
Aero-engine ablation detection is a critical task in aircraft health management, yet existing rotation-based object detection methods often face challenges of high computational complexity and insufficient local feature extraction. This paper proposes an improved YOLOv11 algorithm incorporating Context-guided Large-kernel attention and Rotated detection head, called CLR-YOLOv11. The model achieves synergistic improvement in both detection efficiency and accuracy through dual structural optimization, with its innovations primarily embodied in the following three tightly coupled strategies: (1) Targeted Data Preprocessing Pipeline Design: To address challenges such as limited sample size, low overall image brightness, and noise interference, we designed an ordered data augmentation and normalization pipeline. This pipeline is not a mere stacking of techniques but strategically enhances sample diversity through geometric transformations (random flipping, rotation), hybrid augmentations (Mixup, Mosaic), and pixel-value transformations (histogram equalization, Gaussian filtering). All processed images subsequently undergo Z-Score normalization. This order-aware pipeline design effectively improves the quality, diversity, and consistency of the input data. (2) Context-Guided Feature Fusion Mechanism: To overcome the limitations of traditional Convolutional Neural Networks in modeling long-range contextual dependencies between ablation areas and surrounding structures, we replaced the original C3k2 layer with the C3K2CG module. This module adaptively fuses local textural details with global semantic information through a context-guided mechanism, enabling the model to more accurately understand the gradual boundaries and spatial context of ablation regions. (3) Efficiency-Oriented Large-Kernel Attention Optimization: To expand the receptive field while strictly controlling the additional computational overhead introduced by rotated detection, we replaced the C2PSA module with the C2PSLA module. By employing large-kernel decomposition and a spatial selective focusing strategy, this module significantly reduces computational load while maintaining multi-scale feature perception capability, ensuring the model meets the demands of high real-time applications. Experiments on a self-built aero-engine ablation dataset demonstrate that the improved model achieves 78.5% mAP@0.5:0.95, representing a 4.2% improvement over the YOLOv11-obb which model without the specialized data augmentation. This study provides an effective solution for high-precision real-time aviation inspection tasks.