Abstract
Electricity is a critical resource, integral to national economies and public welfare, so the safe and stable operation of high-voltage transmission lines is paramount. These lines are exposed to various hazards, including floating objects (e.g., balloons and kites), bird nests, debris, and other foreign objects that may attach to or hang from power lines or towers. Such objects pose significant safety risks because they can disrupt the normal functioning of the power system. With advances in deep learning and computer vision, object detection models such as YOLO have been widely adopted for power-grid inspection. However, conventional models often struggle to balance low-frequency background information (e.g., sky, grass) with high-frequency salient structures (e.g., power lines, towers), which degrades detection accuracy and robustness for small, low-contrast foreign objects. This paper presents an enhanced YOLOv11 detector tailored to high-voltage transmission line environments. First, we integrate a Wavelet-Transform Convolution (WTConv) block into the backbone to decompose features into multi-frequency sub-bands, apply lightweight depth-wise convolution in the wavelet domain, and reconstruct them losslessly, thereby enlarging the effective receptive field while preserving fine structural details. Second, we develop a Progressive Feature Pyramid Network (PFPN) that performs two-stage top-down/bottom-up refinement and employs adaptive spatial fusion to alleviate semantic inconsistency and cross-scale conflicts in cluttered corridor scenes. Third, we introduce an Inner-EIoU loss that focuses regression on the inner region of ground-truth boxes, improving localisation of tiny and low-contrast targets. Extensive experiments on our Transmission-Line Foreign-Object (TLFO) dataset demonstrate the effectiveness of the proposed design.
Compared with the YOLOv11 baseline, the improved detector raises mAP₀.₅ from 0.841 to 0.872 and mAP₀.₅:₀.₉₅ from 0.620 to 0.640, increases Precision from 0.918 to 0.962, reduces the parameter count from 5.97 M to 4.83 M, and boosts inference speed from 24.1 FPS to 28.5 FPS at 1280 × 768 resolution. Additional experiments on MS COCO val2017 suggest that the proposed modules are not overfitted to the power-grid domain: they yield a +1.6-point improvement in mAP₀.₅:₀.₉₅ over the YOLOv11 baseline. These results indicate that the combination of WTConv, PFPN, and Inner-EIoU provides a practical, real-time solution for foreign-object detection in power-grid inspection scenarios.
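The Inner-EIoU idea can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulation, so the code below is an assumption: it follows the commonly used Inner-IoU construction, in which both boxes are shrunk about their centres by a scale factor before the overlap is measured, and combines it with the usual EIoU centre-distance, width, and height penalties. The function names, the (x1, y1, x2, y2) box layout, and the default `ratio=0.75` are hypothetical choices for illustration, not values taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box, eps: float = 1e-9) -> float:
    """Plain intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + eps)


def scale_about_centre(box: Box, ratio: float) -> Box:
    """Auxiliary box: same centre, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw = (box[2] - box[0]) * ratio / 2
    hh = (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_eiou_loss(pred: Box, gt: Box, ratio: float = 0.75,
                    eps: float = 1e-9) -> float:
    """Sketch of an Inner-EIoU loss (assumed form, see the note above)."""
    # Overlap is measured on the shrunken auxiliary boxes (ratio < 1).
    inner = iou(scale_about_centre(pred, ratio), scale_about_centre(gt, ratio))

    # Width and height of the smallest box enclosing the original boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # EIoU penalties: centre distance, width gap, height gap.
    dx = ((pred[0] + pred[2]) - (gt[0] + gt[2])) / 2
    dy = ((pred[1] + pred[3]) - (gt[1] + gt[3])) / 2
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    w_term = ((pred[2] - pred[0]) - (gt[2] - gt[0])) ** 2 / (cw * cw + eps)
    h_term = ((pred[3] - pred[1]) - (gt[3] - gt[1])) ** 2 / (ch * ch + eps)

    return 1.0 - inner + centre_term + w_term + h_term


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 10.0, 10.0)
    shifted = (2.0, 0.0, 12.0, 10.0)
    print("plain IoU:", iou(shifted, gt_box))
    print("Inner-EIoU loss:", inner_eiou_loss(shifted, gt_box))
```

With a ratio below 1, a given offset between two equally sized boxes reduces the overlap of the auxiliary boxes more than that of the original boxes, so the loss reacts more strongly to small misalignments. This is one plausible reading of why such a term would help with tiny targets.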