Abstract
Accurate and efficient small object detection in thermal infrared images remains a critical challenge because of low contrast, limited texture, and the deployment constraints of edge platforms such as unmanned aerial vehicles (UAVs). This paper presents PWL-RTDETR, an efficient Transformer-based framework designed specifically for infrared small object detection. The proposed model incorporates a novel Partial Convolutional Reparameterization Block (PConvRep-Block), which combines Partial Convolution with structural reparameterization to support multi-branch training and single-path inference, significantly reducing computation without compromising representation quality. To enhance multi-scale feature aggregation, we introduce WTRCSPNeck, a lightweight neck architecture that integrates CNCSPELAN and WTConv modules. CNCSPELAN improves gradient flow and feature representation through structural reparameterization, while WTConv employs multi-level wavelet decomposition to expand the receptive field and capture both global context and fine-grained details. Furthermore, we adopt Layer-Adaptive Magnitude-based Pruning (LAMP) to achieve global sparsification with layer-wise adaptability, enabling further compression while maintaining model accuracy. Comprehensive evaluations on the HIT-UAV and LLVIP infrared datasets confirm that PWL-RTDETR surpasses existing state-of-the-art models in accuracy while substantially reducing parameters and FLOPs. These results highlight the model's suitability for real-time deployment in resource-constrained infrared perception scenarios.