Abstract
To address complex background interference, inadequate feature utilization, and model redundancy in multispectral crown extraction, this paper proposes Dual-YOLOv7, a dual-channel crown detection and segmentation approach based on an improved YOLOv7 architecture. First, a dual-branch feature extraction network is designed that integrates visible-light and infrared spectral information and dynamically weights key features through an attention mechanism. Second, the D-SimSPPF module is introduced, which applies depthwise separable convolution to spatial pyramid pooling, enhancing the capture of fine details while reducing the number of parameters. Third, the CIoU-C loss function is developed, which incorporates a shape penalty factor to improve the accuracy of bounding box regression. Experimental results show that the improved model achieves detection and segmentation mAP(50) scores of 91.6% and 90.1%, respectively, which are 7.7 and 7.6 percentage points higher than those of YOLOv7-seg. After channel pruning, the parameter count of the model is reduced by 14.2%, yielding a lightweight solution suitable for unmanned aerial vehicle (UAV) platforms.
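To illustrate why depthwise separable convolution reduces the parameter count, the following minimal sketch compares the number of weights in a standard k x k convolution with that of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (256 input channels, 256 output channels, 3 x 3 kernel) are assumptions chosen for illustration only; they are not taken from the D-SimSPPF module, and bias terms are omitted.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only (not from the paper).
c_in, c_out, k = 256, 256, 3

standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
reduction = 1 - separable / standard

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"reduction: {reduction:.1%}")
```

In general the ratio of separable to standard weights is 1/c_out + 1/k^2, so the saving grows with both the kernel size and the number of output channels. This per-layer figure is separate from the 14.2% whole-model reduction reported above, which is attributed to channel pruning.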