Abstract
The fusion of infrared and visible images provides complementary information from both modalities and has been widely used in surveillance, military, and other fields. However, most of the available fusion methods have only been evaluated with subjective metrics of visual quality of the fused images, which are often independent of the following relevant high-level visual tasks. Moreover, as a useful technique especially used in low-light scenarios, the effect of low-light conditions on the fusion result has not been well-addressed yet. To address these challenges, a decoupled and semantic segmentation-driven infrared and visible image fusion network is proposed in this paper, which connects both image fusion and the downstream task to drive the network to be optimized. Firstly, a cross-modality transformer fusion module is designed to learn rich hierarchical feature representations. Secondly, a semantic-driven fusion module is developed to enhance the key features of prominent targets. Thirdly, a weighted fusion strategy is adopted to automatically adjust the fusion weights of different modality features. This effectively merges the thermal characteristics from infrared images and detailed information from visible images. Additionally, we design a refined loss function that employs the decoupling network to constrain the pixel distributions in the fused images and produce more-natural fusion images. To evaluate the robustness and generalization of the proposed method in practical challenge applications, a Maritime Infrared and Visible (MIV) dataset is created and verified for maritime environmental perception, which will be made available soon. The experimental results from both widely used public datasets and the practically collected MIV dataset highlight the notable strengths of the proposed method with the best-ranking quality metrics among its counterparts. Of more importance, the fusion image achieved with the proposed method has over 96% target detection accuracy and a dominant high mAP@[50:95] value that far surpasses all the competitors.