Abstract
Image fusion is a sophisticated enhancement technique that integrates data from multiple sensors into a single, detailed, and coherent image. This integrated representation supports and simplifies subsequent processing tasks. This article presents a fusion method based on Multi-Head Attention (MA) and Residual Dense Blocks (RDB) within a Generative Adversarial Network (AIR-GANet) for infrared and visible image fusion. The framework uses RDBs to extract local features and Multi-Head Attention to capture global dependencies, ensuring a comprehensive representation of both infrared and visible modalities. A dual-discriminator mechanism refines the fused image through adversarial training, with separate discriminators preserving thermal details and texture information. The fusion module merges local and global features into a unified feature map, which is then decoded into the final fused image. Applied to input infrared and visible images, AIR-GANet produces a fused image that retains crucial information from both sources, improving overall image quality and achieving high performance across multiple quantitative evaluation metrics. Extensive evaluations on the TNO, Road Scene, and Multi-Spectral Road Scenarios (MSRS) datasets demonstrate that the model attains an average Entropy (EN) of 7.4085 and Mutual Information (MI) of 3.1531 on TNO, corresponding values of 7.7086 and 3.6953 on Road Scene, and 7.8260 and 3.8502 on MSRS, confirming its superior fusion capability across diverse scenarios.