Abstract
Multi-focus image fusion (MFIF) is an image-processing technique that generates a fully focused image by integrating source images captured at different focal planes. However, the defocus spread effect (DSE) often produces blurred or jagged focus/defocus boundaries in fused images, which degrades fusion quality. To address this issue, this paper proposes KCUNet, a novel model that embeds Kolmogorov-Arnold network layers in parallel with convolutional layers within the U-Net architecture. In the encoding stage, the model keeps the spatial dimensions of the feature maps constant to preserve high-resolution details while progressively increasing the number of channels to capture multi-level features. In addition, KCUNet incorporates a content-guided attention mechanism to enhance the processing of edge information, which is crucial for reducing DSE and preserving edges. The model is optimized with a hybrid loss function that evaluates several aspects, including edge alignment, mask prediction, and image quality. Finally, comparative evaluations against 15 state-of-the-art methods demonstrate KCUNet's superior performance in both qualitative and quantitative analyses.