A Dual-Branch Spatial Interaction and Multi-Scale Separable Aggregation Driven Hybrid Network for Infrared Image Super-Resolution

Abstract

Single image super-resolution (SISR) is a classical computer vision task that aims to reconstruct a high-resolution image from a low-resolution input, thereby improving detail sharpness and visual quality. In recent years, convolutional neural network (CNN)-based methods and transformer-based methods built on self-attention have made significant progress in visible-image super-resolution. However, applying either family of methods directly to infrared images remains challenging. On the one hand, infrared images generally suffer from low signal-to-noise ratio, blurred edges, and missing details, and local convolutions alone cannot adequately model long-range dependencies across regions. On the other hand, although pure transformer models have strong global modeling ability, they usually have many parameters and are sensitive to the amount of training data, which makes it difficult to balance efficiency and detail restoration in infrared imaging scenarios. To address these issues, we propose a hybrid neural network architecture for infrared image super-resolution, termed RDSR (Residual Dual-branch Separable Super-Resolution Network), which integrates multi-scale depthwise separable convolutions with shifted-window self-attention. Specifically, we design a dual-branch spatial interaction module (BDSI, Dual-Branch Spatial Interaction) and a multi-scale separable spatial aggregation module (MSSA, Multi-Scale Separable Spatial Aggregation). The BDSI module models correlations along rows and columns through grouped convolutions in the horizontal and vertical directions, strengthening the exchange of spatial information between the convolution branch and the self-attention branch. The MSSA module replaces the conventional MLP with three parallel depthwise separable convolution branches, improving feature representation and nonlinear modeling capacity through multi-scale spatial aggregation and a star-shaped gating operation. Experimental results on multiple public infrared image datasets show that, for ×2 and ×4 upscaling, the proposed RDSR achieves higher PSNR and SSIM values than CNN-based methods such as EDSR, RCAN, and RDN, as well as transformer-based methods such as SwinIR, DAT, and HAT, demonstrating the effectiveness of the proposed modules and the overall framework.
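
For readers unfamiliar with the PSNR metric used in the comparison above, the following is a minimal sketch of its standard definition, PSNR = 10 · log10(peak² / MSE), in plain Python. The function name `psnr`, the nested-list image representation, the default peak value of 255 (8-bit images), and the toy pixel values are assumptions made for illustration. This is not the authors' evaluation code, and the abstract does not specify details such as cropping or which channel is evaluated.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities (hypothetical
    representation chosen to keep the sketch dependency-free).
    """
    if len(reference) != len(restored) or any(
        len(r) != len(s) for r, s in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    n_pixels = sum(len(row) for row in reference)
    if n_pixels == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    squared_error = sum(
        (a - b) ** 2
        for r, s in zip(reference, restored)
        for a, b in zip(r, s)
    )
    mse = squared_error / n_pixels

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 example (made-up values, not data from the paper).
ground_truth = [[52, 55], [61, 59]]
reconstruction = [[54, 55], [60, 59]]
print(f"PSNR: {psnr(ground_truth, reconstruction):.2f} dB")
```

A higher PSNR means a smaller pixel-wise error relative to the ground-truth high-resolution image. SSIM, the second metric reported, instead compares local luminance, contrast, and structure, so the two are usually reported together.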
