Abstract
Enhancing low-light images is crucial in computer vision applications. Most existing learning-based models struggle to balance light enhancement and color correction, while images typically contain different types of information at different scales. Hence, we propose a multi-scale interactive network with color attention, named MSINet, to effectively explore these different types of information for low-light image enhancement (LLIE) tasks. Specifically, MSINet first employs a CNN-based branch built upon stacked residual channel attention blocks (RCABs) to fully exploit local image features. Meanwhile, a Transformer-based branch, constructed from Transformer blocks containing cross-scale attention (CSA) and multi-head self-attention (MHSA), mines global features. Notably, the local and global features extracted by each RCAB and Transformer block interact through a fusion module. Additionally, a color correction branch (CCB) based on self-attention (SA) learns the color distribution of the low-light input, further guaranteeing the color fidelity of the final output. Extensive experiments demonstrate that our proposed MSINet outperforms state-of-the-art LLIE methods in both light enhancement and color correction.
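To make the dual-branch interaction concrete, the following is a minimal, hypothetical PyTorch sketch of the pattern described above: a local CNN block (RCAB-style), a global attention block, and a per-stage fusion of their outputs. All module names, channel sizes, and the 1x1-conv fusion scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RCAB(nn.Module):
    """Toy residual channel attention block (local, CNN-based branch)."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        # Channel attention: global average pool, then a sigmoid gate per channel.
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        return x + y * self.ca(y)  # residual connection with channel re-weighting

class GlobalBlock(nn.Module):
    """Toy Transformer-style block: multi-head self-attention over pixel tokens."""
    def __init__(self, ch, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.norm = nn.LayerNorm(ch)

    def forward(self, x):
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)  # (B, H*W, C) token view
        n = self.norm(t)
        t = t + self.attn(n, n, n)[0]     # residual self-attention
        return t.transpose(1, 2).reshape(b, c, h, w)

class FusionStage(nn.Module):
    """One stage: local and global features interact via a 1x1-conv fusion."""
    def __init__(self, ch):
        super().__init__()
        self.local = RCAB(ch)
        self.glob = GlobalBlock(ch)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([self.local(x), self.glob(x)], dim=1))

x = torch.randn(1, 16, 32, 32)  # toy low-light feature map
out = FusionStage(16)(x)
print(out.shape)                # torch.Size([1, 16, 32, 32])
```

In the actual network, one such interaction would occur at every RCAB/Transformer-block pair and at multiple scales; this sketch shows only a single stage at a fixed resolution.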