Abstract
Image classification and segmentation are important tasks in computer vision. ResNet and U-Net are representative networks for image classification and image segmentation, respectively. Although many scholars used to fuse these two networks, most integration focuses on image segmentation with U-Net, overlooking the capabilities of ResNet for image classification. In this paper, we propose a novel U-ResNet structure by combining U-Net's convolution-deconvolution structure (UBlock) with ResNet's residual structure (ResBlock) in a parallel manner. This novel parallel structure achieves rapid convergence and high accuracy in image classification and segmentation while also efficiently alleviating the vanishing gradient problem. Specifically, in the UBlock, the pixel-level features of both high- and low-resolution images are extracted and processed. In the ResBlock, a Selected Upsampling (SU) module was introduced to enhance performance on low-resolution datasets, and an improved Efficient Upsampling Convolutional Block (EUCB*) with a Channel Shuffle mechanism was added before the output of the ResBlock to enhance network convergence. Features from both the ResBlock and UBlock were merged for better decision making. This architecture outperformed the state-of-the-art (SOTA) models in both image classification and segmentation tasks on open-source and private datasets. Functions of individual modules were further verified via ablation studies. The superiority of the proposed U-ResNet displays strong feasibility and potential for advanced cross-paradigm tasks in computer vision.