Abstract
As deep learning technologies are increasingly adopted across industries, data scarcity has become a key factor restricting their widespread application and further development. Existing image classification models typically use Generative Adversarial Networks (GANs) to expand the amount of training data. However, GANs focus solely on generating spatial-domain features, overlooking the complementary role of frequency-domain information in image representation. In addition, these models assign the same loss weight to real and generated samples, failing to reflect the differing contributions of these samples during training. To address these issues, this paper proposes a fully supervised image classification model (D2S-DiffGAN) for settings with limited labeled samples. First, a dual-domain synchronous GAN (DDSGAN) is constructed that constrains the generator in both the spatial and frequency domains, ensuring that generated samples satisfy both the visual realism of RGB images and the consistency of the frequency-domain energy distribution, which yields greater diversity and realism. Second, a multi-branch feature extraction network (MBFE) is designed to capture local texture features, global semantic features, and cross-channel correlation features of the samples; an attention module is introduced to dynamically fuse these multi-dimensional features, further enhancing the task-relevant channel representations. Finally, a differentiated loss function (DIFF) is proposed that assigns different loss weights to generated and real samples based on their characteristics, thereby handling the differences between them more reasonably and optimizing the training process. Extensive experiments on the SVHN and CIFAR-10 datasets show that the proposed model achieves good classification accuracy even with limited labeled samples, validating its effectiveness.
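The abstract does not give the DDSGAN loss formulas, but the idea of constraining the generator in both domains can be sketched minimally. The following is an illustrative NumPy sketch, not the paper's implementation: it assumes the frequency-domain constraint matches FFT magnitude spectra (energy distribution) while a placeholder pixel-space term stands in for the spatial/adversarial objective; the function names and the weight `lam` are hypothetical.

```python
import numpy as np

def freq_consistency_loss(real, fake):
    """Hypothetical frequency-domain term: match the FFT magnitude
    (energy distribution) of real and generated image batches."""
    real_mag = np.abs(np.fft.fft2(real, axes=(-2, -1)))
    fake_mag = np.abs(np.fft.fft2(fake, axes=(-2, -1)))
    return np.mean(np.abs(real_mag - fake_mag))

def spatial_loss(real, fake):
    """Placeholder spatial-domain term (a real GAN would use an
    adversarial loss from a discriminator instead)."""
    return np.mean(np.abs(real - fake))

def dual_domain_loss(real, fake, lam=0.5):
    """Combine both constraints; `lam` balances the two domains."""
    return spatial_loss(real, fake) + lam * freq_consistency_loss(real, fake)
```

Both terms vanish only when the generated batch matches the real batch in pixel space and in spectral energy, which is the "synchronous" constraint the abstract describes.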
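The differentiated loss (DIFF) is likewise only described at a high level here. A minimal sketch of the weighting idea, assuming a per-sample cross-entropy base loss and illustrative weights `w_real` and `w_gen` (both hypothetical, not values from the paper):

```python
import numpy as np

def cross_entropy(probs, labels):
    """Per-sample cross-entropy from predicted class probabilities."""
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12)

def differentiated_loss(probs, labels, is_generated, w_real=1.0, w_gen=0.5):
    """Weight real and generated samples differently, reflecting their
    differing contributions to training (weights are illustrative)."""
    per_sample = cross_entropy(probs, labels)
    weights = np.where(is_generated, w_gen, w_real)
    return np.mean(weights * per_sample)
```

Down-weighting generated samples (here `w_gen < w_real`) keeps imperfect synthetic data from dominating the gradient while still letting it contribute.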