Addressing Class Imbalance with Latent Diffusion-based Data Augmentation for Improving Disease Classification in Pediatric Chest X-rays



Abstract

Deep learning (DL) has transformed medical image classification; however, its efficacy is often limited by significant data imbalance, with far fewer cases (minority class) than controls (majority class). Synthetic image augmentation techniques have been shown to simulate clinical variability, leading to enhanced model performance. We hypothesize that they could also mitigate data imbalance, thereby reducing overfitting to the majority class and improving generalization. Recently, latent diffusion models (LDMs) have shown promise in synthesizing high-quality medical images. This study evaluates the effectiveness of a text-guided image-to-image LDM in synthesizing disease-positive chest X-rays (CXRs) and augmenting a pediatric CXR dataset to improve classification performance. We first establish baseline performance by fine-tuning an ImageNet-pretrained Inception-V3 model on class-imbalanced data for two tasks: normal vs. pneumonia and normal vs. bronchopneumonia. Next, we fine-tune individual text-guided image-to-image LDMs to generate CXRs showing signs of pneumonia and bronchopneumonia. The Inception-V3 model is then retrained on an updated dataset that includes these synthesized images in the augmented training and validation sets. Classification performance is compared against the baseline using balanced accuracy, sensitivity, specificity, F-score, Matthews correlation coefficient (MCC), Kappa, and Youden's index. Results show that the augmentation significantly improves Youden's index (p < 0.05) and markedly enhances the other metrics, indicating that data augmentation using LDM-synthesized images is an effective strategy for addressing class imbalance in medical image classification.
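
The evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: it assumes that "F-score" means the F1 score and that "Kappa" means Cohen's kappa, and the function name `binary_metrics` and the counts in the usage example are hypothetical, chosen only for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute imbalance-aware metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes are present
    and the classifier predicts each class at least once.
    """
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)

    # Balanced accuracy averages the per-class recalls, so the majority
    # class cannot dominate the score.
    balanced_accuracy = (sensitivity + specificity) / 2

    # F1 (assumed meaning of "F-score"): harmonic mean of precision and recall.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    # Cohen's kappa (assumed meaning of "Kappa"): observed agreement
    # corrected for the agreement expected by chance.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    # Youden's index J = sensitivity + specificity - 1.
    youden = sensitivity + specificity - 1

    return {
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
        "youden": youden,
    }


# Hypothetical counts for an imbalanced test set (50 positives, 100 negatives).
metrics = binary_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity and specificity are each computed within a single class, balanced accuracy and Youden's index do not reward a model for simply predicting the majority class, which is why such metrics suit class-imbalanced evaluation.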
