Synthetic Breast Ultrasound Images: A Study to Overcome Medical Data Sharing Barriers

合成乳腺超声图像:一项旨在克服医疗数据共享障碍的研究

阅读:1

Abstract

The vast potential of medical big data to enhance healthcare outcomes remains underutilized due to privacy concerns, which restrict cross-center data sharing and the construction of diverse, large-scale datasets. To address this challenge, we developed a deep generative model aimed at synthesizing medical data to overcome data sharing barriers, with a focus on breast ultrasound (US) image synthesis. Specifically, we introduce CoLDiT, a conditional latent diffusion model with a transformer backbone, to generate US images of breast lesions across various Breast Imaging Reporting and Data System (BI-RADS) categories. Using a training dataset of 9,705 US images from 5,243 patients across 202 hospitals with diverse US systems, CoLDiT generated breast US images without duplicating private information, as confirmed through nearest-neighbor analysis. Blinded reader studies further validated the realism of these images, with area under the receiver operating characteristic curve (AUC) scores ranging from 0.53 to 0.77. Additionally, synthetic breast US images effectively augmented the training dataset for BI-RADS classification, achieving performance comparable to that using an equal-sized training set comprising solely real images (P = 0.81 for AUC). Our findings suggest that synthetic data, such as CoLDiT-generated images, offer a viable, privacy-preserving solution to facilitate secure medical data sharing and advance the utilization of medical big data.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。