Towards a cardiovascular magnetic resonance foundation model for multi-task cardiac image analysis

Abstract

BACKGROUND: Cardiovascular magnetic resonance (CMR) is a complex imaging modality that requires a broad variety of image processing tasks for comprehensive assessment of a study. Recently, foundation models (FM) have shown promise for automated image analysis in natural images (NI). In this study, a CMR-specific vision FM was developed and then finetuned in a supervised manner for nine different imaging tasks typical of a CMR workflow, including classification, segmentation, landmark localization, and pathology detection.

METHODS: A ViT-S/8 model was trained in a self-supervised manner using DINO on 36 million CMR images from 27,524 subjects from three sources (UK Biobank and two clinical centers). The model was then finetuned for nine tasks: classification (sequence, cine view), segmentation (cine SAX, cine LAX, LGE SAX, Mapping SAX), landmark localization, and pathology detection (LGE, cardiac disease), on data from various sources (both public and three clinical datasets). The results were compared against metrics from state-of-the-art methods on the same tasks. A comparable baseline model was also trained on the same datasets for direct comparison. Additionally, the effect of pretraining strategy, as well as generalization and few-shot performance (training on few labeled samples), was explored for the pretrained model and compared to the baseline.

RESULTS: The proposed model obtained performance similar to, or moderately better than, results reported in the literature in most tasks (except disease detection), without any task-specific optimization of methodology. The proposed model outperformed the baseline in most cases, with an average increase of 6.8 percentage points (pp) for cine view classification, and 0.1 to 1.8 pp for segmentation tasks. The proposed method also obtained generally lower standard deviations in the metrics. Improvements of 3.7 and 6.6 pp for hyperenhancement detection from LGE and 14 pp for disease detection were observed. Ablation studies highlighted the importance of pretraining strategy, architecture, and the impact of domain shifts from pretraining to finetuning. Moreover, the CMR-pretrained model achieved better generalization and few-shot performance compared to the baseline.

CONCLUSIONS: Vision FMs specialized for medical imaging can improve accuracy and robustness over natural-image FMs (NI-FM). Self-supervised pretraining offers a resource-efficient, unified framework for CMR assessment, with the potential to accelerate the development of deep learning-based solutions for image analysis tasks, even when few annotated data are available.
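As an aside on the ViT-S/8 notation used in the Methods: the "/8" denotes a patch size of 8 × 8 pixels, so each image is split into non-overlapping 8 × 8 patches that become the transformer's input tokens. The minimal sketch below counts those tokens for a given image size. The function name `vit_token_count` and the 224 × 224 input size are illustrative assumptions; the abstract does not state the input resolution used in the study.

```python
def vit_token_count(height: int, width: int, patch_size: int = 8,
                    cls_token: bool = True) -> int:
    """Number of input tokens a ViT sees for one image.

    Hypothetical helper for illustration only; the 8-pixel default
    mirrors the "/8" in ViT-S/8.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by the patch size")
    # One token per non-overlapping patch.
    n_patches = (height // patch_size) * (width // patch_size)
    # Optionally add the single [CLS] token prepended to the sequence.
    return n_patches + (1 if cls_token else 0)


# Assumed 224 x 224 input (not stated in the abstract).
print(vit_token_count(224, 224))                  # patch size 8
print(vit_token_count(224, 224, patch_size=16))   # patch size 16, for comparison
```

Under these assumptions, a patch size of 8 yields roughly four times as many tokens as a patch size of 16 for the same image. That gives a finer spatial grid, which plausibly helps dense tasks such as segmentation and landmark localization, at a higher compute cost.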
