Leveraging a Vision-Language Model with Natural Text Supervision for MRI Retrieval, Captioning, Classification, and Visual Question Answering

Abstract

Large multimodal models are now extensively used worldwide, with the most powerful ones trained on massive, general-purpose datasets. Despite their rapid deployment, concerns persist regarding the quality and domain relevance of the training data, especially in radiology, medical research, and neuroscience. Additionally, healthcare data privacy is paramount when querying models trained on medical data, as is transparency regarding service hosting and data storage. So far, most deep learning algorithms in radiologic research are designed to perform a single specific task (e.g., diagnostic classification) and cannot be prompted to perform multiple tasks using natural language. In this work, we introduce a framework based on vector retrieval and contrastive learning to efficiently learn visual brain MRI concepts via natural language supervision. We show how the method learns to identify factors that affect the brain in Alzheimer's disease (AD) via joint embedding and natural language supervision. We first pre-train separate text and image encoders using self-supervised learning, and then jointly fine-tune these encoders to develop a shared embedding space. We train our model to perform multiple tasks, including MRI retrieval, MRI captioning, and MRI classification. We show its versatility by developing a retrieval and re-ranking mechanism along with a transformer decoder for visual question answering.

CLINICAL RELEVANCE: By learning a cross-modal embedding of radiologic features and text, our approach can learn to perform diagnostic and prognostic assessments in AD research as well as to assist practicing clinicians. By integrating medical imaging with clinical descriptions and text prompts, we aim to provide a general, versatile tool for detecting radiologic features described by text, offering a new approach to radiologic research.
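The abstract names contrastive learning over a shared embedding space and vector retrieval, but gives no implementation details. As a rough illustration only, the sketch below assumes a CLIP-style symmetric contrastive (InfoNCE) objective over paired image and text embeddings, followed by cosine-similarity retrieval. The function names, the temperature of 0.07, and the embedding shapes are all hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Assumed CLIP-style loss: row i of image_emb is paired with row i of text_emb.

    temperature=0.07 is an illustrative default, not a value from the paper.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    # Image-to-text: each row should put its mass on the diagonal entry.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each column should put its mass on the diagonal entry.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


def retrieve_top_k(query_emb, gallery_emb, k=3):
    """Return indices and cosine similarities of the k gallery vectors closest to the query."""
    sims = l2_normalize(gallery_emb) @ l2_normalize(query_emb)
    order = np.argsort(-sims)[:k]
    return order, sims[order]
```

In a setup like the one the abstract describes, `query_emb` could be the embedding of a text prompt and `gallery_emb` the embeddings of stored MRI scans; a re-ranking stage would then reorder the returned candidates with a more expensive scorer, which this sketch does not attempt to model.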
