On the transfer learning behavior of domain-specific vision-language models in screening mammography


Abstract

Vision-language models (VLMs) have shown remarkable performance on natural images and text. However, given the homologous anatomy across patients, the high dimensionality of gray-scale images, and heavily imbalanced datasets, conventional VLMs do not adapt well to radiological applications. In this work, we empirically adapt image encoders trained within domain-specific VLMs to two downstream tasks in 2D mammogram analysis: tissue density estimation and BI-RADS prediction. We study their transfer learning behavior under linear probing, fine-tuning, and online self-distillation. We find that knowledge-driven domain-specific VLM backbones with frozen weights outperform the MammoClip VLM as well as supervised baselines such as ViTs and CNNs, even with only 5% of the training data. We further study the generalization of these models on two external datasets.
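The linear probing setup mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the tiny MLP stands in for a domain-specific VLM image encoder, and the 4-way head is a hypothetical density classifier (e.g. BI-RADS density categories a-d); all names and sizes here are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained VLM image encoder backbone;
# a tiny MLP keeps the sketch self-contained and fast.
backbone = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))

# Linear probing: freeze every backbone weight so only a newly added
# linear head is trained on the downstream task.
for p in backbone.parameters():
    p.requires_grad = False

head = nn.Linear(16, 4)  # 4 classes, e.g. tissue density categories
model = nn.Sequential(backbone, head)

# Pass only the trainable parameters (the head's weight and bias)
# to the optimizer; the backbone stays fixed throughout training.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)

x = torch.randn(8, 64)   # a batch of 8 precomputed feature vectors
logits = model(x)        # shape (8, 4)
```

Full fine-tuning would instead leave `requires_grad=True` on the backbone, updating all weights; the abstract's finding is that the frozen-backbone variant transfers better in this domain.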
