Disease-specific variant pathogenicity prediction using multimodal biomedical language models

Abstract

Missense variants play a key role in the diagnosis of genetic disorders and in disease risk prediction. Existing methods focus primarily on predicting variant deleteriousness without taking the disease-specific context into account, which limits their utility in real-world diagnosis and decision making. Here, we introduce disease-specific variant pathogenicity prediction (DIVA), a novel deep learning framework that directly predicts specific disease types alongside the probability of deleteriousness for missense variants. Our approach integrates information from two modalities, protein sequence and disease-related textual annotations, encoded using two pre-trained language models and optimized within a contrastive learning paradigm designed to align variants with relevant diseases in the learned representation space. Our results demonstrate that DIVA outperforms baselines and provides accurate disease predictions with high relevance to clinically curated disease annotations for missense variants. Variant deleteriousness prediction is enhanced by incorporating AlphaMissense scores through learnable weights derived from protein function annotations, which additionally boosts DIVA's ability to accurately classify deleterious variants. Our work provides new insights into variant pathogenicity prediction with awareness of disease specificity, addressing a hitherto unmet need in clinical variant interpretation.
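
The two mechanisms named in the abstract, contrastive alignment of variant and disease representations and a learnable weighting of an external deleteriousness score, can be illustrated with a minimal NumPy sketch. This is not DIVA's implementation: the abstract does not specify the loss, the encoders, or how the weights are derived from protein function annotations. The symmetric InfoNCE loss, the sigmoid gate, the temperature of 0.07, and all function names below are assumptions chosen for illustration only.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_alignment_loss(variant_emb, disease_emb, temperature=0.07):
    # Assumed symmetric InfoNCE loss: row i of variant_emb is paired with
    # row i of disease_emb; all other rows in the batch act as negatives.
    v = l2_normalize(variant_emb)
    d = l2_normalize(disease_emb)
    logits = v @ d.T / temperature
    idx = np.arange(len(v))
    loss_v2d = -log_softmax(logits, axis=1)[idx, idx].mean()  # variant -> disease
    loss_d2v = -log_softmax(logits, axis=0)[idx, idx].mean()  # disease -> variant
    return 0.5 * (loss_v2d + loss_d2v)

def rank_diseases(variant_vec, disease_emb):
    # Return disease indices ordered from most to least similar to one variant.
    v = l2_normalize(variant_vec[None, :])[0]
    sims = l2_normalize(disease_emb) @ v
    return np.argsort(-sims)

def blend_scores(model_prob, external_score, gate_logit):
    # Hypothetical gate: a sigmoid weight mixes the model's own probability
    # with an external score (for example an AlphaMissense score in [0, 1]).
    w = 1.0 / (1.0 + np.exp(-gate_logit))
    return w * model_prob + (1.0 - w) * external_score

# Toy data: four variants whose embeddings match four disease embeddings.
aligned = contrastive_alignment_loss(np.eye(4), np.eye(4))
shuffled = contrastive_alignment_loss(np.eye(4), np.eye(4)[[1, 2, 3, 0]])
```

On this toy data the loss is lower when each variant embedding coincides with its paired disease embedding than when the pairing is shuffled, which is the behavior a contrastive objective rewards. In a real batch several variants may share a disease, so treating every off-diagonal pair as a negative is a simplification.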
