Proformer: a multimodal proteomics transformer model for multidisease early risk assessment

Abstract

Early identification of individuals at high risk for chronic diseases is crucial for prevention and intervention, yet current risk assessment tools are disease-specific, require extensive clinical data collection, and cannot provide multidisease risk profiles from a single measurement. Several protein large language models have been developed for tasks such as protein structure prediction, function prediction, and sequence design. However, none of these models can be applied directly in clinical settings to predict an individual's future disease risk. Here, we present a multimodal proteomics Transformer (Proformer) model that integrates protein expression, sequence, and function information for multidisease risk assessment. We trained Proformer on real proteomics data from 47,124 UK Biobank participants and evaluated how well it discriminates the risk of 20 common chronic diseases. Proformer achieved state-of-the-art (SOTA) performance for all 20 diseases compared with five common machine learning and deep learning models. Against three common clinical predictors, Proformer's 10-year discriminative performance exceeded that of the Age + Sex model for 19 diseases, the ASCVD risk score for 16 diseases, and a panel of 35 clinical variables for 11 diseases. These results were replicated in the Scotland and Wales cohort from UK Biobank. In conclusion, Proformer enables users to obtain a 10-year risk report for common chronic diseases directly from their individual proteomics data.
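The per-disease comparisons above ("outperforms for 19 diseases", and so on) can be illustrated with a minimal sketch. It assumes discrimination is measured by ROC AUC, which the abstract does not state, and that each model yields one risk score per person per disease. The function names `roc_auc` and `count_wins` and the toy numbers are hypothetical and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC (Mann-Whitney U); tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one non-case")
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def count_wins(labels_by_disease, scores_a, scores_b):
    """Count diseases where model A has a strictly higher AUC than model B."""
    wins = 0
    for disease, labels in labels_by_disease.items():
        if roc_auc(labels, scores_a[disease]) > roc_auc(labels, scores_b[disease]):
            wins += 1
    return wins


# Toy data (hypothetical): 1 = developed the disease within 10 years.
labels_by_disease = {
    "disease_1": [0, 0, 1, 1],
    "disease_2": [0, 1, 0, 1],
}
model_a = {"disease_1": [0.1, 0.2, 0.7, 0.9], "disease_2": [0.2, 0.8, 0.3, 0.6]}
model_b = {"disease_1": [0.1, 0.4, 0.35, 0.8], "disease_2": [0.2, 0.8, 0.3, 0.6]}

print(count_wins(labels_by_disease, model_a, model_b))
```

A simple win count like this ignores sampling uncertainty; a real comparison would also report confidence intervals or a significance test for each AUC difference.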
