cpiVAE: Robust and Interpretable Cross-Platform Proteomics Imputation

Abstract

Large-scale plasma proteomic studies often use different high-throughput affinity platforms, and measurements of the same protein are frequently discordant across platforms. This discordance hinders cross-study integration. Better integration of proteomics data would enable more powerful meta-analyses, increase statistical power for biomarker discovery, and improve understanding of proteome-phenotype relationships. Here we present the cross-platform proteomics imputation variational autoencoder (cpiVAE), a deep generative model for bidirectional imputation of protein abundances between two widely used platforms, Olink and SomaScan. Trained on a cohort of paired measurements from the China Kadoorie Biobank (CKB), cpiVAE learns a joint latent representation that enables cross-platform imputation. cpiVAE outperforms two established methods, k-nearest neighbors (KNN) and Weighted Nearest Neighbors (WNN, from Seurat v4), achieving up to 30% higher correlation between imputed and true values. cpiVAE also generalizes well to an independent cohort from the Atherosclerosis Risk in Communities Study (ARIC): without retraining, it maintains high performance relative to these benchmarks. Associations of imputed protein levels with clinical phenotypes closely mirror those obtained with the actual measurements, and imputation increases power in a meta-analysis scenario. A post-hoc feature importance matrix enables interpretation of the model. Protein pair features extracted from cpiVAE overlap significantly with known associations in the Search Tool for the Retrieval of Interacting Genes (STRING) database. In summary, cpiVAE offers an accurate, generalizable, and interpretable solution for cross-platform proteomic imputation, enabling integrated analyses across studies with proteomics measured on different platforms.
This user-friendly framework and the pre-trained model weights are available under a BSD 2-Clause open-source license at https://github.com/joelbaderlab/cpiVAE_v1.
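To make the benchmark setup concrete, the following is a minimal sketch of a KNN-style cross-platform baseline and the per-protein correlation used to score it. It is not the authors' implementation and does not use the cpiVAE code base: the function names (`knn_impute`, `per_protein_correlation`, `simulate_paired`), the choice of Euclidean distance, `k=5`, and the simulated data are all assumptions made for illustration. The toy data stand in for paired measurements, with "platform A" and "platform B" features driven by shared latent factors.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def knn_impute(train_a, train_b, query_a, k=5):
    """Impute platform-B profiles for samples measured only on platform A.

    For each query sample, find the k paired training samples that are
    closest in platform-A space (Euclidean distance, an assumption) and
    average their platform-B profiles.
    """
    n_b = len(train_b[0])
    imputed = []
    for q in query_a:
        ranked = sorted((math.dist(q, a), i) for i, a in enumerate(train_a))
        neighbors = [i for _, i in ranked[:k]]
        imputed.append(
            [sum(train_b[i][j] for i in neighbors) / len(neighbors)
             for j in range(n_b)]
        )
    return imputed


def per_protein_correlation(imputed, truth):
    """Correlation between imputed and measured values, one value per protein."""
    n_proteins = len(truth[0])
    return [
        pearson([row[j] for row in imputed], [row[j] for row in truth])
        for j in range(n_proteins)
    ]


def simulate_paired(n, rng):
    """Toy paired data: two latent factors drive 3 A-features and 2 B-features."""
    a, b = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a.append([z1 + rng.gauss(0, 0.1),
                  z2 + rng.gauss(0, 0.1),
                  z1 - z2 + rng.gauss(0, 0.1)])
        b.append([z1 + rng.gauss(0, 0.1),
                  z2 + rng.gauss(0, 0.1)])
    return a, b


rng = random.Random(0)
train_a, train_b = simulate_paired(300, rng)   # paired training cohort
test_a, test_b = simulate_paired(100, rng)     # held-out samples

imputed_b = knn_impute(train_a, train_b, test_a, k=5)
correlations = per_protein_correlation(imputed_b, test_b)
print([round(r, 2) for r in correlations])
```

Swapping the arguments (`train_b`, `train_a`, and a platform-B query) gives the reverse direction, which is the sense in which such an evaluation is bidirectional. A model like cpiVAE would replace `knn_impute` while the scoring step stays the same.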
