Gene set proximity analysis: expanding gene set enrichment analysis through learned geometric embeddings, with drug-repurposing applications in COVID-19

Abstract

MOTIVATION: Gene set analysis methods rely on knowledge-based representations of genetic interactions in the form of both gene set collections and protein-protein interaction (PPI) networks. However, explicit representations of genetic interactions often fail to capture complex interdependencies among genes, limiting the analytic power of such methods.

RESULTS: We propose an extension of gene set enrichment analysis to a latent embedding space reflecting PPI network topology, called gene set proximity analysis (GSPA). Compared with existing methods, GSPA provides improved ability to identify disease-associated pathways in disease-matched gene expression datasets, while improving reproducibility of enrichment statistics for similar gene sets. GSPA is statistically straightforward, reducing to a version of traditional gene set enrichment analysis through a single user-defined parameter. We apply our method to identify novel drug associations with SARS-CoV-2 viral entry. Finally, we validate our drug association predictions through retrospective clinical analysis of claims data from 8 million patients, supporting a role for gabapentin as a risk factor and metformin as a protective factor for severe COVID-19.

AVAILABILITY AND IMPLEMENTATION: GSPA is available for download as a command-line Python package at https://github.com/henrycousins/gspa.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
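The abstract states that GSPA extends enrichment analysis into an embedding space and reduces to a version of traditional gene set enrichment analysis through a single user-defined parameter. The sketch below illustrates one way such a reduction could work; it is not the authors' implementation and does not use the GSPA package's API. It assumes, purely for illustration, that the parameter is a Euclidean radius in embedding space, that a gene set is expanded to include every gene within that radius of a member, and that enrichment is scored with an unweighted running-sum statistic. The names `expand_gene_set`, `enrichment_score`, `proximity_enrichment`, and `radius` are hypothetical.

```python
import numpy as np


def expand_gene_set(gene_set, embeddings, radius):
    """Return gene_set plus every embedded gene lying within `radius`
    (Euclidean distance, an assumption) of any member's embedding."""
    genes = list(embeddings)
    members = [g for g in gene_set if g in embeddings]
    expanded = set(gene_set)
    if not members:
        return expanded
    all_vecs = np.array([embeddings[g] for g in genes], dtype=float)
    member_vecs = np.array([embeddings[g] for g in members], dtype=float)
    # Pairwise distances, shape (n_genes, n_members).
    diffs = all_vecs[:, None, :] - member_vecs[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    near = dists.min(axis=1) <= radius
    expanded.update(g for g, keep in zip(genes, near) if keep)
    return expanded


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list:
    step up at each set member, step down otherwise, and return the
    running-sum value with the largest absolute deviation from zero."""
    in_set = np.array([g in gene_set for g in ranked_genes])
    n_hit = int(in_set.sum())
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    steps = np.where(in_set, 1.0 / n_hit, -1.0 / n_miss)
    running = np.cumsum(steps)
    return float(running[np.argmax(np.abs(running))])


def proximity_enrichment(ranked_genes, gene_set, embeddings, radius):
    """Score the radius-expanded gene set. With radius = 0 and distinct
    embeddings, the expanded set equals the original set, so this is
    the same as calling enrichment_score directly."""
    expanded = expand_gene_set(gene_set, embeddings, radius)
    return enrichment_score(ranked_genes, expanded)


# Toy data (made up for illustration only, not from the paper).
toy_embeddings = {
    "A": [0.0, 0.0],
    "B": [0.1, 0.0],
    "C": [5.0, 5.0],
    "D": [5.0, 5.1],
    "E": [10.0, 0.0],
}
toy_ranking = ["A", "B", "E", "C", "D"]
toy_score = proximity_enrichment(toy_ranking, {"A"}, toy_embeddings, radius=0.2)
```

In this toy setup, setting `radius=0.0` leaves the gene set `{"A"}` unchanged, while `radius=0.2` also pulls in the nearby gene `"B"`. A real analysis would use embeddings learned from a PPI network and would assess significance by permutation, both of which are omitted here.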
