Integration of biological data via NMF for identification of human disease-associated gene modules through multi-label classification

Abstract

Proteins associated with multiple diseases often interact, forming disease modules that are critical for understanding disease mechanisms. This study integrates two biological sources of information, protein-protein interactions (PPIs) and Gene Ontology data, using non-negative matrix factorization (NMF) to identify gene modules associated with human diseases and to find connections between novel genes and diseases. The data sources are first converted into networks, which are then clustered to obtain modules. The two types of modules are then integrated through an NMF-based technique to obtain a set of meta-modules that preserve the essential characteristics of the interaction patterns and the functional similarity among the proteins/genes. Each meta-module is labeled based on its statistical and biological properties, and a multi-label classification technique is employed to assign new disease labels to genes. We identified 3,131 gene-disease associations, validated through literature review, Gene Ontology analysis, and pathway analysis.
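The integration step can be illustrated with a minimal sketch. The abstract does not specify how the two module sets are encoded or which NMF variant is used, so everything below is an assumption: the toy gene-by-module membership matrices, the horizontal concatenation, the rank `k`, and the choice of Lee-Seung multiplicative updates are all hypothetical and chosen only to show the general idea of factorizing two module sets into shared meta-modules.

```python
import numpy as np

def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Factorize non-negative X (genes x modules) as W @ H using
    Lee-Seung multiplicative updates for the Frobenius objective.
    This is a generic NMF, not necessarily the variant used in the paper."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps  # gene-to-meta-module loadings
    H = rng.random((k, m)) + eps  # meta-module-to-source-module weights
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: 6 genes, 2 PPI-derived modules, 2 GO-derived modules.
# Entry (i, j) is 1 if gene i belongs to module j.
ppi_modules = np.array([[1, 0], [1, 0], [1, 0],
                        [0, 1], [0, 1], [0, 1]], dtype=float)
go_modules = np.array([[1, 0], [1, 0], [0, 1],
                       [0, 1], [0, 1], [0, 1]], dtype=float)

# Assumption: the two sources are integrated by stacking their membership
# matrices side by side before factorization.
X = np.hstack([ppi_modules, go_modules])

W, H = nmf(X, k=2)

# Assign each gene to the meta-module with the largest loading.
meta_module = W.argmax(axis=1)
print("reconstruction error:", np.linalg.norm(X - W @ H))
print("meta-module per gene:", meta_module)
```

In this sketch each row of `W` describes how strongly a gene loads on each meta-module, so thresholding `W` instead of taking the `argmax` would let a gene belong to several meta-modules at once.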
