Automatic extraction, prioritization and analysis of gut microbial metabolites from biomedical literature



Abstract

Many diseases are driven by gene-environment interactions. One important environmental factor is the metabolic output of the human gut microbiota. A comprehensive catalog of human metabolites originating from microbes is critical for data-driven approaches to understanding how microbial metabolism contributes to human health and disease. Here we present a novel integrated approach to automatically extract and analyze microbial metabolites from 28 million published biomedical records. First, we classified 28,851,232 MEDLINE records as related to microbial metabolism or not. Second, candidate microbial metabolites were extracted from the records classified as related. Third, we developed signal prioritization algorithms to further differentiate microbial metabolites from metabolites originating from other sources. Finally, we systematically analyzed the interactions between the extracted microbial metabolites and human genes. A total of 11,846 metabolites were extracted from the 28 million MEDLINE records. Combining text classification with signal prioritization significantly enriched true positives among the top-ranked candidates: manual curation of the top 100 metabolites showed a precision of 0.55, a 38.3-fold enrichment over the precision of 0.014 for baseline extraction. More importantly, 29% of the extracted microbial metabolites have not been captured by existing databases. We also performed a data-driven analysis of the interactions between the extracted microbial metabolites and human genetics. This study represents the first effort towards automatically extracting and prioritizing microbial metabolites from published biomedical literature. It can set a foundation for future work on extracting microbial metabolite relationships from the literature and facilitate data-driven studies of how microbial metabolism contributes to human diseases.
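The precision and fold-enrichment figures above follow the usual definitions: precision at k is the fraction of the top k ranked candidates confirmed as true positives, and fold enrichment is the ratio of that precision to the baseline precision. A minimal sketch in Python, assuming hypothetical function names and placeholder metabolite labels (none of these come from the study itself):

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


def fold_enrichment(precision, baseline_precision):
    """Ratio of a prioritized precision to the baseline precision."""
    if baseline_precision <= 0:
        raise ValueError("baseline precision must be positive")
    return precision / baseline_precision


# Hypothetical toy data: a ranked candidate list and a manually curated
# set of true microbial metabolites (placeholder names, not real results).
ranked = ["metabolite_A", "metabolite_B", "metabolite_C",
          "metabolite_D", "metabolite_E"]
curated = {"metabolite_A", "metabolite_B", "metabolite_D"}

p_top = precision_at_k(ranked, curated, 3)            # 2 of the top 3 are true
p_all = precision_at_k(ranked, curated, len(ranked))  # 3 of all 5 are true
enrichment = fold_enrichment(p_top, p_all)
```

Applied to the rounded values quoted in the abstract, `fold_enrichment(0.55, 0.014)` gives roughly 39.3; the reported 38.3-fold figure presumably reflects unrounded precision values, which is an assumption here rather than something the abstract states.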
