Bacterial proteome foundation model enhances functional prediction from enzymes to ecological interactions

Abstract

Bacteria play fundamental roles in ecosystems, human health, and biotechnology. Although bacterial genome sequencing data have accumulated rapidly over the past decade, the metabolic and ecological functions carried out by most sequenced bacteria remain poorly understood, apart from a few well-studied taxa and traits. Establishing a general framework that comprehensively captures the relationship between bacterial genomes and the diverse biological functions they encode remains a major challenge, as it requires embedding individual genes within their broader genomic context and modeling their combined effects across complex biological pathways and networks. The difficulty is further compounded by the limited functional annotations available for most bacterial genomes. Here, we introduce BacPT, a proteome foundation model trained on tens of thousands of complete genomes spanning diverse bacterial taxa. BacPT captures both local and genome-wide information, enabling the generation of contextualized gene embeddings and functionally rich representations of the whole genome. We demonstrate the utility of BacPT across diverse prediction tasks spanning multiple biological scales. BacPT embeddings improve the prediction of enzyme activities, biosynthetic gene clusters, metabolic traits, and ecological interaction outcomes. Our results highlight that unsupervised deep learning applied at the scale of entire proteomes provides a powerful approach for characterizing gene interactions and mapping functional landscapes for bacteria.
