ProtNote: a multimodal method for protein-function annotation

Abstract

MOTIVATION: Understanding the protein sequence-function relationship is essential for advancing protein biology and engineering. However, <1% of known protein sequences have human-verified functions. While deep-learning methods have demonstrated promise for protein-function prediction, current models are limited to predicting only those functions on which they were trained.

RESULTS: Here, we introduce ProtNote, a multimodal deep-learning model that leverages free-form text to enable both supervised and zero-shot protein-function prediction. ProtNote not only maintains near state-of-the-art performance for annotations in its training set but also generalizes to unseen and novel functions in zero-shot test settings. ProtNote demonstrates superior performance in the prediction of novel Gene Ontology annotations and Enzyme Commission numbers compared to baseline models by capturing nuanced sequence-function relationships that unlock a range of biological use cases inaccessible to prior models. We envision that ProtNote will enhance protein-function discovery by enabling scientists to use free text inputs without restriction to predefined labels, a necessary capability for navigating the dynamic landscape of protein biology.

AVAILABILITY AND IMPLEMENTATION: The code is available on GitHub: https://github.com/microsoft/protnote; model weights, datasets, and evaluation metrics are provided via Zenodo: https://zenodo.org/records/13897920.
