PARAS: High-Accuracy Machine Learning of Substrate Specificities in Nonribosomal Peptide Synthetases

Abstract

Nonribosomal peptides are diverse natural products with important applications in medicine and agriculture. Bacterial and fungal genomes contain thousands of nonribosomal peptide biosynthetic gene clusters (BGCs) of unknown function, providing a promising resource for peptide discovery. Core structural features of such peptides can be inferred by predicting the substrate(s) of adenylation (A) domains in nonribosomal peptide synthetases (NRPSs). However, existing approaches to A domain prediction rely on limited data sets and often struggle with domains selecting large substrates and domains from underrepresented taxa. Here, we systematically curate and computationally analyze 3653 A domains and present two high-accuracy specificity predictors, PARAS and PARASECT. A type of A domain with unusually high L-tryptophan specificity was identified through the application of PARAS. Cloning and expression of the biosynthetic gene cluster encoding the NRPS showed that it directs the biosynthesis of tryptopeptin-related metabolites in Streptomyces species. Together, these technologies will accelerate the characterization of novel NRPSs and their metabolic products. PARAS and PARASECT are available at https://paras.bioinformatics.nl.
