Fast and robust estimate of bacterial genus novelty using the percentage of conserved proteins with unique matches (POCPu)

Abstract

Accurate taxonomic assignment of bacterial genomes is essential for identifying novel taxa and for stable classification that enables robust comparison between studies. Bacterial genus delineation relies on multiple lines of evidence, including phylogenetic trees and metrics such as the percentage of conserved proteins (POCP). POCP is widely used but requires benchmarking in terms of both computation and accuracy. We used 2,358,466 pairwise comparisons of proteomes derived from 4,767 genomes across 35 families to systematically assess the calculation of POCP and of the percentage of conserved proteins with unique matches (POCPu), which considers unique matches only. When calculated using the very-sensitive setting of DIAMOND, both metrics are 20x faster to compute than with the reference BLASTP. However, POCPu better differentiates within-genus from between-genera values, which improves bacterial genus assignment. This work facilitates comparative analysis of an increasingly large number of genomes, providing a reliable metric to support genus delineation. The findings suggest that certain families need specific POCPu thresholds that deviate from the reference value of 50%.
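The difference between the two metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly cited POCP form, 100 x (C1 + C2) / (T1 + T2), where C1 and C2 are the conserved-protein counts in each search direction and T1 and T2 are the proteome sizes. It also assumes the usual cutoffs for a conserved hit (e-value below 1e-5, identity above 40%, aligned length above 50% of the query), none of which are stated in the abstract. The `Hit` record and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment hit (hypothetical record, e.g. parsed from tabular output)."""
    query: str
    subject: str
    identity: float   # percent identity
    aln_len: int      # aligned length in residues
    query_len: int    # full length of the query protein
    evalue: float


def is_conserved(hit, max_evalue=1e-5, min_identity=40.0, min_cov=0.5):
    # Assumed cutoffs; the abstract does not state them.
    return (
        hit.evalue < max_evalue
        and hit.identity > min_identity
        and hit.aln_len > min_cov * hit.query_len
    )


def count_conserved(hits, unique):
    kept = [h for h in hits if is_conserved(h)]
    if unique:
        # POCPu-style: each query protein counts at most once.
        return len({h.query for h in kept})
    # Counting every passing match lets one query count several times.
    return len(kept)


def pocp(hits_ab, hits_ba, n_a, n_b, unique=False):
    """Percentage of conserved proteins between proteomes A and B."""
    c1 = count_conserved(hits_ab, unique)
    c2 = count_conserved(hits_ba, unique)
    return 100.0 * (c1 + c2) / (n_a + n_b)


# Toy data: protein a1 matches two proteins in B; a2 fails the identity cutoff.
hits_ab = [
    Hit("a1", "b1", 90.0, 100, 100, 1e-50),
    Hit("a1", "b2", 60.0, 80, 100, 1e-20),
    Hit("a2", "b3", 30.0, 100, 100, 1e-10),
]
hits_ba = [Hit("b1", "a1", 90.0, 100, 100, 1e-50)]

print(pocp(hits_ab, hits_ba, n_a=2, n_b=2))               # 75.0
print(pocp(hits_ab, hits_ba, n_a=2, n_b=2, unique=True))  # 50.0
```

In this toy case, the duplicate match from `a1` inflates the value when every match is counted, while the unique-match variant counts `a1` once. Whether the resulting value should be compared against 50% or a family-specific threshold is left to the reader, as the abstract indicates.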
