DAIRYdb: a manually curated reference database for improved taxonomy annotation of 16S rRNA gene sequences from dairy products

Abstract

BACKGROUND: Assigning reads to taxonomic units is a key step in microbiome analysis pipelines. To date, accurate taxonomy annotation of 16S reads, particularly at species rank, remains challenging because read sequences are short and classification databases are curated differently. The close phylogenetic relationship between species encountered in dairy products, however, makes accurate species annotation crucial for achieving sufficient phylogenetic resolution in downstream ecological studies and in food diagnostics. Curated databases dedicated to the environment of interest are expected to improve the accuracy and resolution of taxonomy annotation. RESULTS: We provide a manually curated database of 10,290 full-length 16S rRNA gene sequences from prokaryotes, tailored for the analysis of dairy products (https://github.com/marcomeola/DAIRYdb). The performance of the DAIRYdb was compared with that of the universal databases Silva, LTP, RDP and Greengenes. The DAIRYdb significantly outperformed all other databases regardless of the classification algorithm, enabling more accurate taxonomy annotation down to the species rank. The DAIRYdb automatically and accurately annotates over 90% of sequences from either single or paired hypervariable regions. The manually curated DAIRYdb strongly improves taxonomic annotation accuracy for microbiome studies in dairy environments. The DAIRYdb is a practical solution that enables automation of this key step, thus facilitating the routine application of NGS microbiome analyses for microbial ecology studies and diagnostics in dairy products.
