The use of taxon-specific reference databases compromises metagenomic classification

使用特定分类群参考数据库会损害宏基因组分类。

阅读:1

Abstract

A recent article in BMC Genomics describes a new bioinformatics tool, HumanMycobiomeScan, to classify fungal taxa in metagenomic samples. This tool was used to characterize the gut mycobiome of hunter-gatherers and Western populations, resulting in the identification of a range of fungal species in the vast majority of samples. In the HumanMycobiomeScan pipeline, sequence reads are mapped against a reference database containing fungal genome sequences only. We argue that using reference databases comprised of a single taxonomic group leads to an unacceptably high number of false-positives due to: (i) mapping to conserved genetic regions in reference genomes, and (ii) sequence contamination in the assembled reference genomes. To demonstrate this, we replaced the HumanMycobiomeScan's fungal reference database with one containing genome sequences of amphibians and reptiles and re-analysed their case study. The classification pipeline recovered all species present in the reference database, revealing turtles (Geoemydidae), bull frogs (Pyxicephalidae) and snakes (Colubridae) as the most abundant herpetological taxa in the human gut. We also re-analysed their case study using a kingdom-agnostic pipeline. This revealed that while the gut of hunter-gatherers and Western subjects may be colonized by a range of microbial eukaryotes, only three fungal families were retrieved. These results highlight the pitfalls of using taxon-specific reference databases for metagenome classification, even when they are comprised of curated whole genome data. We propose that databases containing all domains of life provide the most suitable option for metagenomic species profiling, especially when targeting microbial eukaryotes.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。