Assessment of urban microbiome assemblies with the help of targeted in silico gold standards

借助靶向计算机模拟黄金标准评估城市微生物组组装情况

阅读:1

Abstract

BACKGROUND: Microbial communities play a crucial role in our environment and may influence human health tremendously. Despite being the place where human interaction is most abundant we still know little about the urban microbiome. This is highlighted by the large amount of unclassified DNA reads found in urban metagenome samples. The only in silico approach that allows us to find unknown species, is the assembly and classification of draft genomes from a metagenomic dataset. In this study we (1) investigate the applicability of an assembly and binning approach for urban metagenome datasets, and (2) develop a new method for the generation of in silico gold standards to better understand the specific challenges of such datasets and provide a guide in the selection of available software. RESULTS: We applied combinations of three assembly (Megahit, SPAdes and MetaSPAdes) and three binning tools (MaxBin, MetaBAT and CONCOCT) to whole genome shotgun datasets from the CAMDA 2017 Challenge. Complex in silico gold standards with a simulated bacterial fraction were generated for representative samples of each surface type and city. Using these gold standards, we found the combination of SPAdes and MetaBAT to be optimal for urban metagenome datasets by providing the best trade-off between the number of high-quality genome draft bins (MIMAG standards) retrieved, the least amount of misassemblies and contamination. The assembled draft genomes included known species like Propionibacterium acnes but also novel species according to respective ANI values. CONCLUSIONS: In our work, we showed that, even for datasets with high diversity and low sequencing depth from urban environments, assembly and binning-based methods can provide high-quality genome drafts. Of vital importance to retrieve high-quality genome drafts is sequence depth but even more so a high proportion of the bacterial sequence fraction too achieve high coverage for bacterial genomes. In contrast to read-based methods relying on database knowledge, genome-centric methods as applied in this study can provide valuable information about unknown species and strains as well as functional contributions of single community members within a sample. Furthermore, we present a method for the generation of sample-specific highly complex in silico gold standards. REVIEWERS: This article was reviewed by Craig Herbold, Serghei Mangul and Yana Bromberg.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。