phyloFlash: Rapid Small-Subunit rRNA Profiling and Targeted Assembly from Metagenomes

Abstract

The small-subunit rRNA (SSU rRNA) gene is the key marker in molecular ecology for all domains of life, but it is largely absent from metagenome-assembled genomes that often are the only resource available for environmental microbes. Here, we present phyloFlash, a pipeline to overcome this gap with rapid, SSU rRNA-centered taxonomic classification, targeted assembly, and graph-based binning of full metagenomic assemblies. We show that a cleanup of artifacts is pivotal even with a curated reference database. With such a filtered database, the general-purpose mapper BBmap extracts SSU rRNA reads five times faster than the rRNA-specialized tool SortMeRNA with similar sensitivity and higher selectivity on simulated metagenomes. Reference-based targeted assemblers yielded either highly fragmented assemblies or high levels of chimerism, so we employ the general-purpose genomic assembler SPAdes. Our optimized implementation is independent of reference database composition and has satisfactory levels of chimera formation. phyloFlash quickly processes Illumina (meta)genomic data, is straightforward to use, even as part of high-throughput quality control, and has user-friendly output reports. The software is available at https://github.com/HRGV/phyloFlash (GPL3 license) and is documented with an online manual.

Importance

To track organisms across all domains of life, the SSU rRNA gene is the gold standard. Many environmental microbes are known only from high-throughput sequence data, but the SSU rRNA gene, the key to visualization by molecular probes and link to existing literature, is often missing from metagenome-assembled genomes (MAGs). The easy-to-use phyloFlash software suite tackles this gap with rapid, SSU rRNA-centered taxonomic classification, targeted assembly, and graph-based linking to MAGs. Starting from a cleaned reference database, phyloFlash profiles the taxonomic diversity and assembles the sorted SSU rRNA reads. The phyloFlash design is domain agnostic and covers eukaryotes, archaea, and bacteria alike. phyloFlash also provides utilities to visualize multisample comparisons and to integrate the recovered SSU rRNAs in a metagenomics workflow by linking them to MAGs using assembly graph parsing.
