MVP: a modular viromics pipeline to identify, filter, cluster, annotate, and bin viruses from metagenomes



Abstract

While numerous computational frameworks and workflows are available for recovering prokaryote and eukaryote genomes from metagenome data, only a limited number of pipelines are designed specifically for viromics analysis. With many viromics tools developed in the last few years alone, it can be challenging for scientists with limited bioinformatics experience to easily recover, evaluate quality, annotate genes, dereplicate, assign taxonomy, and calculate relative abundance and coverage of viral genomes using state-of-the-art methods and standards. Here, we describe Modular Viromics Pipeline (MVP) v.1.0, a user-friendly pipeline written in Python and providing a simple framework to perform standard viromics analyses. MVP combines multiple tools to enable viral genome identification, characterization of genome quality, filtering, clustering, taxonomic and functional annotation, genome binning, and comprehensive summaries of results that can be used for downstream ecological analyses. Overall, MVP provides a standardized and reproducible pipeline for both extensive and robust characterization of viruses from large-scale sequencing data including metagenomes, metatranscriptomes, viromes, and isolate genomes. As a typical use case, we show how the entire MVP pipeline can be applied to a set of 20 metagenomes from wetland sediments using only 10 modules executed via command lines, leading to the identification of 11,656 viral contigs and 8,145 viral operational taxonomic units (vOTUs) displaying a clear beta-diversity pattern. Further, acting as a dynamic wrapper, MVP is designed to continuously incorporate updates and integrate new tools, ensuring its ongoing relevance in the rapidly evolving field of viromics. 
MVP is available at https://gitlab.com/ccoclet/mvp and as versioned packages in PyPI and Conda.

Importance

The significance of our work lies in the development of Modular Viromics Pipeline (MVP), an integrated and user-friendly pipeline tailored exclusively for viromics analyses. MVP stands out due to its modular design, which ensures easy installation, execution, and integration of new tools and databases. By combining state-of-the-art tools such as geNomad and CheckV, MVP provides high-quality viral genome recovery, taxonomic and host assignment, and functional annotation, addressing the limitations of existing pipelines. MVP's ability to handle diverse sample types, including environmental, human microbiome, and plant-associated samples, makes it a versatile tool for the broader microbiome research community. By standardizing the analysis process and providing easily interpretable results, MVP enables researchers to perform comprehensive studies of viral communities, significantly advancing our understanding of viral ecology and its impact on various ecosystems.
