MinION Sequencing of Yeast Mock Communities To Assess the Effect of Databases and ITS-LSU Markers on the Reliability of Metabarcoding Analysis

Authors: Angela Conti #, Debora Casagrande Pierantoni #, Vincent Robert, Laura Corte, Gianluigi Cardinali

Abstract

Microbial communities play key roles for both humans and the environment. They are involved in ecosystem functions, help maintain ecosystem stability, and provide important services, such as carbon and nitrogen cycling. Because their members act both as symbionts and as pathogens, describing the structure and composition of these communities is important. Metabarcoding uses ribosomal DNA (rDNA) (eukaryotic) or rRNA gene (prokaryotic) sequences to identify the species present at a site and to measure their abundance. This procedure requires several technical steps that can be a source of bias, producing a distorted view of the real community composition. In this work, we took advantage of an innovative "long-read" next-generation sequencing (NGS) technology (MinION), amplifying the DNA region spanning from the internal transcribed spacer (ITS) to the large subunit (LSU) so that both markers can be read simultaneously on this platform, providing more information than "short-read" systems. The experimental system consisted of six fungal mock communities composed of species present at various relative amounts to mimic natural situations characterized by predominant and low-frequency species. The influence of the sequencing platform (MinION and Illumina MiSeq) and the effect of different reference databases and marker sequences on metagenomic identification of species were evaluated. The results showed that the ITS-based database provided more accurate species identification than the LSU-based one. Furthermore, we propose a two-step procedure: a preliminary identification with standard reference databases, followed by the production of custom databases that include only the best outputs of the first step. This additional step improved the estimate of species proportions in the mock communities and reduced the number of ghost species, i.e., species reported but not actually present in the simulated communities.
IMPORTANCE Metagenomic analyses are fundamental in many research areas; therefore, improving the methods and protocols used to describe microbial communities is increasingly necessary. Long-read sequencing could be used to reduce biases due to the multicopy nature of rDNA sequences and to short-read limitations. However, these novel technologies need to be assessed and standardized with controlled experiments, such as mock communities. The interest behind this work was to evaluate how well long reads performed in identifying and quantifying species mixed in precise proportions and how the choice of database affects such analyses. Development of a pipeline that mitigates the effect of the barcoding sequences and the impact of the reference database on metagenomic analyses can help microbiome studies go one step further.
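
The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `min_fraction` cutoff, and the toy species labels are all hypothetical, and the re-identification of reads against the custom database (the second step) is not shown because it depends on the alignment tool used. The sketch only shows how first-pass assignments could be turned into relative abundances, how a reduced species list for a custom database might be selected, and how ghost species and abundance error could be scored against a known mock community.

```python
from collections import Counter


def relative_abundance(assignments):
    """Turn a list of per-read species labels into species -> fraction."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}


def select_custom_database(abundance, min_fraction=0.02):
    """Keep only first-pass species at or above a cutoff.

    The cutoff is a hypothetical stand-in for "the best outputs of the
    first step"; the source does not state how the best outputs are chosen.
    """
    return sorted(s for s, f in abundance.items() if f >= min_fraction)


def ghost_species(abundance, expected):
    """Species reported by the analysis but absent from the mock community."""
    return sorted(set(abundance) - set(expected))


def total_abundance_error(abundance, expected):
    """Sum of absolute differences between observed and expected fractions."""
    species = set(abundance) | set(expected)
    return sum(abs(abundance.get(s, 0.0) - expected.get(s, 0.0)) for s in species)


if __name__ == "__main__":
    # Toy data only: labels and proportions are invented for illustration.
    first_pass = (["species_A"] * 60 + ["species_B"] * 35
                  + ["species_C"] * 4 + ["species_X"] * 1)
    expected = {"species_A": 0.6, "species_B": 0.3, "species_C": 0.1}

    observed = relative_abundance(first_pass)
    print("custom database:", select_custom_database(observed))
    print("ghost species:", ghost_species(observed, expected))
    print("abundance error:", round(total_abundance_error(observed, expected), 3))
```

In a real analysis, the reads would then be re-identified against the custom database, and the same two scoring functions could be applied to both passes to check whether the second step reduces ghost species and abundance error.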
