The evaluation of different combinations of enzyme set, aligner and caller in GBS sequencing of soybean

Abstract

BACKGROUND: Genotyping-by-sequencing (GBS) is a cost-effective method for large-scale genotyping, widely used across various species, particularly those with large genomes. Two critical aspects of GBS are the selection of restriction enzymes for genome digestion and the optimization of data analysis pipelines. However, few studies have comprehensively examined the combined effects of enzyme choice and pipeline configuration. RESULTS: In this study, we created GBS libraries using three commonly used restriction enzyme combinations (HindIII-NlaIII, PstI-MspI, and ApeKI) and assessed multiple SNP-calling pipelines in 15 soybean varieties. We tested four aligners (BWA-MEM, Bowtie2, BBMap, and Strobealign) and seven SNP callers (Bcftools, Stacks, DeepVariant, FreeBayes, VarScan, BBCallVariants, and GATK). Our findings reveal that enzyme choice significantly influences the number of identified SNPs, their gene localization preferences, and accuracy. Furthermore, the performance of SNP callers varied markedly in terms of SNP count, precision, recall, and false discovery rate (FDR). DeepVariant exhibited the highest accuracy, with 76.0% of its SNPs intersecting with whole-genome sequencing (WGS)-derived SNPs and an FDR of 0.0095, compared to FreeBayes, which had 47.8% intersection and an FDR of 0.6321. CONCLUSIONS: Our findings underscore the importance of optimizing both enzyme selection for sequencing libraries and data analysis pipelines to ensure robust and reproducible results. This study provides a general framework for designing large-scale genotyping experiments tailored to specific quality and quantity requirements in various plant species.
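
The comparison against WGS-derived SNPs described above can be illustrated with a minimal Python sketch. It assumes each SNP is represented as a `(chrom, pos, ref, alt)` tuple and that the WGS set serves as the truth set. The names `Variant`, `CallMetrics`, and `evaluate_calls` are hypothetical, and the formulas are the textbook definitions of precision, recall, and FDR. The abstract does not state how the study computed its metrics or which regions were compared, so this sketch should not be expected to reproduce the reported figures.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical representation of a SNP: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class CallMetrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float
    fdr: float


def evaluate_calls(called: Set[Variant], truth: Set[Variant]) -> CallMetrics:
    """Compare a caller's SNP set against a truth set (here assumed to be WGS-derived).

    Uses the standard definitions:
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      FDR       = FP / (TP + FP) = 1 - precision
    """
    tp = len(called & truth)   # SNPs found by both the caller and WGS
    fp = len(called - truth)   # SNPs reported by the caller only
    fn = len(truth - called)   # WGS SNPs the caller missed
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    fdr = fp / (tp + fp) if called else 0.0
    return CallMetrics(tp, fp, fn, precision, recall, fdr)


# Toy data, invented purely for illustration (not taken from the study).
wgs_truth = {
    ("Chr01", 100, "A", "G"),
    ("Chr01", 250, "C", "T"),
    ("Chr02", 75, "G", "A"),
    ("Chr03", 900, "T", "C"),
}
gbs_calls = {
    ("Chr01", 100, "A", "G"),
    ("Chr01", 250, "C", "T"),
    ("Chr02", 75, "G", "A"),
    ("Chr02", 310, "A", "C"),  # not present in the truth set
}

metrics = evaluate_calls(gbs_calls, wgs_truth)
print(metrics)
```

In practice the truth set would usually be restricted to the genomic regions actually covered by the GBS reads before computing recall; otherwise every WGS SNP outside the digested fragments counts as a miss. That restriction is an assumption of this sketch, not a detail given in the abstract.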
