Impact of wet-lab protocols on quality of whole-genome short-read sequences from foodborne microbial pathogens

Authors: Leonie F Forth, Erik Brinks, Grégoire Denay, Ahmad Fawzy, Stefan Fiedler, Jannika Fuchs, Anne-Catrin Geuthner, Thomas Hankeln, Ekkehard Hiller, Larissa Murr, Henning Petersen, Ralf Reiting, Christian Schäfers, Claudia Schwab, Kathrin Szabo, Andrea Thürmer, Anne Wöhlke, Jennie Fischer, Stefanie Lüth

Abstract

Successful elucidation of a food-borne infection chain requires high-quality sequencing data from suspected microbial contaminants. Such investigations are commonly a joint effort undertaken by different laboratories and institutes. To analyze how much variability differing wet-lab procedures introduce into the quality of the sequence data, we conducted an interlaboratory study involving four bacterial pathogens that account for the majority of food-related bacterial infections: Campylobacter spp., Shiga toxin-producing Escherichia coli, Listeria monocytogenes, and Salmonella enterica. The participants, ranging from German federal research institutes and federal state laboratories to universities and companies, were asked to follow their routine in-house protocols for short-read sequencing of 10 cultures and one isolated bacterial DNA per species. Sequence and assembly quality were then analyzed centrally. Variations within isolate samples were detected with SNP and cgMLST calling. Overall, the quality of Illumina raw sequence data was high with little variability, with one exception that was attributed to a specific library preparation kit. The variability of Ion Torrent data was higher, independent of the investigated species. For cgMLST and SNP analysis results, we found that technological sequencing artefacts could be reduced by the use of filters, and that SNP analysis was better suited than cgMLST to comparing data from different contributors. Among the four species, a minority of the Campylobacter isolate data showed the highest divergence with regard to sequence type and cgMLST analysis. We additionally compared the assemblers SPAdes and SKESA for their performance on the Illumina data sets of the different species and library preparation methods and found overall similar assembly quality metrics and cgMLST statistics.
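
As a rough illustration of how cgMLST results for the same isolate from two contributors might be compared, the sketch below counts allele differences between two profiles and skips loci that were not called in either one. It is not the study's pipeline: the function name, the locus names, and the example allele numbers are all hypothetical, and ignoring uncalled loci is an assumption about one simple way of handling missing data.

```python
from typing import Dict, Optional

# A cgMLST profile maps a locus name to an allele number.
# None marks a locus that could not be called (assumed convention).
Profile = Dict[str, Optional[int]]


def cgmlst_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    shared = [
        locus
        for locus in a.keys() & b.keys()
        if a[locus] is not None and b[locus] is not None
    ]
    return sum(1 for locus in shared if a[locus] != b[locus])


# Hypothetical profiles of one isolate sequenced by two laboratories.
lab_a: Profile = {"locus1": 1, "locus2": 4, "locus3": 7, "locus4": None}
lab_b: Profile = {"locus1": 1, "locus2": 5, "locus3": 7, "locus4": 2}

# locus2 differs; locus4 is skipped because lab_a has no call for it.
print(cgmlst_distance(lab_a, lab_b))
```

A distance of zero between contributors would indicate identical calls at every shared locus; any nonzero value for the same isolate points to variation introduced somewhere between the wet-lab procedure and allele calling.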
