The effects of sequencing strategies on Metagenomic pathogen detection using bronchoalveolar lavage fluid samples

测序策略对支气管肺泡灌洗液样本宏基因组病原体检测的影响

阅读:18
作者:Ziyang Li, Zhe Guo, Weimin Wu, Li Tan, Qichen Long, Han Xia, Min Hu

Conclusions

Increasing dataset sizes and read lengths can enhance the performance of mNGS in pathogen detection but increase the economic and time costs of sequencing and data analysis. Currently, the 20 M reads in SE75 mode may be the best sequencing option.

Methods

In this study, 43 clinical BALF samples, which tested positive via clinical mNGS and were consistent with the diagnosis, were subjected to re-sequencing on the Illumina NovaSeq 6000 platform. The raw re-sequencing data, consisting of 100 million (M) paired-end 150 bp (PE150) reads, were divided into simulated datasets with eight different data sizes (5 M, 10 M, 15 M, 20 M, 30 M, 50 M, 75 M, 100 M) and five different read lengths (single-end 50 bp (SE50), SE75, SE100, PE100, and PE150). Both Kraken2 and IDseq bioinformatics pipelines were employed to analyze the previously diagnosed pathogens in the simulated data. Detection of pathogens was based on read counts ranging from 1 to 10 and RPM values ranging from 0.2 to 2.

Results

Our results revealed that increasing dataset sizes and read lengths can enhance the performance of mNGS in pathogen detection. However, a larger data sizes for mNGS require higher economic costs and longer turnaround time for data analysis. Our findings indicate 20 M reads being sufficient for SE75 mode to achieve high recall rates. Additionally, high nucleic acid loads in samples can lead to increased stability in pathogen detection efficiency, reducing the impact of sequencing strategies. The choice of bioinformatics pipelines had a significant impact on recall rates achieved in pathogen detection. Conclusions: Increasing dataset sizes and read lengths can enhance the performance of mNGS in pathogen detection but increase the economic and time costs of sequencing and data analysis. Currently, the 20 M reads in SE75 mode may be the best sequencing option.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。