Background Filtering of Clinical Metagenomic Sequencing with a Library Concentration-Normalized Model


Authors: Juan Du #, Jingjia Zhang #, Dong Zhang #, Yiwen Zhou #, Pengfei Wu, Wenchao Ding, Jun Wang, Chuan Ouyang, Qiwen Yang

Abstract

Metagenomic next-generation sequencing (mNGS) can accurately detect pathogens in clinical samples. However, wet-lab contamination constrains mNGS analysis and may lead to erroneous interpretation of results. Many existing decontamination methods rely on large-scale observational microbiome studies and may not be applicable to clinical mNGS tests. Using a pretrained profile of common laboratory contaminants, we developed an mNGS noise-filtering model, named the background elimination and correction by library concentration-normalized (BECLEAN) model, which is based on the inverse linear relationship between microbial sequencing reads and sample library concentration. Its efficacy was evaluated with bacteria- and yeast-spiked samples and 28 cerebrospinal fluid (CSF) specimens. With conventional methods and clinical diagnosis as the reference, the diagnostic accuracy, precision, sensitivity, and specificity of BECLEAN were 92.9%, 86.7%, 100%, and 86.7%, respectively. BECLEAN markedly reduced background noise without affecting the true-positive rate and can thus serve as a time-saving and convenient tool in various clinical settings.

IMPORTANCE Most existing methods for removing wet-lab contamination rely on large-scale observational microbiome studies and may not be applicable to clinical mNGS testing of individual cases. In clinical settings, only a handful of samples might be sequenced in a run. The lab-specific microbiome can complicate existing statistical approaches for removing contamination from small-scale clinical metagenomic sequencing data sets; thus, a preliminary lab-specific training set is necessary. Our study provides a rapid and accurate background-filtering tool for clinical metagenomic sequencing, built on a pretrained profile of common laboratory contaminants. Notably, our work demonstrates that the inverse linear relationship between microbial sequencing reads and library concentration can serve two purposes: identifying true contaminants, and evaluating the relative abundance of a taxon in a sample by comparing the observed microbial reads with the model-predicted value. Our findings extend previously published research and provide confirmatory results in clinical settings.
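The comparison of observed reads with a model-predicted value can be illustrated with a minimal Python sketch. It is not the published BECLEAN implementation. It assumes a per-taxon straight-line fit of contaminant reads against library concentration on a lab-specific training set, followed by a fold-change test on a new sample. The names `fit_line` and `exceeds_background`, the `fold_cutoff` value, the floor of one read, the choice of raw rather than transformed values, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LinearFit:
    """Straight-line background model for one contaminant taxon."""
    slope: float
    intercept: float

    def predict(self, library_conc: float) -> float:
        # Expected background reads at a given library concentration.
        return self.intercept + self.slope * library_conc


def fit_line(conc: Sequence[float], reads: Sequence[float]) -> LinearFit:
    """Ordinary least-squares fit of reads = intercept + slope * conc."""
    n = len(conc)
    if n != len(reads) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(conc) / n
    mean_y = sum(reads) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    if sxx == 0:
        raise ValueError("library concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, reads))
    slope = sxy / sxx
    return LinearFit(slope=slope, intercept=mean_y - slope * mean_x)


def exceeds_background(fit: LinearFit, library_conc: float,
                       observed_reads: float, fold_cutoff: float = 3.0) -> bool:
    """Return True if observed reads exceed the predicted background.

    The cutoff and the floor of one read are illustrative assumptions,
    not values taken from the study.
    """
    expected = max(fit.predict(library_conc), 1.0)
    return observed_reads > fold_cutoff * expected


# Hypothetical training set: reads of one contaminant taxon fall as
# library concentration (arbitrary units) rises.
background = fit_line([1, 2, 3, 4, 5], [100, 80, 60, 40, 20])

# New sample at concentration 2.5: the model expects about 70 reads.
print(exceeds_background(background, 2.5, 75))   # close to background
print(exceeds_background(background, 2.5, 500))  # well above background
```

Under these assumptions, a taxon whose read count stays near the fitted line is treated as background, while a count far above the line is kept as a candidate true signal. A real pipeline would need one fit per taxon and a cutoff calibrated on the laboratory's own training runs.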
