Alignment-free unique molecular identifier clustering suppresses sequencing errors for accurate detection of low-frequency DNA variants

无需比对的独特分子标识符聚类可抑制测序错误,从而准确检测低频DNA变异。

阅读:1

Abstract

Accurate detection of low-frequency DNA variants (below 1%) is essential in diverse biological and clinical contexts, yet remains fundamentally constrained by the high intrinsic error rates of next-generation sequencing technologies. Although unique molecular identifiers (UMIs) have significantly mitigated these errors by uniquely indexing original template molecules, their efficacy is compromised by UMI collisions and by artifacts introduced during polymerase chain reaction (PCR) amplification and sequencing, which collectively engender false-positive variant calls. Here, we present AFUMIC, an alignment-free UMI clustering framework that systematically addresses these limitations through collision-resilient UMI grouping and a consensus quality score (CQS)-guided strategy for high-fidelity consensus sequence generation. AFUMIC reduces singleton families, enhances clustering precision, and maximizes data retention, yielding 7.27-fold and 3.84-fold increases in single-strand consensus sequence and duplex consensus sequence output, respectively, compared to Du Novo. It further decreases the per-base error rate from $3.01 \times 10^{-4}$ to $2.10 \times 10^{-5}$ and raises the proportion of error-free positions from 45.27% to 99.85%, enabling confident detection of variants at variant allele frequencies as low as $1.0 \times 10^{-5}$. Notably, AFUMIC exhibits superior computational efficiency, rendering it well-suited for high-throughput analysis of UMI-tagged libraries in large-scale genomic studies. Collectively, AFUMIC represents an efficient methodology for ultrasensitive variant detection and establishes a broadly applicable and computationally efficient framework for error-corrected sequencing that can be readily deployed in both clinical diagnostics and large-scale genomic research.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。