Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing

基于模型的样本索引跳跃分析揭示了其在多重单细胞RNA测序中普遍存在的伪影

阅读:1

Abstract

Index hopping is the main cause of incorrect sample assignment of sequencing reads in multiplexed pooled libraries. We introduce a statistical model for estimating the sample index-hopping rate in multiplexed droplet-based single-cell RNA-seq data and for probabilistic inference of the true sample of origin of hopped reads. We analyze several datasets and estimate the sample index hopping probability to range between 0.003-0.009, a small number that counter-intuitively gives rise to a large fraction of phantom molecules - the fraction of phantom molecules exceeds 8% in more than 25% of samples and reaches as high as 85% in low-complexity samples. Phantom molecules lead to widespread complications in downstream analyses, including transcriptome mixing across cells, emergence of phantom copies of cells from other samples, and misclassification of empty droplets as cells. We demonstrate that our approach can correct for these artifacts by accurately purging the majority of phantom molecules from the data.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。