Fast and accurate matching of cellular barcodes across short-reads and long-reads of single-cell RNA-seq experiments

Abstract

Single-cell RNA sequencing allows for characterizing the gene expression landscape at the cell type level. However, because it relies on short-reads, it is severely limited in detecting full-length features of transcripts such as alternative splicing. New library preparation techniques attempt to extend single-cell sequencing by utilizing both long-reads and short-reads. These techniques split the library material, after it is tagged with cellular barcodes, into two pools: one for short-read sequencing and one for long-read sequencing. The challenge in using these techniques is that they require matching the cellular barcodes sequenced by the error-prone long-reads to the cellular barcodes detected by the short-reads. To overcome this challenge, we introduce scTagger, a computational method for matching cellular barcode data from long-reads and short-reads. We tested scTagger against another state-of-the-art tool on both real and simulated datasets, and we demonstrate that scTagger has significantly better accuracy and time efficiency.

Keywords: Bioinformatics; Genomics; Sequence analysis.
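
To make the matching problem concrete, the following is a minimal sketch of one naive way to assign an error-prone long-read barcode to a short-read barcode list: pick the unique closest candidate by edit distance. This is not scTagger's algorithm, which the abstract does not describe; the function names, the `max_dist` threshold, and the example barcodes are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character of a
                curr[j - 1] + 1,           # insert a character of b
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def match_barcode(observed: str,
                  whitelist: Iterable[str],
                  max_dist: int = 2) -> Optional[str]:
    """Return the unique closest whitelist barcode within max_dist, else None.

    `whitelist` stands in for the barcodes detected from the short reads;
    `max_dist` is a hypothetical error tolerance, not a value from the paper.
    """
    best, best_dist, tie = None, max_dist + 1, False
    for barcode in whitelist:
        d = edit_distance(observed, barcode)
        if d < best_dist:
            best, best_dist, tie = barcode, d, False
        elif d == best_dist and best is not None:
            tie = True  # two candidates are equally close: ambiguous
    return None if tie else best


# Hypothetical short-read barcodes and a long-read barcode with one deletion.
short_read_barcodes = ["ACGTACGTACGTACGT", "TTTTGGGGCCCCAAAA"]
assigned = match_barcode("ACGTACGTACGACGT", short_read_barcodes)
```

This brute-force comparison costs time proportional to the number of long reads times the number of candidate barcodes, which is why dedicated tools are needed at the scale of real experiments.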

Journal:	iScience	Impact factor:	4.600
Year:	2022	Volume and pages:	2022 Jun 7;25(7):104530.
doi:	10.1016/j.isci.2022.104530	Research area:	Cell biology
Cell type:	Other cells
