CountASAP: a lightweight, easy-to-use Python package for processing ASAPseq data

Abstract

BACKGROUND: Declining sequencing costs, coupled with the increasing availability of easy-to-use kits for the isolation of DNA and RNA transcripts from single cells, have driven a rapid proliferation of studies centered on genomic and transcriptomic data. Simultaneously, a wealth of new techniques has been developed that use single-cell technologies to interrogate a broad range of cell-biological processes. One recently developed technique, the assay for transposase-accessible chromatin with sequencing (ATAC) with select antigen profiling by sequencing (ASAPseq), combines chromatin accessibility assessments with measurements of cell-surface marker expression levels. While software exists for characterizing these datasets, no existing tool is explicitly designed to reformat ASAP surface marker FASTQ data into a count matrix that can then be used for these downstream analyses.
RESULTS: To address this lack of a dedicated tool for ASAPseq data processing, we created CountASAP, an easy-to-use Python package purposefully designed to transform FASTQ files from ASAP experiments into count matrices compatible with commonly used downstream bioinformatic analysis packages. CountASAP takes advantage of the independence of the relevant data structures to perform fully parallelized matches of each sequenced read to user-supplied input ASAP oligos and unique cell-identifier sequences. We directly compare the performance and user-friendliness of CountASAP to existing tools using similarly structured data from a more common sequencing experiment: cellular indexing of transcriptomes and epitopes by sequencing (CITEseq). Further benchmarking against existing tools helps to identify proper defaults for CountASAP and to assess the agreement of outputs from all tested software. A final test using a novel ASAPseq dataset provides evidence that CountASAP can generate biologically meaningful results that correlate well with paired chromatin accessibility data.
CONCLUSIONS: CountASAP shows good agreement with existing, well-tested data processing tools in the analysis of similarly structured benchmarking data. CountASAP runs efficiently on a standard laptop, has user-friendly documentation and a one-step installation, and is the first and only tool designed specifically for the processing of ASAPseq data.
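
The read-matching idea described in the RESULTS can be illustrated with a minimal Python sketch. It is not CountASAP's actual API or implementation: the function names, the toy sequences, the mismatch tolerance, and the use of a thread pool are all assumptions made for illustration. The sketch only shows why the task parallelizes well: each read is matched to a cell barcode and an antibody oligo independently of every other read, so chunks of reads can be counted separately and the partial counts summed.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def hamming(a, b):
    """Count mismatched positions; sequences of unequal length never match."""
    if len(a) != len(b):
        return len(a) + len(b)
    return sum(x != y for x, y in zip(a, b))


def best_match(seq, references, max_mismatches):
    """Return the name of the unique closest reference within tolerance, else None."""
    hits = sorted((hamming(seq, ref), name) for name, ref in references.items())
    if not hits or hits[0][0] > max_mismatches:
        return None
    if len(hits) > 1 and hits[1][0] == hits[0][0]:
        return None  # two references are equally close: ambiguous, so discard
    return hits[0][1]


def count_chunk(chunk, barcode_refs, oligos, max_mismatches):
    """Count (cell, tag) pairs for one chunk of reads, independently of other chunks."""
    counts = Counter()
    for cell_seq, tag_seq in chunk:
        cell = best_match(cell_seq, barcode_refs, max_mismatches)
        tag = best_match(tag_seq, oligos, max_mismatches)
        if cell is not None and tag is not None:
            counts[(cell, tag)] += 1
    return counts


def count_matrix(reads, barcodes, oligos, max_mismatches=1, n_workers=2, chunk_size=2):
    """Build a cell-by-tag count table from (cell_sequence, tag_sequence) read pairs."""
    barcode_refs = {bc: bc for bc in barcodes}
    chunks = [reads[i:i + chunk_size] for i in range(0, len(reads), chunk_size)]
    total = Counter()
    # Threads keep the sketch simple; a process pool would be the usual
    # choice for CPU-bound matching on real data (assumption, not from the paper).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(
            lambda c: count_chunk(c, barcode_refs, oligos, max_mismatches), chunks
        )
        for partial in partials:
            total.update(partial)
    return {cell: {tag: total[(cell, tag)] for tag in oligos} for cell in barcodes}


# Hypothetical toy inputs: two cell barcodes, two antibody oligos, five reads.
barcodes = ["AAAACCCC", "GGGGTTTT"]
oligos = {"CD4": "ACGTACGT", "CD8": "TTGGCCAA"}
reads = [
    ("AAAACCCC", "ACGTACGT"),  # exact match: cell 1, CD4
    ("AAAACCCG", "ACGTACGT"),  # one barcode mismatch: still cell 1, CD4
    ("GGGGTTTT", "TTGGCCAT"),  # one tag mismatch: cell 2, CD8
    ("GGGGTTTT", "CCCCCCCC"),  # tag matches no oligo: discarded
    ("AAAACCCC", "TTGGCCAA"),  # exact match: cell 1, CD8
]
matrix = count_matrix(reads, barcodes, oligos)
```

In this toy run, `matrix` maps each cell barcode to its per-oligo counts. A real pipeline would additionally parse the FASTQ files and write the table in a format that downstream analysis packages can load, both of which are omitted here.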