Abstract
BACKGROUND: Data from the single-cell Assay for Transposase-Accessible Chromatin with sequencing (scATAC-seq) are highly sparse. While current computational methods apply a range of transformation procedures to extract meaningful information, major challenges remain. RESULTS: Here, we discuss the major challenges in scATAC-seq data analysis, such as sequencing depth normalization and region-specific biases. We present a hierarchical count model motivated by the data-generating process of scATAC-seq. Our simulations show that current scATAC-seq data, while clearly providing physical single-cell resolution, are too sparse to infer chromatin accessibility states at a true informational-level single-cell, single-region resolution. CONCLUSIONS: While the broad utility of scATAC-seq at the cell-type level is undeniable, describing it as fully resolving chromatin accessibility at single-cell resolution, particularly at the level of individual loci, may overstate the level of detail currently achievable. We conclude that chromatin accessibility profiling at true single-cell, single-region resolution is challenging at current data sensitivity, but that it may become achievable through promising developments in optimizing the efficiency of scATAC-seq assays.