Abstract
MOTIVATION: scATAC-seq enables high-resolution mapping of cis-regulatory elements. It has been widely applied to uncover cell-type-specific regulatory networks and to complement scRNA-seq analysis. However, many scATAC-seq datasets remain underutilized because super-enhancers, typical enhancers, and gene markers have seen limited exploration. A comprehensive resource enabling cell-type-specific annotation of cis-regulatory elements and their dynamic enhancer-gene linkages remains an urgent unmet need for scATAC-seq.

RESULTS: We present dbscATAC, a specialized single-cell database for annotating super-enhancers, gene markers, and enhancer-gene interactions derived from scATAC-seq data. Using improved machine learning algorithms, we identified 213 835 super-enhancers across 520 tissue/cell types from three species, as well as 347 484 gene markers, 13 470 526 enhancers, and 10 402 346 enhancer-gene interactions derived from 1 668 076 single cells spanning 1028 tissue/cell types in 13 species. We developed an easy-to-use online platform with multiple analytic modules and hierarchical query options for searching, browsing, and visualizing single-cell super-enhancers, enhancers, and gene markers. dbscATAC provides a comprehensive resource to facilitate the exploration of enhancer landscapes, gene regulation, and cell-type-specific characteristics in single-cell epigenomics.

AVAILABILITY AND IMPLEMENTATION: The database, with all super-enhancer/enhancer annotation data, is available at http://singlecelldb.com/dbscATAC/index.php. The source code of dbscATAC for predicting super-enhancers (SEs), enhancers, and gene markers is available at https://github.com/EvansGao/dbscATAC. The source code, tissue/cell type descriptions, and data summary can be downloaded at DOI: 10.6084/m9.figshare.28706414.

Keywords: scATAC-seq, Database, Super-enhancers/enhancers, Gene markers.