Abstract
MOTIVATION: Modern genomic research is driven by next-generation sequencing experiments such as ChIP-seq, CUT&Tag, and CUT&RUN, which generate coverage files for transcription factor binding, and ATAC-seq, which yields coverage files for chromatin accessibility. Because these experimental protocols carry inherent technical noise, researchers need statistically rigorous and computationally efficient methods to separate true biological signal from noise. However, existing approaches are often computationally demanding or require input or spike-in controls. RESULTS: We developed Chrom-Sig, a Python package that quickly de-noises 1D genomic coverage tracks by computing the empirical null distribution without prior assumptions or experimental controls. When tested on 19 ChIP-seq, CUT&RUN, ATAC-seq, and snATAC-seq datasets, Chrom-Sig effectively decomposed the data into signal and noise components. Notably, Chrom-Sig performs de-noising and peak calling in 1-2 h using around 20 GB of memory. The de-noised signal is biologically meaningful: CTCF CUT&RUN data retained a high percentage of peaks overlapping CTCF binding motifs, while ATAC-seq and RNA Polymerase II data were enriched in enhancers and promoters. We envision Chrom-Sig as a versatile and general tool for current and future genomic technologies. AVAILABILITY AND IMPLEMENTATION: Chrom-Sig is publicly available on GitHub (https://github.com/minjikimlab/chromsig) and Zenodo (doi: 10.5281/zenodo.17488772) under the MIT licence.