Abstract
Changes in the three-dimensional (3D) structure of the human genome are associated with various conditions, such as cancer and developmental disorders. Chromatin conformation capture techniques such as Hi-C have been developed to study these global 3D structures, but bulk Hi-C typically requires millions of cells and an extremely high sequencing depth (around 1 billion reads per sample). In contrast, single-cell Hi-C (scHi-C) captures 3D structures at the individual cell level but suffers from severe data sparsity, characterized by a high proportion of zeros. scHi-C data enable the identification of cell types with distinct 3D structures; consequently, identifying differential chromatin interactions between such groups may offer insights into cell type-specific regulation. Although differential analysis methods exist for bulk Hi-C data, methods for scHi-C data remain limited. To address this gap, we developed a method for differential scHi-C analysis that extends the HiCcompare R package. Our approach optionally imputes sparse scHi-C data while accounting for genomic distance and creates pseudo-bulk Hi-C matrices by summing the data within each condition. The data are normalized using locally estimated scatterplot smoothing (LOESS) regression, and differential chromatin interactions are detected via Gaussian Mixture Model (GMM) clustering. Our workflow outperforms existing methods in identifying differential chromatin interactions across various genomic distances, fold changes, resolutions, and sample sizes in both simulated and experimental settings. It enables the effective detection of cell type-specific differences in chromatin structure, and the detected differences show expected associations with biological and epigenetic features. Our method is implemented in the scHiCcompare R package, available at https://bioconductor.org/packages/scHiCcompare.
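As a rough illustration of the pseudo-bulk step described above, the following Python sketch sums per-cell contact counts within each condition and computes a log2 ratio for every bin pair. It is not the scHiCcompare implementation: the function names, the dictionary-based matrix representation, and the pseudocount are all assumptions made for this example, and centering on the per-distance median is a deliberately simplified stand-in for LOESS normalization. Imputation and GMM clustering are omitted.

```python
from collections import defaultdict
from math import log2
from statistics import median


def pseudo_bulk(cells):
    """Sum per-cell contact counts into one pseudo-bulk matrix.

    Each cell is a dict mapping a bin pair (i, j), with i <= j, to a count.
    (Hypothetical representation chosen for this sketch.)
    """
    total = defaultdict(int)
    for cell in cells:
        for pair, count in cell.items():
            total[pair] += count
    return dict(total)


def centered_log_ratios(bulk_a, bulk_b, pseudocount=1):
    """Return the log2 ratio of B over A for each bin pair,
    centered on the median ratio at the same genomic distance.

    Median centering is a simplified stand-in for LOESS, not LOESS itself.
    The pseudocount (an assumption) avoids division by zero.
    """
    ratios = {}
    by_distance = defaultdict(list)
    for pair in set(bulk_a) | set(bulk_b):
        a = bulk_a.get(pair, 0) + pseudocount
        b = bulk_b.get(pair, 0) + pseudocount
        ratios[pair] = log2(b / a)
        by_distance[pair[1] - pair[0]].append(ratios[pair])
    centers = {d: median(values) for d, values in by_distance.items()}
    return {pair: r - centers[pair[1] - pair[0]] for pair, r in ratios.items()}


# Toy data: two cells per condition, bins indexed 0..3.
group_a = [{(0, 1): 2, (1, 2): 1}, {(0, 1): 1, (2, 3): 1}]
group_b = [{(0, 1): 3, (1, 2): 1}, {(1, 2): 8, (2, 3): 1}]

bulk_a = pseudo_bulk(group_a)
bulk_b = pseudo_bulk(group_b)
m_values = centered_log_ratios(bulk_a, bulk_b)
```

In this toy input, only the pair (1, 2) differs between the two pseudo-bulk matrices, so it is the only pair with a nonzero centered ratio. A full workflow would replace the median centering with a LOESS fit over genomic distance and pass the normalized values to a clustering step.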