Abstract
MOTIVATION: Chromatin domain annotation identifies functional genomic regions, such as active and inactive zones, from epigenomic features like histone modifications, DNA methylation, and chromatin accessibility. Recent methods use both chromatin interaction data (e.g. Hi-C) and epigenomic data, but they often overlook the direct relationship between the two data types.

RESULTS: We introduce Chromatin Domain Annotation using Contrastive Learning for Hi-C and Epigenomic Data (CDACHIE), a method for identifying chromatin domains from Hi-C and epigenomic data. CDACHIE uses contrastive learning to generate, for each genomic bin, a pair of aligned representation vectors, one per data type. The two vectors of each bin are concatenated, and the concatenated vectors are clustered with K-means to assign each bin to a chromatin domain type. CDACHIE achieves superior Variance Explained when evaluated against gene expression, replication timing, and ChIA-PET data, which indicates that it captures the semantic associations between Hi-C and epigenomic features in the embedding space.

AVAILABILITY AND IMPLEMENTATION: The source code is available on GitHub: https://github.com/maruyama-lab-design/CDACHIE. An archival snapshot of the code used in this study is available on Zenodo: https://doi.org/10.5281/zenodo.15751780.
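
As an illustration only, the align, concatenate, and cluster steps described above can be sketched in plain Python. The sketch assumes a symmetric InfoNCE (CLIP-style) objective, which is one common contrastive loss; the abstract does not state the loss, the encoders, or the hyperparameters, so the function names, the temperature value, and the toy four-bin vectors below are all hypothetical. The repository linked above should be consulted for the actual implementation.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so dot products act as cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(hic, epi, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed objective, not confirmed by the source).

    Bin i's Hi-C vector is the positive for bin i's epigenomic vector;
    every other bin in the batch serves as a negative.
    """
    hic = [normalize(v) for v in hic]
    epi = [normalize(v) for v in epi]
    # logits[i][j]: scaled similarity between Hi-C bin i and epigenomic bin j
    logits = [[sum(a * b for a, b in zip(h, e)) / temperature for e in epi]
              for h in hic]

    def cross_entropy(rows):
        # Mean of -log softmax(row)[i], with the matching bin i as the target.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy(logits) + cross_entropy(transposed))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep empty clusters as-is.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical embeddings for four genomic bins (two "active", two "inactive").
hic_vecs = [[1.0, 0.1], [0.9, 0.2], [-1.0, 0.1], [-0.9, -0.1]]
epi_vecs = [[0.8, 0.0], [1.0, 0.3], [-0.7, 0.2], [-1.0, 0.0]]

# The loss is low when the two views of each bin agree and high when the
# pairing is scrambled (here, by reversing the epigenomic bin order).
aligned_loss = info_nce(hic_vecs, epi_vecs)
shuffled_loss = info_nce(hic_vecs, epi_vecs[::-1])

# Concatenate the two vectors of each bin, then cluster the joint vectors.
joint = [h + e for h, e in zip(hic_vecs, epi_vecs)]
labels = kmeans(joint, k=2)
```

In a full pipeline the vectors would come from trained encoders rather than being written by hand, and minimizing a loss of this kind is what would pull the two views of each bin together before clustering.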