Abstract
Chromatin organization shapes gene regulation by linking distal elements across megabase scales, yet most predictive genomics models still treat the genome as a linear sequence and ignore its three-dimensional structure. Hi-C provides genome-wide chromatin conformation information, but its contact maps are population-averaged, distance-biased, and noisy, which obscures biologically specific contacts. We present CHROME, a framework that uses a self-avoiding polymer ensemble as a null model to identify physically specific, non-random Hi-C contacts. By building graph representations from these contacts, CHROME enables efficient information transfer across spatially connected loci. Its graph attention architecture takes sequence, chromatin accessibility, or pre-trained embeddings as input and predicts cell line-specific ChIP-seq profiles, consistently outperforming local encoder baselines and generalizing to an unseen cell line. The resulting graph embeddings also improve prediction of tissue-specific eQTLs and ClinVar variant pathogenicity relative to local sequence-based embeddings. Beyond predictive performance, CHROME provides interpretability through attention-derived neighbor-to-center contributions, which reveal how spatially connected loci influence local regulatory activity over multi-megabase distances. Together, these results show that incorporating physically validated chromatin interactions enables more accurate and interpretable modeling of gene regulation and variant effects.
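
As a rough illustration of the two ideas summarized above (filtering contacts against a null ensemble, then aggregating features over the resulting graph with attention), the following minimal sketch may help. It is not CHROME's implementation. The function names, the empirical p-value threshold `alpha`, and the use of scaled dot-product attention (rather than whatever attention form the model actually uses) are all assumptions made for illustration, and the null samples are assumed to have been generated elsewhere, for example by a polymer simulation.

```python
import numpy as np

def specific_contacts(observed, null_samples, alpha=0.05):
    """Return a boolean adjacency matrix of contacts that the null rarely matches.

    observed:     (n, n) symmetric contact matrix between loci.
    null_samples: (s, n, n) contact matrices drawn from a null ensemble
                  (hypothetical input; how they are simulated is out of scope).
    """
    # Empirical p-value with a +1 correction: how often does the null
    # produce a contact at least as strong as the observed one?
    exceed = (null_samples >= observed[None, :, :]).sum(axis=0)
    p_values = (exceed + 1) / (null_samples.shape[0] + 1)
    adj = p_values < alpha
    adj = adj | adj.T              # keep the graph undirected
    np.fill_diagonal(adj, False)   # no self-contacts as edges
    return adj

def attention_step(features, adj):
    """One scaled dot-product attention aggregation restricted to graph edges.

    Returns the updated features and the (n, n) attention weights, where
    weights[i, j] is the contribution of neighbor j to center locus i.
    """
    dim = features.shape[1]
    scores = features @ features.T / np.sqrt(dim)
    # Each locus attends to its neighbors and to itself.
    mask = adj | np.eye(adj.shape[0], dtype=bool)
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ features, weights
```

In this sketch, the rows of `weights` play the role of neighbor-to-center contributions: non-neighbors receive exactly zero weight, so any influence on a locus can be traced to itself or to loci it contacts in the filtered graph.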