Abstract
BACKGROUND: Cis-regulatory modules (CRMs) such as enhancers and silencers play critical roles in virtually all biological processes by enhancing and repressing, respectively, the transcription of their target genes in specific cell types. Although numerous CRMs have been predicted in genomes, identifying their target genes remains a challenge because the predicted CRMs are often of low quality and because CRMs often do not regulate their closest genes. RESULTS: We developed a method, correlation and physical proximity (CAPP), that leverages our recently predicted 1.2 M CRMs in the human genome. CAPP predicts not only the target genes of CRMs but also their functional types, using only chromatin accessibility (CA) and RNA-seq data in a panel of cell/tissue types plus Hi-C data in a few cell types. Applying CAPP to a panel of only 107 cell/tissue types with CA and RNA-seq data available, we predict target genes for 14.3% of the 1.2 M CRMs, of which 1.4% are predicted as both enhancers and silencers (dual-functional CRMs), 98.2% as exclusive enhancers, and 0.4% as exclusive silencers. Dual-functional CRMs tend to regulate more distant genes than exclusive enhancers and exclusive silencers do. Enhancers tend to cooperate with other enhancers, whereas silencers typically act independently. Silencers preferentially regulate genes expressed across many cell/tissue types, whereas enhancers tend to regulate genes expressed in fewer cell/tissue types. CONCLUSIONS: CAPP represents a significant advance in predicting the target genes and functional types of CRMs, especially dual-functional CRMs, and the different types of CRMs show distinct properties.
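
The idea behind combining correlation with physical proximity can be illustrated with a minimal sketch. This is not the CAPP implementation. It assumes that a CRM-gene pair is considered only when Hi-C data support a physical contact, and that the sign of the Pearson correlation between the CRM's CA and the gene's expression across the panel suggests the functional type (positive for enhancer, negative for silencer). The function names, the `r_cutoff` value of 0.5, and the rule used to label a CRM as dual-functional are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)


def classify_pair(ca, expr, in_contact, r_cutoff=0.5):
    """Label one CRM-gene pair (hypothetical rule, not the published one).

    ca         -- CA signal of the CRM, one value per cell/tissue type
    expr       -- expression of the gene in the same cell/tissue types
    in_contact -- True if Hi-C data support CRM-gene physical proximity
    r_cutoff   -- assumed correlation threshold, chosen for illustration
    """
    if not in_contact:
        return "none"
    r = pearson(ca, expr)
    if r >= r_cutoff:
        return "enhancer"
    if r <= -r_cutoff:
        return "silencer"
    return "none"


def crm_type(pair_labels):
    """Summarize a CRM from the labels of all its CRM-gene pairs."""
    labels = set(pair_labels)
    if {"enhancer", "silencer"} <= labels:
        return "dual"
    if "enhancer" in labels:
        return "exclusive enhancer"
    if "silencer" in labels:
        return "exclusive silencer"
    return "unassigned"
```

For example, a CRM whose CA rises with the expression of one contacted gene and falls with that of another would be labeled `"dual"` under these assumed rules, while a CRM with no supported contact stays `"unassigned"` regardless of correlation.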