Abstract
Tissue-specific gene regulation in mammals involves the coordinated binding of multiple transcription factors (TFs). Using the forebrain as a model, we investigated the syntax of TF occupancy to determine tissue-specific enhancer regions. We analyzed forebrain-exclusive enhancers from the VISTA Enhancer Browser and a curated set of 23 TFs relevant to forebrain development and disease. Our findings revealed multiple distinct patterns of combinatorial TF binding, with the HES5-FOXP2-GATA3 triad being the most frequent in forebrain-specific enhancers. This syntactic structure was detected in 2614 enhancers from a genome-wide catalog of 25,000 predicted human forebrain enhancers. Notably, this catalog represents a computationally predicted dataset, distinct from the in vivo validated set of enhancers obtained from the VISTA Enhancer Browser. The shortlisted 2614 enhancers were further analyzed using genome-wide epigenetic data and evaluated for evolutionary conservation and disease relevance. Our findings highlight the value of these 2614 enhancers in forebrain-specific gene regulation and provide a framework for discovering tissue-specific enhancers, enhancing the understanding of enhancer function.