Abstract
Increasing evidence underscores the driving role of coding and non-coding variants in cancer development. Analyzing gene sets in biological processes offers deeper insights into the molecular mechanisms of carcinogenesis. Here, we developed geMER to identify candidate driver genes genome-wide by detecting mutation enrichment regions within coding and non-coding elements. We subsequently designed a pipeline to identify a core driver gene set (CDGS) that broadly promotes carcinogenesis across multiple cancers. The CDGS, comprising 25 genes for 25 cancers, displayed instability in DNA aberrations. Variants within the TTN enrichment region may influence the folding of the I-set domain by altering the local polarity or side-chain chemical properties of amino acids, potentially disrupting its antigen-binding capacity in lung adenocarcinoma (LUAD). Multi-omics analysis identified APOB as a candidate oncogene in liver hepatocellular carcinoma (LIHC); genetic alterations within its enrichment region may activate key transcription factors, upregulate DNA methylation levels, modulate critical histone modifications, and enhance transcriptional activity in the HepG2 and A549 cell lines compared with Panc1. Additionally, CDGS mutation status was an independent prognostic factor for the pan-cancer cohort. High-risk patients tended to develop an immunosuppressive microenvironment and demonstrated a higher likelihood of responding to immune checkpoint inhibitor (ICI) therapy. Finally, we provide a user-friendly web interface for exploring candidate driver genes with geMER (http://bio-bigdata.hrbmu.edu.cn/geMER/).