Abstract
Colorectal adenomas (CRA) are critical precursors of colorectal cancer (CRC), yet reliable transcriptomic biomarkers for early detection and therapeutic targeting remain limited. Integrating gut microbiota (GM) genetics with transcriptomics offers a novel approach to identifying disease-associated molecular signatures. We sought to identify GM-associated molecular signatures that could serve as early intervention targets. We integrated transcriptomic data with Mendelian randomization (MR) analysis to assess causal relationships between GM and CRA development. Machine learning algorithms identified robust biomarkers, which we validated through expression analysis and receiver operating characteristic (ROC) analysis and then used to construct predictive nomogram models. Comprehensive molecular characterization included Gene Set Enrichment Analysis (GSEA), immune profiling, and regulatory network analysis. Single-cell RNA sequencing (scRNA-seq) analysis further validated biomarker expression patterns across distinct cell populations in the tumor microenvironment. We identified 12 GM species with significant causal relationships to CRA risk. Two biomarkers, TMOD2 and DOCK4, emerged as powerful predictive indicators with strong correlation (r = 0.66, p < 0.001). These biomarkers demonstrated excellent diagnostic performance in ROC analysis and revealed previously unrecognized connections to cell adhesion pathways critical for adenoma progression. Single-cell analysis showed TMOD2 expression across multiple cell clusters, with the notable exception of mast cells, whereas DOCK4 expression was predominantly restricted to fibroblasts, myeloid cells, and epithelial cells. We also identified distinct immune cell infiltration patterns, including altered naive B cell and macrophage populations, suggesting immune dysregulation as a key mechanism. GSEA revealed enrichment in cell adhesion molecule (CAM) pathways.
Regulatory network analysis uncovered complex regulation by 18 microRNAs (miRNAs), 40 long noncoding RNAs (lncRNAs), and 10 transcription factors (TFs), with EIF3A emerging as a key m6A reader protein. Drug screening identified 22 potential therapeutic compounds, of which trichostatin A showed the most favorable binding affinity. These findings establish TMOD2 and DOCK4 as novel biomarkers linking GM dysbiosis to CRA development, opening new avenues for microbiome-targeted early intervention strategies.