Abstract
Growing epidemiological evidence suggests a bidirectional relationship between gastroesophageal reflux disease (GERD) and ischemic stroke (IS), yet the shared molecular mechanisms remain poorly characterized. This study aimed to identify common biomarkers and elucidate the pathogenic links between GERD and IS using integrative bioinformatics and machine learning approaches. Transcriptomic datasets for GERD (GSE26886 and GSE39491) and IS (GSE22255 and GSE58294) were obtained from the Gene Expression Omnibus. Batch effects were corrected using ComBat, and shared differentially expressed genes were identified with limma. Functional enrichment analyses (Gene Ontology and Kyoto Encyclopedia of Genes and Genomes) were performed to identify the pathways involved. Key hub genes were selected using 3 machine learning algorithms: least absolute shrinkage and selection operator, support vector machine with recursive feature elimination, and random forest. Diagnostic utility was assessed through receiver operating characteristic curve analysis. We identified 52 upregulated and 57 downregulated differentially expressed genes common to both diseases. Enriched pathways included IL-17 signaling, glycosphingolipid biosynthesis, and PI3K-Akt signaling. Integrating the 3 machine learning algorithms yielded 9 hub genes (FAM46C, FUT4, ODC1, UQCRB, ID2, TSC22D1, IL17RB, AHR, and MGAT4B) with consistent dysregulation in GERD and IS. These genes demonstrated high diagnostic accuracy, with combined area under the curve values between 0.9 and 1.0 across validation cohorts. IL17RB and FUT4 were notably upregulated, suggesting roles in inflammatory and glycosylation pathways. Our findings reveal convergent molecular pathways and potential diagnostic biomarkers linking GERD and IS. The identified hub genes may serve as dual-purpose therapeutic targets for mitigating shared inflammatory and vascular mechanisms. Further experimental validation is needed to confirm their clinical relevance.
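
The two final steps described above, combining the outputs of several feature selectors and scoring a candidate gene by area under the ROC curve, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: that "integration" means a plain intersection of the gene lists returned by each algorithm, and that a single gene's expression value is used directly as the diagnostic score. The function names `consensus_genes` and `roc_auc`, the placeholder genes `GENE_X`, `GENE_Y`, `GENE_Z`, and all numeric values are hypothetical toy inputs, not study data.

```python
from typing import Iterable, List, Sequence


def consensus_genes(*selections: Iterable[str]) -> List[str]:
    """Return the genes chosen by every selector, sorted alphabetically.

    Assumption: hub genes are the plain intersection of the selector outputs.
    """
    gene_sets = [set(selection) for selection in selections]
    return sorted(set.intersection(*gene_sets))


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen case (label 1) scores
    higher than a randomly chosen control (label 0); ties count as half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical selector outputs (toy lists, not the study's results).
lasso = ["IL17RB", "FUT4", "AHR", "GENE_X"]
svm_rfe = ["IL17RB", "FUT4", "AHR", "GENE_Y"]
random_forest = ["FUT4", "IL17RB", "AHR", "GENE_Z"]
hubs = consensus_genes(lasso, svm_rfe, random_forest)

# Toy expression values for one gene: 3 cases (1) and 3 controls (0).
labels = [1, 1, 1, 0, 0, 0]
expression = [7.9, 8.4, 6.1, 5.2, 6.5, 4.8]
auc = roc_auc(labels, expression)  # 8 of 9 case-control pairs are ordered correctly
```

In this toy run, `hubs` keeps only the three genes present in all lists, and `auc` is 8/9 because one case scores below one control. A multi-gene "combined" AUC would need a fitted model (for example logistic regression) to produce a single score per sample; that step is omitted here.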