Abstract
BACKGROUND: This study aims to identify characteristic genes linked to vascular smooth muscle cells (VSMCs) and to intracranial aneurysm (IA) formation and rupture, providing insights for early diagnosis, risk prediction, and treatment of IA.
METHODS: We analyzed the GSE75436 and GSE13353 datasets from the GEO database, performing differential expression analysis with thresholds of |log2FC| > 1 and adjusted p < 0.05. Gene Ontology (GO), KEGG, and Reactome pathway enrichment analyses were conducted. Using the GSE122897 dataset, Weighted Gene Co-expression Network Analysis (WGCNA) identified gene modules and hub genes associated with IA. Machine learning algorithms (LASSO, SVM-RFE, and Random Forest) were applied to select differentially expressed hub genes. The expression patterns of marker genes were explored using the GSE193533 single-cell RNA sequencing dataset. Lastly, we assessed the predictive value of these genes for IA formation and rupture by plotting Receiver Operating Characteristic (ROC) curves and calculating the Area Under the Curve (AUC) to evaluate their sensitivity and specificity for clinical diagnostics.
RESULTS: We identified 1000 upregulated and 806 downregulated genes in IA compared to normal arteries, and 405 upregulated and 468 downregulated genes in ruptured IA tissue. WGCNA revealed key gene modules associated with IA. Machine learning identified the genes COL5A2, CDH11, PLOD1, P3H4, PLIN2, and PLAUR. Single-cell analysis showed a phenotypic transition in VSMCs from contractile to synthetic, with these genes predominantly expressed in synthetic VSMCs. ROC analysis indicated excellent diagnostic performance of these genes for IA formation and rupture.
CONCLUSION: COL5A2, CDH11, PLOD1, and P3H4 were identified as genes associated with IA formation, while PLIN2 and PLAUR were linked to IA rupture. These genes are potential biomarkers for IA diagnosis and risk assessment.
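As a rough illustration of two steps named in METHODS, the sketch below applies the stated differential expression cutoffs (|log2FC| > 1, adjusted p < 0.05) and computes an AUC from a single marker's scores. It is a minimal sketch, not the authors' pipeline: the function names `filter_deg` and `auc`, the placeholder gene labels, and all numeric values are hypothetical, and the AUC is computed with the standard pairwise-ranking (Mann-Whitney) formulation rather than any specific package.

```python
from typing import Dict, List, Tuple


def filter_deg(
    stats: Dict[str, Tuple[float, float]],
    lfc_cut: float = 1.0,
    padj_cut: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split genes into up/down lists using |log2FC| > lfc_cut and adj p < padj_cut.

    `stats` maps a gene label to (log2 fold change, adjusted p-value).
    """
    up = sorted(g for g, (lfc, padj) in stats.items()
                if padj < padj_cut and lfc > lfc_cut)
    down = sorted(g for g, (lfc, padj) in stats.items()
                  if padj < padj_cut and lfc < -lfc_cut)
    return up, down


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy input: placeholder labels, made-up statistics.
toy_stats = {
    "GENE_A": (2.3, 0.001),   # passes both cutoffs, upregulated
    "GENE_B": (-1.8, 0.01),   # passes both cutoffs, downregulated
    "GENE_C": (0.4, 0.0005),  # significant but fold change too small
    "GENE_D": (1.5, 0.2),     # large fold change but not significant
}
up_genes, down_genes = filter_deg(toy_stats)

# Hypothetical expression scores for one marker; label 1 = case, 0 = control.
toy_scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

With these toy inputs, only `GENE_A` and `GENE_B` survive the filter, and eight of the nine case-control pairs are ranked correctly, so the AUC is 8/9. In practice the fold changes and adjusted p-values would come from a dedicated differential expression tool, and the per-sample scores from the expression matrix of the selected genes.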
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12883-026-04720-z.