Abstract
BACKGROUND: Preeclampsia (PE) is a pregnancy-specific syndrome with unclear pathogenesis. Emerging evidence suggests that ferroptosis-related molecular pathways may be associated with the oxidative stress and placental dysfunction observed in PE. In this study, we developed a PE classification model based on ferroptosis-related genes using machine learning and identified potential biomarkers.
METHODS: We downloaded bulk and single-cell transcriptomic data of PE and normal placental tissues from the Gene Expression Omnibus (GEO) database. Ferroptosis-related genes were identified using weighted gene co-expression network analysis (WGCNA), differential expression analysis, and LASSO regression. Based on these genes, we developed a machine learning model to predict PE. Key marker genes were further selected using the random forest algorithm and validated in single-cell transcriptomic data.
RESULTS: A total of 27 ferroptosis-related genes associated with PE were identified by intersecting 1,434 differentially expressed genes (DEGs) in PE, 268 known ferroptosis-related genes, and 1,151 PE-related genes. LASSO regression further selected 11 key genes with potential predictive value. Among them, BACH1 was significantly upregulated in PE and showed the strongest predictive performance as a single gene in the blood sample dataset (GSE48424). The machine learning classification model based on the selected genes exhibited strong discriminative performance, with the random forest classifier achieving an AUC of 0.87 in the test dataset (GSE75010). In the independent validation datasets (GSE149437, GSE25906, and GSE48424), the AUC values were 0.70, 0.78, and 0.81, respectively. Validation using the single-cell dataset GSE173193 confirmed that BACH1 was predominantly expressed in neutrophils from PE patients.
CONCLUSIONS: This study established a transcriptomic classification model for preeclampsia based on ferroptosis-related genes and identified BACH1 as a ferroptosis-related gene associated with preeclampsia, with potential as a blood-based biomarker pending prospective validation.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12884-026-08884-x.