Abstract
BACKGROUND: Endometriosis (EM), a prevalent gynecological disorder in women of reproductive age, lacks reliable noninvasive diagnostic tools. Neutrophil extracellular traps (NETs), which are essential to inflammation and immune regulation, may provide markers for detecting EM. This study used 13 machine learning algorithms to improve the predictive accuracy of a diagnostic model and identified four potential biomarkers that could aid in the diagnosis of EM. METHODS: Using the GSE141549 dataset, we combined differentially expressed genes with NETs-related markers to identify key genes linked to EM, and performed functional analysis to characterize their biological roles. Using 13 machine learning algorithms, we built 107 models and selected four central genes: CEACAM1, FOS, PLA2G2A, and THBS1. We then developed a diagnostic model and evaluated its performance with receiver operating characteristic (ROC) curves, calibration plots, and decision curve analysis; its robustness was assessed by 10-fold cross-validation. RESULTS: The four-gene model performed well, with high discrimination in the training set (AUC: 0.962) and robust generalizability in cross-validation (mean AUC: 0.975) and in external cohorts. Its clinical utility was confirmed, and expression of the four genes was consistent across multiple datasets. CONCLUSION: This study identifies CEACAM1, FOS, PLA2G2A, and THBS1 as promising biomarkers for EM. Their association with immune infiltration may improve early detection strategies, highlighting the potential for developing noninvasive diagnostic tools for EM.
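To make the evaluation step concrete, the following minimal sketch shows how a mean AUC over 10-fold cross-validation can be computed for a four-gene panel. It is illustrative only: the expression values are synthetic, the nearest-centroid scorer is a simple stand-in that is not claimed to be one of the 13 algorithms used in the study, and all function names (`auc`, `stratified_folds`, `centroid_score`, `cross_validated_auc`) are hypothetical.

```python
import random
from statistics import mean

GENES = ["CEACAM1", "FOS", "PLA2G2A", "THBS1"]  # the four-gene panel


def auc(scores, labels):
    """Rank-based AUC: chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, keeping the case/control ratio roughly equal."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid_score(train_X, train_y, x):
    """Stand-in classifier (assumption): higher score means closer to the case centroid."""
    def centroid(cls):
        rows = [r for r, y in zip(train_X, train_y) if y == cls]
        return [mean(col) for col in zip(*rows)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return dist(x, centroid(0)) - dist(x, centroid(1))


def cross_validated_auc(X, y, k=10, seed=0):
    """Mean AUC over k held-out folds; the scorer is refit on the other k-1 folds each time."""
    rng = random.Random(seed)
    fold_aucs = []
    for fold in stratified_folds(y, k, rng):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        scores = [centroid_score(train_X, train_y, X[i]) for i in fold]
        fold_aucs.append(auc(scores, [y[i] for i in fold]))
    return mean(fold_aucs)


# Synthetic expression matrix (NOT study data): 50 cases, 50 controls, 4 genes.
data_rng = random.Random(42)
y = [1] * 50 + [0] * 50
X = [[data_rng.gauss(1.0 if label else 0.0, 1.0) for _ in GENES] for label in y]

cv_auc = cross_validated_auc(X, y)
print(f"mean 10-fold AUC on synthetic data: {cv_auc:.3f}")
```

Because each fold is scored by a model fitted without that fold's samples, the mean fold AUC estimates out-of-sample discrimination rather than fit to the training data. The value printed here depends entirely on the synthetic data and says nothing about the reported AUCs.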