Abstract
This study examines the association between inflammatory biomarkers and endometriosis risk and develops a predictive model using data from the National Health and Nutrition Examination Survey (NHANES, 1999-2006). The dataset included 4,089 female participants with complete hematological inflammatory indicators and covariates. Machine-learning models were developed on variables such as age, race, monocyte percentage (MONO%), platelet count (PLT), body mass index (BMI), platelet-to-albumin ratio (PAR), neutrophil percentage-to-albumin ratio (NPAR), lymphocyte-to-monocyte ratio (LMR), and educational attainment, and SHapley Additive exPlanations (SHAP) was used to interpret their predictions. The eXtreme Gradient Boosting (XGBoost) model outperformed the other models and showed strong classification performance. SHAP analysis ranked these same variables (age, race, MONO%, PLT, BMI, PAR, NPAR, LMR, and educational attainment) as the most influential predictors. The model was then deployed as an interactive web-based tool for real-time risk prediction. These findings highlight the value of combining demographic data with inflammatory biomarkers in endometriosis risk modeling. Because the biomarkers come from routine blood tests that are non-invasive and inexpensive, this approach may help shorten diagnostic delays. Although the XGBoost model performed well, validation in larger and more diverse populations is needed to confirm its generalizability and clinical utility.
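The ratio-based biomarkers named above (PAR, NPAR, LMR) are simple quotients of routine blood values. As an illustration only, the following minimal Python sketch computes them for one participant. The `BloodPanel` class, its field names, and the units are assumptions made for this example, and LMR is taken here as a ratio of absolute counts; the exact units and any scaling factors used in the study are not stated in this abstract.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    """Routine blood values for one participant (field names and units are assumed)."""
    platelets: float       # platelet count, assumed 10^3 cells/uL
    neutrophil_pct: float  # neutrophils as a percentage of white blood cells
    lymphocytes: float     # lymphocyte count, assumed 10^3 cells/uL
    monocytes: float       # monocyte count, assumed 10^3 cells/uL
    albumin: float         # serum albumin, assumed g/dL


def inflammatory_ratios(panel: BloodPanel) -> dict[str, float]:
    """Return PAR, NPAR, and LMR as plain quotients of the named quantities."""
    if panel.albumin <= 0 or panel.monocytes <= 0:
        raise ValueError("albumin and monocyte count must be positive")
    return {
        "PAR": panel.platelets / panel.albumin,        # platelet-to-albumin ratio
        "NPAR": panel.neutrophil_pct / panel.albumin,  # neutrophil %-to-albumin ratio
        "LMR": panel.lymphocytes / panel.monocytes,    # lymphocyte-to-monocyte ratio
    }


# Hypothetical values chosen for easy arithmetic, not taken from the study.
example = BloodPanel(platelets=250.0, neutrophil_pct=60.0,
                     lymphocytes=2.0, monocytes=0.5, albumin=4.0)
ratios = inflammatory_ratios(example)
print(ratios)
```

Derived features of this kind would sit alongside the demographic variables (age, race, BMI, educational attainment) in the feature matrix passed to a classifier.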