Abstract
INTRODUCTION: Institutional delivery dropout (IDD), defined as delivery outside a health facility despite attending antenatal care (ANC), remains a significant barrier to reducing maternal mortality in Nigeria. Traditional statistical models often fail to capture the complex, non-linear interactions among the socio-demographic factors that drive this behavior. METHODS: Using data on 16,100 women from the 2018 Nigeria Demographic and Health Survey (NDHS), we trained and compared seven machine learning (ML) algorithms, including Support Vector Machine (SVM), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost). Model performance was evaluated using metrics including accuracy, Area Under the Receiver Operating Characteristic curve (AUROC), F1-score, and confusion matrices. SHapley Additive exPlanations (SHAP) were used to interpret feature importance and each feature's contribution to the predictions. RESULTS: Gradient Boosting was the best-performing model, achieving the highest F1-score (0.755) and AUROC (0.82). SVM achieved the highest accuracy (0.740) and recall (0.780). SHAP identified education level, household wealth, and religion as strong predictors of IDD. Performance metrics, reported with confidence intervals, showed modest variability across the models. CONCLUSION: Machine learning approaches were effective in identifying women at increased risk of institutional delivery dropout. SHAP analysis provided insight into the key socio-demographic predictors of IDD, highlighting the value of interpretable ML methods in maternal health research.
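
For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be derived from a binary confusion matrix. It is illustrative only and is not the study's analysis code: the `ConfusionMatrix` class and the counts are hypothetical and do not come from the NDHS data. It also shows why the model with the highest accuracy need not be the model with the highest F1-score, since F1 depends on precision as well as recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; the positive class is taken to be IDD (assumption)."""
    tp: int  # dropout correctly predicted
    fp: int  # dropout predicted, but the woman delivered in a facility
    fn: int  # dropout missed by the model
    tn: int  # facility delivery correctly predicted

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only (not results from the study).
cm = ConfusionMatrix(tp=80, fp=30, fn=20, tn=70)
print(f"accuracy={cm.accuracy:.3f} precision={cm.precision:.3f} "
      f"recall={cm.recall:.3f} f1={cm.f1:.3f}")
```

With these made-up counts, accuracy is 0.750 and recall is 0.800, while F1 is about 0.762 because precision (about 0.727) pulls the harmonic mean down.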