Abstract
This study developed a machine learning model to predict stillbirth using retrospective data from 32,953 singleton pregnancies at multiple centers in South Korea. Variables were collected at baseline, E1 (before 13 weeks of pregnancy), and T0 (before 28 weeks of pregnancy). Each cohort (all stillbirths, early stillbirths, late stillbirths) was randomly divided into training and test sets at a 7:3 ratio. The Extreme Gradient Boosting (XGBoost) algorithm was used to build original models with the full set of variables and simplified models with variables selected by Shapley additive explanations (SHAP) values. The prediction model for all stillbirths in the whole cohort achieved an area under the receiver operating characteristic curve (AUC) of 0.720 (baseline) and 0.740 (E1), with an area under the precision-recall curve (AUPR) of 0.016 (baseline) and 0.019 (E1). The prediction models for early and late stillbirths achieved similar results. The original model for late stillbirth at T0 achieved an AUC of 0.781 and an AUPR of 0.015. A simplified model for late stillbirth at T0 using top-ranked SHAP variables demonstrated similar performance, with an AUC of 0.759 and an AUPR of 0.133. Our machine learning model, which predicts late stillbirth in East Asian women with singleton pregnancies from variables available before 28 gestational weeks, might be helpful for evaluating individual risk.
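The two metrics reported above can be illustrated with a minimal sketch. It is not the study's code: the function names, the toy labels, and the toy scores are all hypothetical, and AUPR is approximated here by step-wise average precision (ties between scores are not handled specially). With a rare outcome, a model that ranks at random is expected to reach an AUPR close to the outcome prevalence, so AUPR is usually read against that baseline, whereas the chance level for AUC is 0.5.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at the rank of each positive case.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 2 positive cases among 10 (prevalence 0.2).
y_toy = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
s_toy = [0.10, 0.40, 0.35, 0.80, 0.20, 0.30, 0.70, 0.05, 0.50, 0.15]

prevalence = sum(y_toy) / len(y_toy)      # 0.2, the chance-level AUPR
auc = roc_auc(y_toy, s_toy)               # 0.75
aupr = average_precision(y_toy, s_toy)    # 0.45
print(f"prevalence={prevalence:.2f} AUC={auc:.2f} AUPR={aupr:.2f}")
```

In this toy example the AUPR of 0.45 is a little more than twice the prevalence of 0.2, which is the kind of comparison that gives an AUPR value its meaning when the outcome is rare.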