Abstract
BACKGROUND: Hypertensive disorders in pregnancy (HDP) include gestational hypertension, preeclampsia, and eclampsia. Not all cases of gestational hypertension or mild preeclampsia progress to severe disease. Once they progress to severe preeclampsia (SPE), however, the risks to both the mother and the fetus increase significantly. We aimed to establish a nomogram and train a machine learning (ML) model that could identify SPE early in the course of HDP.
METHODS: In this retrospective study, 593 patients with HDP were enrolled in the training cohort. To predict SPE early, six supervised ML models were employed: XGBoost, K-nearest neighbors (KNN), random forest (RF), LightGBM (LGBM), support vector machine (SVM), and decision tree (DT). The models were evaluated by accuracy (ACC) and the area under the receiver operating characteristic curve (AUC). A nomogram was also established, and its predictive ability was assessed by AUC, the calibration curve, and decision curve analysis (DCA). The nomogram and the ML models were validated in a validation cohort of 255 patients with HDP.
RESULTS: In the training cohort, the nomogram achieved an AUC of 0.934 and a calibration Brier score of 0.083, and was clinically applicable over a threshold probability range of 5%-95%. In the validation cohort, it achieved an AUC of 0.882 and a calibration Brier score of 0.115, and was clinically applicable over a threshold probability range of 10%-95%. In the validation cohort, the AUCs of the XGBoost, KNN, RF, LGBM, SVM, DT, and multivariate logistic regression models were 0.876, 0.822, 0.866, 0.866, 0.871, 0.784, and 0.847, respectively; the XGBoost model showed the highest AUC among them.
CONCLUSIONS: This study demonstrates that a family history of hypertension, urine protein, umbilical artery S/D ratio, WBC, TBIL, UA, LDL, TG, CRP, and blood Ca are predictors of HDP progression to SPE. A nomogram for predicting the progression of HDP to SPE was constructed from these predictors. The nomogram exhibited good discrimination, calibration, and clinical utility in both the training and validation cohorts. In addition, ML models were developed, and the XGBoost model was identified as the best-performing one; it can be applied clinically in conjunction with the nomogram.
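The three kinds of metric reported above (AUC for discrimination, Brier score for calibration, and net benefit for decision curve analysis) can be illustrated with a minimal, dependency-free sketch. This is not the study's code: the function names and the toy labels and probabilities below are hypothetical and chosen only for illustration. Net benefit is computed with the standard decision-curve formula, TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a
    higher predicted risk than a random negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the study): 1 = progressed to SPE.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

if __name__ == "__main__":
    print("AUC:", auc(y_true, y_prob))
    print("Brier score:", brier_score(y_true, y_prob))
    print("Net benefit at pt = 0.5:", net_benefit(y_true, y_prob, 0.5))
```

Evaluating `net_benefit` over a grid of thresholds and comparing it with the "treat all" and "treat none" strategies gives a decision curve; the threshold range over which the model has the highest net benefit is one way to read a range of clinical applicability such as those reported above.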