Abstract
BACKGROUND: Surgical site infections (SSIs) cause substantial postoperative morbidity in children. Although SSIs are largely preventable, their rates have continued to rise. Existing SSI predictive models are predominantly designed for adults. This study aimed to develop, validate, and compare machine learning models for predicting pediatric SSI risk and to recommend a model for clinical workflow integration.

STUDY DESIGN: We conducted a retrospective cohort analysis of 1,152,034 cases from the NSQIP-Pediatrics (NSQIP-P) database (2012 to 2022), including 27,562 SSIs. Data were split into training (1,101,950 cases) and test (50,084 cases) sets. We developed 5 models: elastic net-regularized logistic regression, random forest, gradient boosted trees, k-nearest neighbors, and neural networks. Performance was evaluated using Brier scores, c-statistics, and calibration metrics, with bootstrap resampling to estimate confidence intervals.

RESULTS: All models performed similarly (Brier scores 0.023 to 0.024; c-statistics 0.72 to 0.77). Regularized logistic regression performed as well as any other model (Brier score 0.023; c-statistic 0.77) and was selected for its balance of predictive accuracy, computational efficiency, and feasibility for clinical use. Key predictors included procedural codes, perioperative diagnoses, comorbidities, acuity markers, laboratory values, and patient demographics. In the test set, mean predicted SSI risk was 2.4%; 62% of patients had a predicted risk <2%, whereas 3% had a predicted risk ≥10%.

CONCLUSIONS: Machine learning models can predict pediatric SSI risk using preoperative data. Our regularized logistic regression model is a promising candidate for integration into electronic health record systems to enable individualized risk estimation and targeted infection prevention. Next steps include external validation and user-centered implementation studies.
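The evaluation metrics named under STUDY DESIGN (Brier score, c-statistic, and bootstrap confidence intervals) can be illustrated with a minimal sketch. This is not the authors' code: the function names `brier_score`, `c_statistic`, and `bootstrap_ci` are hypothetical, and the percentile bootstrap is an assumption, since the abstract does not say which bootstrap variant was used. The sketch uses only the Python standard library.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true, y_prob):
    """Probability that a random SSI case receives a higher predicted risk
    than a random non-SSI case (ties count as 1/2). Rank-based computation."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("c-statistic needs both outcome classes")
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    ranks = [0.0] * len(y_prob)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of tied predictions.
        while j + 1 < len(order) and y_prob[order[j + 1]] == y_prob[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(metric, y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed variant) for a metric."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [y_true[i] for i in idx]
        ps = [y_prob[i] for i in idx]
        try:
            stats.append(metric(ys, ps))
        except ValueError:
            continue  # resample contained a single class; skip it
    if not stats:
        raise ValueError("no valid bootstrap resamples")
    stats.sort()
    m = len(stats)
    lower = stats[int((alpha / 2) * m)]
    upper = stats[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper
```

Given a list of observed outcomes (1 for SSI, 0 otherwise) and a matching list of predicted risks, `bootstrap_ci(c_statistic, outcomes, risks)` would return an approximate 95% interval. For a test set of roughly 50,000 cases, a vectorized implementation would be more practical than this pure-Python version.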