Abstract
PURPOSE: To develop and validate an explainable machine learning (ML) model to predict postoperative cholangitis (POC) in pediatric patients with pancreaticobiliary maljunction (PBM) using readily accessible clinical data. METHODS: We analyzed 337 children with PBM who underwent surgery, dividing them into a training set (n = 243, center I) and a testing set (n = 94, center II). Six ML algorithms were applied, and the best-performing model was identified by the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PR-AUC). Model calibration, clinical applicability, and interpretability were further evaluated using calibration curves, decision curve analysis (DCA), and Shapley Additive Explanations (SHAP), respectively. RESULTS: After a median follow-up of 21.8 months, 13.2% (32/243) of patients from center I and 14.9% (14/94) from center II developed POC. The final random forest (RF) model exhibited the best performance, with an ROC-AUC of 0.890 and a PR-AUC of 0.764 in the testing set, and good calibration in both sets. DCA confirmed that the final RF model was clinically useful. Nine key features were identified and ranked using SHAP analysis, with cholangial inflammatory infiltration and common bile duct diameter being the most important. CONCLUSION: This explainable ML model could effectively predict POC, aiding clinicians in identifying high-risk patients and supporting individualized management in PBM. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-025-00491-4.
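As a minimal sketch of two of the evaluation steps named above, the following Python computes a rank-based ROC-AUC and the net benefit used in decision curve analysis. The labels and predicted risks are hypothetical toy values, not study data, and the function names (`roc_auc`, `net_benefit`, `net_benefit_treat_all`) and the thresholds are illustrative assumptions rather than the authors' code. Only the testing-set prevalence (14/94) is taken from the abstract.

```python
def roc_auc(y_true, y_score):
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit of intervening on patients whose predicted risk is at
    or above `threshold`: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the reference strategy that intervenes on everyone."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort: 1 = developed POC, 0 = did not (NOT study data).
y_true = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
y_score = [0.80, 0.55, 0.30, 0.10, 0.05, 0.60, 0.35, 0.20, 0.15, 0.25]

auc = roc_auc(y_true, y_score)
nb_model = net_benefit(y_true, y_score, threshold=0.30)
nb_all = net_benefit_treat_all(sum(y_true) / len(y_true), threshold=0.30)

# Treat-all reference at the testing-set prevalence reported in the abstract;
# the 0.10 threshold is an assumed value chosen only for illustration.
test_prevalence = 14 / 94
nb_all_test = net_benefit_treat_all(test_prevalence, threshold=0.10)

print(f"toy ROC-AUC: {auc:.3f}")
print(f"toy net benefit (model, pt=0.30): {nb_model:.3f}")
print(f"toy net benefit (treat all, pt=0.30): {nb_all:.3f}")
print(f"treat-all net benefit at prevalence 14/94, pt=0.10: {nb_all_test:.3f}")
```

In a decision curve, a model adds value at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy); repeating the calculation over a range of thresholds traces the full curve.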