Abstract
PURPOSE: To construct and compare machine learning models for predicting recurrence of extrahepatic bile duct stones after common bile duct (CBD) exploration, and to quantify the contribution of key risk factors using SHAP (SHapley Additive exPlanations) analysis, with the goal of supporting clinical risk assessment and intervention. METHODS: We retrospectively analyzed 1,363 patients with extrahepatic bile duct stones treated between 2010 and 2024 at Huangshi Central Hospital and Honghu People's Hospital, of whom 156 had recurrence. LASSO regression selected 8 predictors; 9 machine learning models were built and compared on AUC, accuracy, and other metrics, and the best-performing model was interpreted with SHAP. RESULTS: Random Forest (RF) performed best, with AUCs of 0.9799, 0.9366, and 0.831 and accuracies of 0.953, 0.902, and 0.829 in the training, validation, and external cohorts, respectively. SHAP identified maximum stone diameter, CBD diameter, and direct bilirubin as the top risk factors, with nonlinear effects (risk rose for stones >15 mm) and synergistic interactions. CONCLUSION: RF was the best-performing of the 9 models for predicting recurrent extrahepatic bile duct stones after CBD exploration and generalized better than the other models. SHAP analysis indicates that maximum stone diameter, CBD diameter, and direct bilirubin are the key, interacting risk factors, with risk rising nonlinearly for stones >15 mm. The model may support personalized clinical risk assessment and targeted interventions to reduce postoperative recurrence.
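For readers less familiar with the two headline metrics, the following is a minimal sketch of how AUC and accuracy can be computed from predicted recurrence probabilities. It is not the study's code: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. Only the cohort counts (156 recurrences among 1,363 patients) come from the abstract.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at a fixed threshold
    (the 0.5 default is an assumption, not taken from the study)."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
y_true = [0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.3, 0.6, 0.2, 0.8, 0.4]

auc_value = roc_auc(y_true, y_score)
acc_value = accuracy(y_true, y_score)

# Recurrence rate in the full cohort, from the counts in the abstract.
recurrence_rate = 156 / 1363

print(f"AUC={auc_value:.3f} accuracy={acc_value:.3f} "
      f"recurrence rate={recurrence_rate:.3f}")
```

AUC depends only on how the model ranks patients, whereas accuracy depends on the chosen threshold, so the two metrics can diverge when recurrence is uncommon, as it is in this cohort.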