Abstract
Fecalith-associated appendicitis is difficult to manage conservatively because it carries an increased risk of perforation. Early identification of patients at high risk of appendiceal perforation during conservative treatment is therefore crucial for clinical decision-making. This study aimed to develop and validate machine learning-based prediction models for assessing appendiceal perforation risk during conservative treatment of fecalith-associated appendicitis. This retrospective cohort study analyzed 1247 patients with fecalith-associated appendicitis who underwent initial conservative treatment at four tertiary care centers between January 2018 and December 2023. Study design and reporting adhere to the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) guidelines for model development and external validation. Twenty machine learning algorithms were systematically trained and validated using clinical, laboratory, and imaging parameters. LASSO regularization selected eight predictive features from 34 candidate variables. The model nominated for clinical deployment is the gradient boosting classifier trained on all eight LASSO-selected features. The primary outcome was appendiceal perforation within 72 h of initiating conservative treatment. Of the 1247 patients, 186 (14.9%) developed appendiceal perforation during conservative treatment. The gradient boosting model achieved the highest performance, with an AUC of 0.892 (95% CI 0.871-0.913), sensitivity of 84.4% (95% CI 79.2-89.6%), and specificity of 81.7% (95% CI 77.8-85.6%). External validation in an independent cohort (n = 225; The People's Hospital of Sishui, January 2023-December 2024) supported model generalisability: AUC = 0.909 (95% CI 0.859-0.951), sensitivity = 73.7%, specificity = 93.0%, PPV = 68.3%, and NPV = 94.6%.
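The external-validation figures above can be cross-checked with the standard confusion-matrix definitions. The minimal Python sketch below is illustrative only: the abstract does not report cell counts, so the values passed to the hypothetical helper `diagnostic_metrics` are assumed counts, chosen because they sum to n = 225 and reproduce the published percentages. They should not be read as the study's actual data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # perforations correctly flagged
        "specificity": tn / (tn + fp),  # non-perforations correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who perforated
        "npv": tn / (tn + fn),          # cleared patients who did not perforate
    }


# ASSUMED counts (not reported in the abstract): 28 + 10 + 13 + 174 = 225.
metrics = diagnostic_metrics(tp=28, fn=10, fp=13, tn=174)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the sketch prints 73.7%, 93.0%, 68.3%, and 94.6%, so the four reported metrics are at least mutually consistent with a single 2 x 2 table of 225 patients.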
SHAP analysis identified the most influential predictive features: fecalith size (importance: 0.234), C-reactive protein (0.186), white blood cell count (0.162), appendiceal wall thickness (0.143), and patient age (0.121). Risk stratification classified patients into low-risk (58.9% of patients, 3.8% perforation rate), moderate-risk (31.9%, 24.6% perforation rate), and high-risk (9.2%, 71.3% perforation rate) categories. Decision curve analysis showed a net benefit of 0.08 at a threshold probability of 15%, indicating clinical utility. Machine learning models, particularly gradient boosting, showed strong discrimination in predicting appendiceal perforation risk during conservative treatment of fecalith-associated appendicitis, and this performance was confirmed in an external validation cohort. These validated models provide clinically actionable risk stratification that may assist treatment decision-making and patient monitoring, potentially preventing unnecessary surgeries while identifying high-risk patients who require enhanced surveillance or early surgical intervention.
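For readers unfamiliar with decision curve analysis, the sketch below shows the standard net-benefit formula, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability. It is a generic illustration, not the authors' code: the function names are hypothetical, and the "treat all" calculation assumes the decision curve was evaluated on the 1247-patient cohort with 186 perforations.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a decision rule at a given threshold probability.

    True positives count as benefit; false positives are penalised by the
    odds of the threshold, pt / (1 - pt).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(events, n, threshold):
    """Reference strategy: intervene on everyone, so every non-event is a false positive."""
    return net_benefit(tp=events, fp=n - events, n=n, threshold=threshold)


# Cohort figures quoted in the abstract: 186 perforations among 1247 patients.
nb_all = net_benefit_treat_all(events=186, n=1247, threshold=0.15)
print(f"treat-all net benefit at 15%: {nb_all:.3f}")
```

Under that assumption, the "treat all" strategy has a net benefit of roughly zero at a 15% threshold, and "treat none" is zero by definition, so the reported model net benefit of 0.08 would be a gain over both default strategies.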