Abstract
OBJECTIVE: To compare the performance of a multivariable logistic regression (LR) model, based on traditional statistical methods, with that of a random forest (RF) machine learning model for predicting clinically relevant postoperative pancreatic fistula (CR-POPF) after pancreatoduodenectomy (PD). BACKGROUND: CR-POPF is a common and severe complication following PD. Traditional statistical models are widely used to predict it, but machine learning has attracted growing attention for its potential in predictive medicine. Comparing the two approaches can indicate which is better suited to CR-POPF prediction. METHODS: Clinical data from patients undergoing PD were collected. CR-POPF prediction models were developed using multivariable LR and RF, and their predictive performance was compared using calibration curves, receiver operating characteristic (ROC) curves, and decision curve analysis (DCA). RESULTS: In the calibration curve analysis, the multivariable LR model showed better calibration than the RF model. The multivariable LR model achieved an area under the ROC curve (AUC) of 0.96, compared with 0.90 for the RF model, indicating better discrimination by the multivariable LR model. DCA demonstrated that the multivariable LR model provided a higher net benefit than the RF model across most threshold ranges. CONCLUSION: The multivariable LR model outperformed the RF model in predicting CR-POPF after PD and can be considered the preferred method for CR-POPF risk assessment.
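For readers unfamiliar with two of the metrics named in the methods, the sketch below illustrates how an AUC and a decision-curve net benefit can be computed from predicted probabilities. It is a minimal illustration, not the study's analysis code: the function names `auc` and `net_benefit`, the labels, and both sets of predicted probabilities are hypothetical and made up for this example. The net benefit uses the standard decision curve analysis definition, TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
def auc(y_true, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties in score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = CR-POPF, 0 = no CR-POPF (not from the study).
y = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]  # hypothetical model A
model_b = [0.70, 0.40, 0.60, 0.50, 0.30, 0.20, 0.10, 0.05]  # hypothetical model B

for name, probs in (("A", model_a), ("B", model_b)):
    print(name, round(auc(y, probs), 3), round(net_benefit(y, probs, 0.5), 3))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and plotting the result for each model alongside the "treat all" and "treat none" strategies; the model with the higher curve over the clinically relevant threshold range offers the greater net benefit.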