Abstract
OBJECTIVE: This study aimed to develop and validate a machine learning model for predicting postoperative complications in patients with gastrointestinal cancer, using a large multicenter database. METHODS: We analyzed the clinicopathological data of 3,926 gastrointestinal cancer patients from the Prevalence of Abdominal Complications After GastroEnterological surgery (PACAGE) database, which covers 20 medical centers, from December 2018 to December 2020. Predictive performance was evaluated using receiver operating characteristic (ROC) curves and the Brier score. RESULTS: Patients were divided into gastric cancer (2,271 cases) and colorectal cancer (1,655 cases) groups, and each group was further split into training and external validation sets. The overall postoperative complication rates in the gastric and colorectal cancer groups were 18.1% and 14.8%, respectively. Intra-abdominal infection was the most common complication in both groups. In the training set, the Random Forest (RF) model achieved the highest mean area under the curve (AUC) for overall complications and for the different types of complications in both the gastric and colorectal cancer groups, and similar results were obtained in the external validation set. ROC curve analysis showed good predictive performance of the RF model for overall and infectious complications. An application-based clinical tool was developed to facilitate use in clinical practice. CONCLUSIONS: Based on a multicenter database, the RF model demonstrated good predictive performance for overall and infectious complications and can support clinical decision-making and personalized treatment strategies.
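The two evaluation metrics named in the METHODS can be illustrated with a minimal sketch. The function names `roc_auc` and `brier_score` and the toy labels and probabilities below are hypothetical and chosen only for illustration; they are not the study's code or data. The AUC is computed here through its rank interpretation (the probability that a randomly chosen patient with a complication receives a higher predicted risk than a randomly chosen patient without one), and the Brier score as the mean squared difference between predicted risk and observed outcome.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed 0/1 outcome.

    Lower is better; 0 is a perfect prediction.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher predicted probability; ties count as one half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for a in pos:
        for b in neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = complication occurred, 0 = no complication.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_risks = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

toy_auc = roc_auc(toy_labels, toy_risks)
toy_brier = brier_score(toy_labels, toy_risks)
print(f"AUC = {toy_auc:.3f}, Brier score = {toy_brier:.3f}")
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; in practice a library routine such as scikit-learn's `roc_auc_score` and `brier_score_loss` would normally be used instead.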