Abstract
This study aimed to develop a machine learning (ML) model to accurately predict the risk of postoperative venous thromboembolism (VTE) in patients with cervical cancer. We retrospectively collected data on cervical cancer inpatients at the Affiliated Cancer Hospital of Chongqing University from January 2020 to December 2023. Using 1,044 observations and six variables, we developed seven ML models and selected the best-performing one for assessing postoperative VTE risk. The models were evaluated using receiver operating characteristic (ROC) curves, precision-recall (PR) curves, and decision curve analysis (DCA), and the model's predictions were explained using SHAP values. Among the 1,044 postoperative cervical cancer patients, 82 (7.85%) developed VTE. Of the seven algorithms, the random forest (RF) model showed the best overall performance (AUC = 0.852, AUPR = 0.332). Compared with the other models, including logistic regression (AUC = 0.767) and XGBoost (AUC = 0.836), the RF model demonstrated superior discrimination, calibration, and generalizability. DCA confirmed that the RF model yielded the highest net clinical benefit across a wide range of threshold probabilities (5-80%). SHAP analysis identified D-dimer, neutrophil-to-lymphocyte ratio (NLR), and age as the most influential variables in the model's predictions. These findings suggest that ML algorithms can be practical tools for postoperative VTE risk assessment, learning from patient characteristics to provide personalized evaluations. We implemented the final RF model as a user-friendly web tool for healthcare professionals.
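For readers unfamiliar with decision curve analysis, the net benefit at a given threshold probability can be sketched as below. This is a minimal illustration that assumes the standard net-benefit definition (true positives minus false positives weighted by the threshold odds, per patient); the abstract does not state how the curves were actually computed. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold.

    Assumes the standard definition: TP/n - (FP/n) * pt / (1 - pt).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy labels and predicted risks, invented purely for illustration.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.05, 0.3, 0.15, 0.02, 0.08]

threshold = 0.2
nb_model = net_benefit(y_true, y_prob, threshold)
nb_all = net_benefit_treat_all(y_true, threshold)
print(f"model: {nb_model:.3f}, treat-all: {nb_all:.3f}")
```

A decision curve repeats this calculation over a grid of thresholds and plots each model against the treat-all and treat-none (net benefit of zero) reference strategies.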