Abstract
Objective: Tumor progression is regulated by systemic immune status, nutritional metabolism, and the inflammatory microenvironment. This study aimed to identify inflammatory-nutritional biomarkers associated with metachronous liver metastasis (MLM) in colorectal cancer (CRC) and to develop a machine learning model that predicts MLM.
Methods: This study enrolled 680 patients with CRC who underwent curative resection and were randomly allocated to a training set (n = 477) and a validation set (n = 203) at a 7:3 ratio. Feature selection was performed with the Boruta and Lasso algorithms, and the nine variables selected by both algorithms were retained as core predictors. Seven machine learning (ML) models were constructed on the training set, and the optimal predictive model was selected based on comprehensive evaluation metrics. An interactive visualization tool was developed to interpret the dynamic impact of key features on individual predictions. Partial dependence plots (PDPs) suggested a potential dose-response relationship between inflammatory-nutritional markers and MLM risk.
Results: Among the 680 patients with CRC, the cumulative incidence of MLM at 6 months postoperatively was 39.1%. Feature selection identified nine key predictors: N stage, vascular invasion, carcinoembryonic antigen (CEA), systemic immune-inflammation index (SII), albumin-bilirubin index (ALBI), differentiation grade, prognostic nutritional index (PNI), fatty liver, and T stage. The gradient boosting machine (GBM) demonstrated the best overall performance (AUROC: 0.916, sensitivity: 0.772, specificity: 0.871). The generalized additive model (GAM)-fitted SHAP analysis established, for the first time, risk thresholds for the four continuous variables (CEA > 8.14 μg/L, PNI < 44.46, SII > 856.36, ALBI > -2.67), confirming their significant association with MLM development.
Conclusions: This study developed a GBM model incorporating inflammatory-nutritional biomarkers and clinical features to accurately predict MLM in CRC. Combined with an interactive visualization tool, the model enables real-time risk stratification via a freely accessible web calculator, guiding individualized surveillance planning and supporting clinical decision-making in precision postoperative care.
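As an illustration of how the four reported thresholds could be applied to a single patient, the following minimal Python sketch flags each continuous marker that falls on the high-risk side of its cut-off. It is not the GBM model or the web calculator described above. The names `Markers` and `risk_flags` and the example values are hypothetical, and the sketch assumes that each marker is expressed in the same units as the reported thresholds.

```python
from dataclasses import dataclass
from typing import Dict

# Cut-offs as reported in the abstract (GAM-fitted SHAP analysis).
CEA_CUTOFF = 8.14     # ug/L; higher risk above this value
PNI_CUTOFF = 44.46    # higher risk below this value
SII_CUTOFF = 856.36   # higher risk above this value
ALBI_CUTOFF = -2.67   # higher risk above this value


@dataclass
class Markers:
    """Hypothetical container for one patient's continuous markers."""
    cea: float
    pni: float
    sii: float
    albi: float


def risk_flags(m: Markers) -> Dict[str, bool]:
    """Return True for each marker on the high-risk side of its cut-off.

    This only compares values against the reported thresholds; it does
    not reproduce the GBM prediction, which also uses categorical features.
    """
    return {
        "CEA": m.cea > CEA_CUTOFF,
        "PNI": m.pni < PNI_CUTOFF,
        "SII": m.sii > SII_CUTOFF,
        "ALBI": m.albi > ALBI_CUTOFF,
    }


# Made-up example values, for illustration only.
example = Markers(cea=12.0, pni=41.0, sii=600.0, albi=-2.9)
flags = risk_flags(example)
print(flags)
```

Such flags are univariate screens only; the model's individual predictions additionally depend on N stage, T stage, vascular invasion, differentiation grade, and fatty liver.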