Abstract
OBJECTIVE: This study aimed to construct a nomogram that combines habitat radiomics, conventional radiomics, and clinical features derived from multiparametric MRI (mpMRI) to improve the preoperative prediction of lymphovascular invasion (LVI) status in patients with rectal cancer.
METHODS: Data from 372 pathologically confirmed rectal cancer cases were retrospectively collected from two centers. Data from Center 1 were randomly split in a 7:3 ratio into a training cohort (n = 201) and an internal validation cohort (n = 87). Data from Center 2 served as the external validation cohort (n = 84). K-means clustering was used to divide each tumor into three subregions. Radiomics features were extracted from the regions of interest and then selected, and three machine learning algorithms were used to construct radiomics models. A nomogram was created by integrating clinical, radiomics, and habitat radiomics features. Predictive accuracy was assessed using the area under the receiver operating characteristic curve (AUC), while clinical applicability was evaluated with calibration plots and decision curve analysis. SHapley Additive exPlanations (SHAP) values were used to quantify the contribution of individual radiomic features to the predictions, making the model's output more transparent for clinical decision-making.
RESULTS: The nomogram outperformed each single model in predicting LVI, with AUCs of 0.978, 0.909, and 0.889 in the training, internal validation, and external validation cohorts, respectively. Compared with the intratumoral model alone, the nomogram achieved improvements of 20.3%, 14.1%, and 13.8%, respectively.
CONCLUSION: The nomogram developed in this study significantly improved the accuracy of preoperative prediction of LVI status in rectal cancer.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12880-025-02105-1.
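As a rough illustration of the decision curve analysis mentioned in the Methods, the net benefit of a model at a threshold probability p_t is commonly defined as TP/n - (FP/n) * p_t / (1 - p_t), where TP and FP count the true and false positives among patients whose predicted risk is at least p_t. The sketch below computes this quantity in plain Python and compares it with the "treat all" strategy. The function names and the toy data are hypothetical, chosen only for illustration, and do not reproduce the study's implementation or results.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives at this decision threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: label every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy labels and predicted probabilities, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1, 0.7, 0.3, 0.05, 0.5]

for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), treat_all_net_benefit(y_true, pt))
```

A decision curve plots these values over a range of thresholds. A model is considered useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).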