Development and validation of machine learning models for distant metastasis of primary hepatic carcinoma: a population-based study



Abstract

BACKGROUND: Primary liver cancer is the sixth most common cancer globally and ranks third in cancer-related mortality. Patients with primary liver cancer and distant metastasis (PLCDM) have particularly low survival rates and are more difficult to treat. This study aims to identify risk factors associated with distant metastasis and overall survival (OS) in primary liver cancer and to determine the optimal predictive models using machine learning.
METHODS: We extracted data from the SEER database (Incidence-SEER Research Data, 17 Registries, Nov 2022 Sub (2000-2020)) and identified risk factors for distant metastasis using logistic regression. Eight machine learning models were constructed using the "tidymodels" package in R and evaluated based on ROC curves, AUC, and accuracy. Cox regression was used to identify risk factors for OS, and Cox and Random Survival Forest (RSF) models were compared using time-dependent ROC curves. The best-performing model was interpreted using Shapley analysis. We also developed user-friendly web applications using the "shiny" package in R for clinical use.
RESULTS: Multivariate analysis identified grade, T stage, N stage, tumor size, and surgery as independent risk factors for PLCDM. The Random Forest (RF) model showed the best performance, with AUC values of 0.836, 0.817, and 0.846 in the training, internal validation, and external validation cohorts, respectively, and favorable Brier scores and accuracy. Shapley analysis ranked the risk factors by contribution as surgery, T stage, tumor size, N stage, and grade. Cox regression identified grade, surgery, and T stage as independent prognostic factors for OS. The Cox model outperformed the RSF model in time-dependent ROC analysis. Calibration and decision curve analysis (DCA) further confirmed the Cox model's strong predictive performance and clinical utility. For the OS model, Shapley analysis ranked the risk factors as grade, surgery, and T stage.
CONCLUSIONS: We successfully constructed and validated optimal models for predicting PLCDM and its prognosis. These models provide valuable tools to guide clinical decision-making for PLCDM.
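
The three metrics used above to compare the classification models (AUC, accuracy, and Brier score) can be illustrated with a minimal sketch. The study itself used the "tidymodels" package in R; the Python below is not the authors' pipeline, only an illustration of what each metric measures. The function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for this example and are not SEER data.

```python
def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Equals the probability that a randomly chosen positive case receives a
    higher predicted probability than a randomly chosen negative case;
    ties count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome.

    Lower is better; 0 means perfectly confident, correct predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases classified correctly at the given threshold.

    The 0.5 default is an assumption for illustration only.
    """
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(a == b) for a, b in zip(preds, y_true)) / len(y_true)


# Hypothetical toy data: 1 = distant metastasis, 0 = none.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_probs = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]

print("AUC:     ", round(auc(toy_labels, toy_probs), 3))
print("Brier:   ", round(brier_score(toy_labels, toy_probs), 3))
print("Accuracy:", round(accuracy(toy_labels, toy_probs), 3))
```

AUC measures ranking quality regardless of threshold, while the Brier score also penalizes poorly calibrated probabilities, which is one reason to consider both when comparing models.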
