Comparative evaluation of regression and machine-learning models for hepatocellular carcinoma risk stratification across diverse aetiologies

Abstract

BACKGROUND & AIMS: We aimed to develop machine learning (ML) models for hepatocellular carcinoma (HCC) risk stratification in patients with cirrhosis and to test their ability to identify patients with an annual HCC incidence >3%, for whom more intensive surveillance may be justified.

METHODS: Data from three prospective cohorts (ANRS CO12 CirVir, CO22 Hepather, APHP CIRRAL) were analyzed. All patients underwent semiannual ultrasound surveillance and were randomly split into training and validation sets. HCC incidence was evaluated within a competing-risk framework. A single tree (ST) model was developed using conditional decision trees, and random forest (RF) models were built by aggregating 1,000 trees. A deep neural network (DNN)-based survival model was also applied. The performance of the ML models was compared with that of two established regression-based scores: aMAP (age-male-ALBI-platelets) and FASTRAK (FAST-MRI for HCC suRveillance in pAtients with high risK of liver cancer).

RESULTS: Among 4,867 patients with non-viral cirrhosis or resolved/controlled viral cirrhosis, 294 (9.2%) developed HCC over a median follow-up of 61.5 months (annual incidence: 1.99%). The ST model identified four key predictors, generating five distinct risk groups. These included patients with mildly impaired liver function or those with elevated gamma-glutamyl transferase (GGT) and low platelet counts. The RF and DNN approaches confirmed the ST findings and delineated complex interactions among predictors. Performance metrics (C-index, Brier score, decision curve analysis) showed no significant advantage of the ML models over aMAP and FASTRAK. Calibration was consistent across models. The proportions of patients identified as having an annual HCC incidence >3% were 44% (ST), 37% (DNN), and 30% (RF) with the ML models, compared with 36% with aMAP and 29% with FASTRAK.

CONCLUSIONS: ML-based algorithms did not outperform traditional risk scores but provided novel insights into variable interactions and helped identify clinically relevant patient subgroups with differing HCC risk profiles.

IMPACT AND IMPLICATIONS: Accurate stratification of HCC risk in cirrhosis is essential to optimize surveillance strategies, and this study provides a scientific rationale for exploring ML approaches that capture complex, non-linear interactions among clinical variables beyond the reach of traditional regression models. Although ML did not improve predictive performance over established scores, it revealed clinically meaningful risk subgroups defined by liver function, platelet count, and GGT, underscoring its value as an interpretative and hypothesis-generating tool. These results are particularly relevant for hepatologists and clinical researchers seeking to refine risk-adapted surveillance and to inform the design of future models or trials.
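
To make the competing-risk framework mentioned in the Methods more concrete, the following is a minimal sketch of how a cumulative incidence of HCC could be estimated when another event (for example, death without HCC) prevents HCC from being observed. The abstract does not name the estimator used, so the Aalen-Johansen estimator here is an assumption, as are the event coding (0 = censored, 1 = HCC, 2 = competing event), the function names, and the toy follow-up data. The `average_annual_incidence` helper is a crude linear average added only to relate the output to the >3% per year threshold; it is not necessarily how the authors computed annual incidence.

```python
def cumulative_incidence(times, events, cause=1):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    times  : follow-up times (any unit, e.g. months)
    events : 0 = censored, 1 = HCC, 2 = competing event (assumed coding)
    Returns a list of (time, cumulative incidence) pairs, one per distinct
    time at which at least one event of any cause occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0   # probability of being free of ANY event just before t
    cif = 0.0          # cumulative incidence of the cause of interest
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        d_cause = 0    # events of the cause of interest at time t
        d_any = 0      # events of any cause at time t
        while j < len(data) and data[j][0] == t:
            if data[j][1] != 0:
                d_any += 1
                if data[j][1] == cause:
                    d_cause += 1
            j += 1
        if d_any:
            # Increment = P(event-free just before t) * cause-specific hazard at t
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= j - i   # everyone observed at t leaves the risk set
        i = j
    return curve


def average_annual_incidence(cif_value, months):
    """Crude linear average per year (illustrative assumption only)."""
    return cif_value / (months / 12)


# Toy data, invented for illustration: follow-up in months and event codes.
toy_times = [6, 12, 12, 18, 24, 30, 36, 48, 60, 60]
toy_events = [1, 0, 2, 1, 0, 1, 2, 0, 1, 0]

toy_curve = cumulative_incidence(toy_times, toy_events, cause=1)
last_time, last_cif = toy_curve[-1]
toy_rate = average_annual_incidence(last_cif, last_time)
print(f"HCC cumulative incidence at {last_time} months: {last_cif:.3f}")
print(f"Average annual incidence: {toy_rate:.3f} (exceeds 3%: {toy_rate > 0.03})")
```

Treating competing events as censored and using a plain Kaplan-Meier estimate would overstate HCC incidence, which is why the increment at each time is weighted by the probability of being free of any event rather than of HCC alone.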
