Machine learning-based prediction models for renal impairment in Chinese adults with hyperuricaemia: risk factor analysis



Abstract

In hyperuricaemic populations, multiple factors may contribute to impaired renal function. This study aimed to establish a machine learning-based model to identify characteristic factors related to renal impairment in hyperuricaemic patients, determine dose‒response relationships, and facilitate early intervention strategies. Data were collected through the big data platform of Nanjing Hospital of Traditional Chinese Medicine, encompassing 2,705 patients with hyperuricaemia (1,577 with renal impairment, 828 without) from June 2019 to June 2022. After multiple imputation of missing values, the dataset was randomly split into training (70%) and validation (30%) sets. Three machine learning algorithms were used for feature selection: random forest (100 decision trees, OOB error rate of 23.34%), LASSO regression (optimal lambda of -3.58), and XGBoost (learning rate of 0.3, maximum tree depth of 1, and 50 rounds of boosting). The intersection of the features selected by all three algorithms, identified by Venn diagram analysis, yielded four key predictors. A logistic regression model was then constructed and evaluated for discrimination (AUC), calibration (Brier score), and clinical utility (decision curve analysis, DCA). Restricted cubic spline (RCS) curves were used to analyse the dose‒response relationships. The model, which incorporates age, cystatin C (Cys-C), uric acid (UA), and sex, demonstrated robust performance, with an AUC of 0.818 (95% CI: 0.796-0.817) in the training set and an AUC of 0.82 (95% CI: 0.787-0.853) in the validation set. Brier scores were 0.160 and 0.158, respectively. Decision curve analysis showed a net benefit over threshold probability ranges of 6-99.02% in the training set and 7-93.14% in the validation set. In the hyperuricaemic population, each 0.5 mg/L increase in Cys-C, 10-year increase in age, and 100 µmol/L increase in UA corresponded to an increased risk of renal impairment of 13%, 81%, and 73%, respectively. RCS analysis revealed nonlinear relationships for age and Cys-C and a linear relationship for UA, with sex-specific distribution patterns. The machine learning-based model incorporating these four indicators showed good predictive performance for renal impairment in hyperuricaemic patients. These findings suggest that monitoring Cys-C and UA levels, while taking age and sex differences into account, is crucial for risk assessment and prevention strategies.
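The feature-intersection step and two of the evaluation metrics named in the abstract (Brier score and the net benefit behind a decision curve) can be illustrated with a minimal sketch. This is not the authors' code, and the abstract does not say which software they used. The candidate feature lists are hypothetical (only age, Cys-C, UA, and sex come from the abstract; `bmi`, `tg`, and `hdl` are invented placeholders), and the outcome labels and predicted probabilities are toy values chosen for easy arithmetic, not study data.

```python
from typing import List, Sequence


def intersect_features(*selections: Sequence[str]) -> List[str]:
    """Return the features chosen by every selector (the Venn-diagram overlap)."""
    common = set(selections[0])
    for selected in selections[1:]:
        common &= set(selected)
    return sorted(common)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit at one threshold probability, the quantity plotted in a decision curve.

    Patients with predicted probability >= threshold are treated as positive;
    false positives are weighted by the odds of the threshold.
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical selector outputs: only the four shared names come from the abstract.
rf_selected = ["age", "cys_c", "ua", "sex", "bmi"]
lasso_selected = ["age", "cys_c", "ua", "sex", "tg"]
xgb_selected = ["cys_c", "age", "sex", "ua", "hdl"]
shared = intersect_features(rf_selected, lasso_selected, xgb_selected)

# Toy outcomes (1 = renal impairment) and toy predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]

print("shared features:", shared)
print("Brier score:", round(brier_score(y_true, y_prob), 3))
print("net benefit at 0.5:", round(net_benefit(y_true, y_prob, 0.5), 3))
```

A lower Brier score indicates better-calibrated probabilities (0 is perfect, and 0.25 corresponds to always predicting 0.5). A decision curve repeats the net benefit calculation over a grid of thresholds and compares the model with the treat-all and treat-none strategies; the threshold ranges reported in the abstract are presumably the intervals over which the model's curve lies above both.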
