Determining the association of C-reactive-protein-triglyceride-glucose index and diabetes using machine learning and LASSO regression: A cross-sectional analysis of NHANES 2001 to 2010 results

Abstract

The C-reactive protein-triglyceride-glucose index (CTI) has emerged as a novel metric for evaluating the severity of inflammation and the degree of insulin resistance. However, the association between CTI and diabetes remains unclear, and this study set out to characterize it. We used data from the National Health and Nutrition Examination Survey (NHANES) from 2001 to 2010. The association between CTI and the risk of diabetes was evaluated with weighted logistic regression, subgroup analyses, and restricted cubic splines (RCS). Participants were then randomly assigned to training and validation cohorts in a 7:3 ratio. Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to select the optimal model and identify potential confounding factors, with the validation cohort used for evaluation. The variables selected by LASSO regression were used to construct a nomogram-based predictive model, which was assessed with receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis. The same variables were also entered into machine learning (ML) models, and SHapley Additive exPlanations (SHAP) analysis was used to visualize feature contributions. After adjustment for potential confounding factors, CTI was significantly and positively associated with the incidence of diabetes (OR = 1.96, 95% CI: 1.69-2.26, P < .001). RCS analysis showed a linear positive association between CTI and the incidence of diabetes mellitus (P-nonlinear = .5200). LASSO regression identified a total of 8 variables, including age, race, marital status, hypertension, body mass index, cardiovascular disease (CVD), and CTI. A nomogram-based predictive model was constructed from these predictors; its area under the ROC curve (AUC) in the validation cohort was 0.92, indicating robust model performance. These 8 variables were subsequently incorporated into the ML models, among which the CatBoost model performed best, with an AUC of 0.843 (95% CI: 0.820-0.866). SHAP analysis identified hypertension as the most influential factor. Higher CTI values showed a significant positive linear association with diabetes risk, suggesting that CTI may serve as a predictor of diabetes risk.
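The abstract does not state how CTI is computed, so the following is only a minimal sketch under an assumption: it uses the definition commonly cited in the CTI literature, CTI = 0.412 × ln(CRP in mg/L) + ln(TG in mg/dL × FPG in mg/dL / 2), which may differ in detail from what the authors used. The function names `cti` and `odds_multiplier` and the example values are hypothetical. The second function illustrates how the reported OR of 1.96 per one-unit increase in CTI would scale over a larger increase, assuming the association is linear on the log-odds scale, as the restricted cubic spline result suggests.

```python
import math


def cti(crp_mg_l: float, tg_mg_dl: float, fpg_mg_dl: float) -> float:
    """CTI under the commonly cited definition (an assumption; the abstract
    itself gives no formula): 0.412 * ln(CRP) + ln(TG * FPG / 2)."""
    if min(crp_mg_l, tg_mg_dl, fpg_mg_dl) <= 0:
        raise ValueError("CRP, TG and FPG must all be positive")
    return 0.412 * math.log(crp_mg_l) + math.log(tg_mg_dl * fpg_mg_dl / 2)


def odds_multiplier(or_per_unit: float, delta: float) -> float:
    """Factor by which the odds change for a `delta`-unit change in the
    predictor, given an odds ratio per one-unit change. This assumes the
    effect is linear on the log-odds scale."""
    if or_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return or_per_unit ** delta


if __name__ == "__main__":
    # Hypothetical participant values, chosen only for illustration.
    value = cti(crp_mg_l=3.0, tg_mg_dl=150.0, fpg_mg_dl=100.0)
    print(f"CTI = {value:.3f}")

    # Reported adjusted OR per one-unit increase in CTI: 1.96.
    print(f"log-odds coefficient = {math.log(1.96):.3f}")
    print(f"odds multiplier for +2 CTI units = {odds_multiplier(1.96, 2):.4f}")
```

The odds multiplier describes a change in odds, not in probability, and extrapolating it beyond the CTI range observed in the study would not be justified by the abstract alone.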
