Abstract
Chronic kidney disease (CKD) is a growing public health problem worldwide. Beyond the decline in renal function, CKD increases the risk of multiple complications, among which sarcopenia is particularly common and severe. No prediction tool currently targets the risk of sarcopenia in CKD patients specifically. Using data from the China Health and Retirement Longitudinal Study (CHARLS), this study aimed to develop a prediction model (nomogram) for the risk of sarcopenia in Chinese CKD patients, so that high-risk patients can be identified early and offered intervention. The analysis was based on the CHARLS 2015 data, which included 731 CKD patients. After oversampling, the total sample size increased to 1,000, with 70% allocated to the training set and 30% to the internal validation set. Because oversampling can lead to overfitting, the CHARLS 2011 data were also used as an external validation set to verify the model's external validity. Least absolute shrinkage and selection operator (LASSO) regression was used for feature screening to identify key predictors associated with sarcopenia. A multivariate logistic regression model was then used to determine the final independent predictors, and a nomogram for sarcopenia risk was constructed from the eight retained predictors. Model performance was assessed in three ways: discrimination by the area under the receiver operating characteristic curve (AUC), calibration by the calibration curve, and clinical utility by decision curve analysis (DCA). In addition, the relative importance of each predictor was analyzed using machine learning methods to check the stability and reliability of the results. Of the 731 CKD patients, 151 were diagnosed with sarcopenia, a prevalence of 20.7%.
LASSO regression analysis selected eight predictors (age, gender, history of cancer, smoking, waist circumference, sleep, white blood cell count, and pain), and a logistic regression model was constructed from them. The AUC of the model was 0.867 (95% CI: 0.841-0.893) in the training set, 0.828 (95% CI: 0.78-0.875) in the internal validation set, and 0.818 (95% CI: 0.755-0.881) in the external validation set, indicating good discrimination. The Hosmer-Lemeshow test showed no significant difference between predicted and observed risk (P > 0.05). The calibration curve and DCA further indicated that the model was well calibrated and improved the net benefit of clinical decision-making. In addition, the five machine learning models (GBM, RF, SVM, etc.) also achieved high AUC values, especially GBM and RF, which supports their potential for predicting sarcopenia risk in patients with CKD. The nomogram constructed in this study provides an effective tool for assessing the risk of sarcopenia in patients with CKD, and can help clinicians screen high-risk patients and promote early intervention and personalized treatment.
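As an illustration of two of the evaluation measures named above, the following minimal Python sketch computes the AUC and the decision-curve net benefit from predicted risks. It is not the study's code: the function names (`auc`, `net_benefit`, `net_benefit_treat_all`) and the toy data are hypothetical, and the net benefit uses the standard definition TP/n - FP/n × pt/(1 - pt) at threshold probability pt.

```python
def auc(y_true, y_score):
    """AUC as the probability that a randomly chosen case (y=1) receives a
    higher predicted risk than a randomly chosen non-case (y=0); ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in a decision curve: treat every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = sarcopenia, 0 = no sarcopenia.
    outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
    risks = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.35, 0.90]
    print("AUC:", auc(outcomes, risks))
    for pt in (0.2, 0.5):
        print(pt, net_benefit(outcomes, risks, pt), net_benefit_treat_all(outcomes, pt))
```

A decision curve plots `net_benefit` against a range of thresholds, alongside the treat-all and treat-none (net benefit 0) strategies; a model is clinically useful at the thresholds where its curve lies above both.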