Abstract
BACKGROUND: Machine learning (ML) has been widely used to predict complications and prognosis in patients undergoing hemodialysis (HD). However, accurate and efficient models for predicting postdialysis fatigue (PDF), which is highly prevalent in this population, are still needed. AIMS: This study aimed to explore the potential of ML models for predicting PDF in patients undergoing HD. DATA SOURCES: Data were collected from 1,281 Chinese patients undergoing HD at six tertiary hospitals (65.26% male; mean age, 54.48 years). DESIGN: Cross-sectional study. METHODS: Seven ML models, namely logistic regression (LR), decision tree (DT), random forest (RF), LightGBM (LGBM), CatBoost, XGBoost (XGB), and gradient boosting tree (GBT), were compared for predicting PDF among Chinese patients undergoing HD, and variables with predictive value were identified from the best-performing model. The study findings were reported in accordance with the TRIPOD+AI guidelines. RESULTS: The RF model achieved the best overall and most stable performance, with an area under the curve (AUC) of 0.855, accuracy of 0.773, F1 score of 0.769, and Brier score of 0.155 in the test set. Resilience, appetite, potassium levels, sleep quality, constipation, history of fistula surgery, diastolic blood pressure, and the category of "combined other diseases" were the strongest predictors of PDF. CONCLUSION: ML models can serve as convenient screening and assessment tools for PDF risk in Chinese patients undergoing HD. Combined with the SHapley Additive exPlanations (SHAP) approach, the proposed framework provides a more intuitive and comprehensive interpretation of the predictive model, allowing clinicians to better understand the model's decision-making process and the impact of the factors associated with PDF.
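For readers less familiar with the four test-set metrics reported in RESULTS (AUC, accuracy, F1 score, and Brier score), the following is a minimal sketch of how each is computed from true labels and predicted probabilities. It is illustrative only: the function names, the 0.5 classification threshold, and the six toy observations are assumptions made for this example, not the study's data or implementation.

```python
# Illustrative sketch only: toy labels and probabilities, NOT study data.
# Function names and the 0.5 threshold are assumptions for this example.

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, y_prob):
    """Probability that a random positive case is ranked above a random negative one
    (ties count as half); equivalent to the area under the ROC curve."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy example: 1 = PDF present, 0 = PDF absent.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
toy_preds = [1 if p >= 0.5 else 0 for p in toy_probs]  # assumed 0.5 threshold

metrics = {
    "auc": auc(toy_labels, toy_probs),
    "accuracy": accuracy(toy_labels, toy_preds),
    "f1": f1_score(toy_labels, toy_preds),
    "brier": brier_score(toy_labels, toy_probs),
}
print(metrics)
```

AUC and the Brier score are computed from the predicted probabilities, whereas accuracy and F1 depend on the chosen classification threshold, which is why a model is usually reported on both kinds of metric.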