Development and validation of an interpretable machine learning model for predicting low muscle mass in patients with rheumatoid arthritis: a multicenter study

Abstract

BACKGROUND: This study aimed to develop a predictive model that identifies rheumatoid arthritis (RA) patients at risk of low muscle mass from easily obtainable clinical indicators. The goal is to enable targeted screening of individuals at high risk of sarcopenia, optimize diagnostic strategies, reduce the burden of additional testing, and improve the efficiency of early identification and intervention.

METHODS: We analyzed data from 1,260 RA patients drawn from the National Health and Nutrition Examination Survey (NHANES) database and the Affiliated Hospital of Shandong University of Traditional Chinese Medicine (SHUTCM). Eight machine learning models were developed: Random Forest, LightGBM, XGBoost, CatBoost, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression, and a weighted ensemble model. Model performance was evaluated using accuracy, area under the receiver operating characteristic curve (AUC), F1 score, precision, recall, and Brier score loss. The SHapley Additive exPlanations (SHAP) method was used to rank feature importance and interpret the final model.

RESULTS: The tree-based weighted ensemble model performed best, achieving an AUC of 0.921 and outperforming all individual models. It showed good calibration and a higher net clinical benefit in decision curve analysis, especially at probability thresholds between 0.2 and 0.8. On the test set it achieved an AUC of 0.848, indicating a degree of generalizability. SHAP analysis identified BMI, albumin, hemoglobin, age, and creatinine as the most important features for predicting the risk of low muscle mass. SHAP dependence and waterfall plots further illustrated how the model arrives at its predictions. Finally, we developed an online risk prediction calculator built on the FastAPI framework, which generates individualized low muscle mass risk scores from user input. The tool has been deployed on the Hugging Face platform and is accessible online.

CONCLUSION: Using a large, multicenter dataset, we developed and validated an explainable machine learning model that identifies patients with rheumatoid arthritis who are at high risk of low muscle mass. The model may serve as a decision-support tool for clinicians in guiding further screening and diagnosis of sarcopenia.
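The abstract does not state how the ensemble weights were chosen or how the metrics were computed, so the following is only a minimal sketch of the general idea: a weighted average of per-model predicted probabilities (soft voting), scored with AUC and the Brier score. All function names, weights, and toy values here are hypothetical illustrations, not the study's data or implementation.

```python
from typing import List, Sequence


def weighted_ensemble(prob_lists: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model predicted probabilities (soft voting)."""
    if len(prob_lists) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all models must score the same patients")
    return [sum(w * p[i] for w, p in zip(weights, prob_lists)) / total
            for i in range(n)]


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy labels (1 = low muscle mass) and made-up probabilities from two models.
y_true = [1, 0, 1, 0, 1, 0]
model_a = [0.9, 0.2, 0.4, 0.6, 0.8, 0.1]
model_b = [0.7, 0.4, 0.8, 0.3, 0.6, 0.2]

# Hypothetical weights; in practice they would be tuned on validation data.
ensemble = weighted_ensemble([model_a, model_b], weights=[0.6, 0.4])

print("AUC model A:  ", roc_auc(y_true, model_a))
print("AUC ensemble: ", roc_auc(y_true, ensemble))
print("Brier ensemble:", brier_score(y_true, ensemble))
```

A higher AUC means better ranking of high-risk over low-risk patients, while a lower Brier score means the predicted probabilities sit closer to the observed outcomes, which is why the two are typically reported together.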
