Supervised machine learning model to predict total knee replacement in a large osteoarthritis real-world evidence dataset using retrospective 20-year insurance data from Israel

Abstract

BACKGROUND: Osteoarthritis (OA) affects over 500 million patients worldwide, significantly impacting their quality of life and independence. The incidence of OA is expected to rise due to aging populations and increasing obesity rates. Traditionally considered a degenerative, age-related disease, OA is now recognized as heterogeneous, with different phenotypes. This study aimed to evaluate the use of real-world evidence (RWE) data for phenotyping OA and for developing a predictive model for total knee replacement (TKR) using insurance data from Israel.

METHODS: The study used anonymized electronic health record data from Maccabi Healthcare Services, which covers over 26% of the Israeli population. Patients diagnosed with knee OA between 2000 and 2023 were included. Data on medical history, socioeconomic status, and lifestyle factors were extracted from the database. The outcome was the time to the first TKR, analyzed using a regularized Lasso Cox regression model. The model's performance was assessed using the area under the receiver operating characteristic curve and the C-index.

RESULTS: A total of 135,691 patients met the inclusion criteria. The population was divided into a training set (80%) and a test set (20%) for model development and performance assessment. Baseline data were comparable between the training and test sets. In total, 3230 (2.4%) TKR events were observed, with an estimated rate of 2% (95% CI: 1.9–2.1) at 2 years. Of 62 initial variables, the most predictive for TKR were age, presence of allergy, and body mass index (BMI). The final model categorized patients into low-risk (75%) and high-risk (25%) groups based on a predicted risk score. By 2 years, TKR had occurred in 4% (95% CI: 3.8–4.2) of the high-risk group versus 1.4% (95% CI: 1.2–1.6) of the low-risk group.

CONCLUSION: The study demonstrated the feasibility of using RWE data for risk phenotyping in OA and for predicting TKR. More granular phenotyping reflecting OA heterogeneity proved impossible. The predictive model, however, is based on data typically available in clinical practice and could support shared decision-making and simplify feasibility assessments and enrichment strategies for large research trials. The use of RWE data may better reflect the epidemiologic reality and reduce the biases associated with the clinical trials and registries that otherwise form the basis of trial planning.

CLINICAL TRIAL NUMBER: Not applicable.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-026-03435-y.
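
The abstract names two steps without describing their implementation: scoring discrimination with the C-index, and splitting patients into a 75% low-risk and 25% high-risk group by predicted risk score. The following is a minimal Python sketch of what those two steps could look like. It is not the authors' code: the function names `harrell_c_index` and `split_by_quantile` are hypothetical, the toy values are invented for illustration and are not study data, and pairs with tied follow-up times are simply skipped as a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times:       follow-up time per patient
    events:      1 if the event (e.g. TKR) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # follow-up ended in an observed event (simplification: tied
        # times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def split_by_quantile(risk_scores, high_risk_fraction=0.25):
    """Label roughly the top `high_risk_fraction` of scores as high risk."""
    ranked = sorted(risk_scores)
    cutoff_index = min(int(len(ranked) * (1 - high_risk_fraction)),
                       len(ranked) - 1)
    cutoff = ranked[cutoff_index]
    return ["high" if s >= cutoff else "low" for s in risk_scores]


# Toy data, invented for illustration only (not from the study).
times = [2.0, 5.0, 3.0, 8.0, 1.0, 7.0, 4.0, 6.0]   # years of follow-up
events = [1, 0, 1, 0, 1, 0, 0, 1]                  # 1 = TKR observed
scores = [0.9, 0.2, 0.7, 0.1, 0.8, 0.3, 0.4, 0.5]  # predicted risk

c_index = harrell_c_index(times, events, scores)
groups = split_by_quantile(scores)
print(f"C-index: {c_index:.2f}")
print("High-risk share:", groups.count("high") / len(groups))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. In practice the risk scores would come from the fitted Lasso Cox model evaluated on the held-out test set, and the cutoff would normally be fixed on the training set before being applied to the test set.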
