Machine learning prediction of osteoarthritis risk from volatile organic compound exposure using SHAP interpretation in US adults



Abstract

Exposure to volatile organic compounds (VOCs) is widespread and has been implicated in the pathogenesis of various chronic diseases. However, the specific relationship between VOC exposure and the risk of osteoarthritis (OA) remains poorly characterized. This study aimed to investigate the associations between a broad spectrum of VOC metabolites and OA risk, and to identify the most influential VOC metabolites. We analyzed data from the National Health and Nutrition Examination Survey (NHANES) 2011-2018, comprising 3683 US adults. OA status was self-reported. Exposure to 17 VOCs was assessed using their urinary metabolites. The data were split into training (70%) and testing (30%) sets, and multiple machine learning models were trained and evaluated. The optimal model was interpreted using SHapley Additive exPlanations (SHAP) to identify key predictors and elucidate their dose-response relationships with OA risk. The Linear Discriminant Analysis (LDA) model demonstrated the best predictive performance (AUC = 0.755). SHAP interpretation revealed that, in addition to age, specific VOC metabolites were among the top predictors of OA. N-Acetyl-S-(3,4-dihydroxybutyl)-l-cysteine (DHBMA, a metabolite of 1,3-butadiene) and N-Acetyl-S-(3-hydroxypropyl-2-methyl)-l-cysteine (HMPMA, a metabolite of crotonaldehyde) were identified as novel and significant risk factors. Further analysis delineated non-linear dose-response relationships between these VOCs and OA risk. Subgroup analyses suggested that the associations were consistent across different demographics. In summary, this study developed a machine learning model based on VOC exposure that effectively predicts osteoarthritis risk. The LDA model achieved robust performance, with SHAP interpretation identifying DHBMA and HMPMA as novel and significant risk factors, in addition to known demographic predictors. Subgroup analyses further confirmed the consistent and non-linear association of these VOC metabolites with OA across diverse populations. These findings underscore the value of integrating environmental exposure data into OA risk prediction and support the potential of this approach for targeted prevention strategies in high-risk groups.
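
The workflow described in the abstract (a 70/30 split, an LDA classifier, AUC evaluation, and SHAP attribution) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the feature names, the simulated effect sizes, and every helper function are assumptions, and NHANES survey weights, covariates, and the comparison against other models are omitted. For a linear score such as the LDA log-odds, the SHAP value of feature i under a feature-independence assumption reduces to w_i (x_i - E[x_i]), which is what `linear_shap` computes.

```python
import numpy as np

# Hypothetical feature names; the real study used age plus 17 urinary VOC metabolites.
FEATURES = ["age", "metabolite_a", "metabolite_b"]

def make_synthetic_cohort(n=2000, seed=0):
    """Synthetic stand-in for the cohort (standardized features, simulated outcome)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    # Assumed effect sizes, chosen only for illustration.
    logit = -1.0 + 1.2 * X[:, 0] + 0.5 * X[:, 1] + 0.0 * X[:, 2]
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
    return X, y

def split_70_30(X, y, seed=0):
    """Random 70% training / 30% testing split."""
    idx = np.random.default_rng(seed).permutation(len(y))
    cut = int(0.7 * len(y))
    train, test = idx[:cut], idx[cut:]
    return X[train], X[test], y[train], y[test]

def fit_lda(X, y):
    """Two-class LDA; returns weights w and intercept b of the log-odds score."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance matrix.
    S = ((X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)) / (len(y) - 2)
    w = np.linalg.solve(S, mu1 - mu0)
    b = -0.5 * (mu1 + mu0) @ w + np.log(len(X1) / len(X0))
    return w, b

def auc(y_true, scores):
    """AUC: probability that a random case outscores a random non-case (ties count half)."""
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def linear_shap(w, X_background, X):
    """SHAP values for a linear score, assuming independent features."""
    return w * (X - X_background.mean(axis=0))

X, y = make_synthetic_cohort()
X_tr, X_te, y_tr, y_te = split_70_30(X, y)
w, b = fit_lda(X_tr, y_tr)

scores_te = X_te @ w + b                      # log-odds of OA on the held-out set
test_auc = auc(y_te, scores_te)

phi = linear_shap(w, X_tr, X_te)              # one row of attributions per test subject
base_value = X_tr.mean(axis=0) @ w + b        # score at the training-set mean
mean_abs_shap = np.abs(phi).mean(axis=0)      # global importance per feature
ranking = [FEATURES[i] for i in np.argsort(-mean_abs_shap)]

print(f"test AUC: {test_auc:.3f}")
print("features by mean |SHAP|:", ranking)
```

Because this attribution is linear in each feature by construction, the sketch cannot reproduce a non-linear dose-response curve; the abstract does not state how those curves were derived.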
