Development and validation of deep learning- and ensemble learning-based biological ages in the NHANES study



Abstract

INTRODUCTION: Conventional machine learning (ML) approaches for constructing biological age (BA) have predominantly relied on blood-based markers, limiting their scope. This study aims to develop and validate novel ML-based BA models using a comprehensive set of clinical, behavioral, and socioeconomic factors and evaluate their predictive performance for mortality.

METHODS: We analyzed data from 24,985 participants in the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2010, with follow-up extending to 31 December 2019, or until death or loss to follow-up. Thirty features, including blood and urine biochemistry, physical examination data, behavioral traits, and socioeconomic factors, were selected using the Least Absolute Shrinkage and Selection Operator (LASSO). These features were used to train deep neural network (DNN) and ensemble learning models, yielding the Deep Biological Age (DBA) and Ensemble Biological Age (EnBA), respectively, with chronological age (CA) as the reference label. Model performance was assessed using mean absolute error (MAE), while interpretability was explored using SHapley Additive exPlanations (SHAP). Predictive accuracy of DBA and EnBA for mortality was compared with Phenotypic Age (PhenoAge) using the area under the curve (AUC) derived from Cox proportional hazards models and hazard ratios (HR), adjusted for demographics and lifestyle factors. Sensitivity analyses were performed to ensure robustness.

RESULTS: DBA and EnBA accurately predicted actual age (MAE = 2.98 and 3.58 years, respectively) and demonstrated strong predictive capability for all-cause mortality, with AUCs of 0.896 (95% CI: 0.891-0.898) for DBA and 0.889 (95% CI: 0.884-0.894) for EnBA. Higher DBA and EnBA accelerations were significantly associated with increased mortality risk (HR = 1.059 and 1.039, respectively). SHAP analysis highlighted prescription medication usage, hepatitis B surface antibody status, and vigorous physical activity as the most influential features contributing to DBA predictions. Furthermore, BA acceleration was linked to elevated risk of death from specific chronic conditions, including cardiovascular and cerebrovascular diseases and cancer.

DISCUSSION: Our study successfully developed and validated two ML-based BA models capable of accurately predicting both all-cause and cause-specific mortality. These findings suggest that the DBA and EnBA models hold promise for early identification of high-risk individuals, potentially facilitating timely preventive interventions and improving population health outcomes.
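The two quantities the abstract reports for the age models, MAE against chronological age and BA acceleration, can be illustrated with a minimal sketch. The abstract does not state how acceleration was computed; the version below assumes one common convention, the residual of predicted BA after regressing it on CA. The function names and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between chronological age and predicted BA."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def age_acceleration(chron_age, bio_age):
    """Residual of BA regressed on CA (assumed definition, not the paper's).

    Positive values mean a person looks biologically older than expected
    for their chronological age.
    """
    ca = np.asarray(chron_age, dtype=float)
    ba = np.asarray(bio_age, dtype=float)
    # Design matrix with an intercept column and chronological age.
    X = np.column_stack([np.ones_like(ca), ca])
    coef, *_ = np.linalg.lstsq(X, ba, rcond=None)
    return ba - X @ coef


# Synthetic data for illustration only; these are not NHANES values.
rng = np.random.default_rng(0)
chron_age = rng.uniform(20, 85, size=500)
bio_age = chron_age + rng.normal(0.0, 4.0, size=500)

mae = mean_absolute_error(chron_age, bio_age)
accel = age_acceleration(chron_age, bio_age)
print(f"MAE: {mae:.2f} years")
print(f"Mean acceleration: {accel.mean():.2e}")
```

By construction, residual-based acceleration has zero mean and is uncorrelated with chronological age, which is why it is often preferred over the raw difference BA minus CA when it enters a survival model alongside age.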
