A genetic programming approach to development of clinical prediction models: A case study in symptomatic cardiovascular disease

Abstract

BACKGROUND: Genetic programming (GP) is an evolutionary computing methodology capable of identifying complex, non-linear patterns in large data sets. Despite the potential advantages of GP over more conventional frequentist statistical methods, its applications to survival analysis are rare. The aim of this study was to determine the utility of GP for the automatic development of clinical prediction models.

METHODS: We compared GP against the commonly used Cox regression technique in terms of the development and performance of a cardiovascular risk score, using data from the SMART study, a prospective cohort study of patients with symptomatic cardiovascular disease. The composite endpoint was cardiovascular death, non-fatal stroke, and myocardial infarction. A total of 3,873 patients aged 19-82 years were enrolled in the study between 1996 and 2006. The cohort was split 70:30 into derivation and validation sets. The derivation set was used to develop both the GP and Cox regression models. These models were then used to predict the discrete hazards at t = 1, 3, and 5 years. The predictive ability of both models was evaluated on the validation set in terms of risk discrimination and calibration.

RESULTS: The discrimination of both models was comparable. At t = 1, 3, and 5 years, the C-index was 0.59, 0.69, and 0.64 for the GP model and 0.66, 0.70, and 0.70 for the Cox regression model, respectively. At the same time points, the calibration of both models, assessed using calibration plots and a generalization of the Hosmer-Lemeshow test statistic, was also comparable, although the Cox model was better calibrated to the validation data.

CONCLUSION: Using empirical data, we demonstrated that a prediction model developed automatically by GP has predictive ability comparable to that of a manually tuned Cox regression model. The GP model was more complex, but it was developed in a fully automated way and comprised fewer covariates. Furthermore, it did not require the expertise normally needed for model derivation, thereby alleviating the knowledge elicitation bottleneck. Overall, GP demonstrated considerable potential as a method for the automated development of clinical prediction models for diagnostic and prognostic purposes.
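
To make the evaluation design concrete, the following is a minimal Python sketch of a 70:30 cohort split and a pairwise concordance (C-index) calculation for right-censored data. It is an illustration under stated assumptions, not the study's implementation: the abstract does not specify how the split was randomized or exactly how the time-specific C-index was defined, so the function names `split_cohort` and `concordance_index`, the `seed` and `horizon` parameters, and the use of Harrell's pairwise definition are all assumptions introduced here.

```python
import random


def split_cohort(n_patients, derivation_fraction=0.7, seed=0):
    """Randomly split patient indices into derivation and validation sets.

    Assumption: a simple random split; the abstract only states the 70:30 ratio.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    cut = round(n_patients * derivation_fraction)
    return indices[:cut], indices[cut:]


def concordance_index(times, events, risk_scores, horizon=None):
    """Harrell-style C-index for right-censored data (assumed definition).

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still being followed at a strictly later time. The pair
    is concordant when i, who failed first, has the higher predicted risk;
    tied risks count as half. If `horizon` is given, only events occurring
    at or before that time anchor a pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        if horizon is not None and times[i] > horizon:
            continue  # event falls outside the evaluation window
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Split sizes for a cohort of 3,873 patients (the figure given in METHODS).
derivation, validation = split_cohort(3873)

# Toy data, invented purely for illustration: follow-up in years,
# event indicator (1 = endpoint observed, 0 = censored), predicted risk.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.8, 0.4, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks, horizon=5.0)
```

A C-index of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.59 to 0.70 should be read. The double loop is quadratic in cohort size; that is acceptable for a sketch but a production analysis would more likely rely on an established survival-analysis library.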
