Pan-cancer predictive survival model development and evaluation using electronic health record and genetic data across 10 cancer types

Abstract

The growing burden of cancer and the recent surge in healthcare data availability call for new ways of analysing this multifactorial disease and improving patient outcomes. The aim of this study is to develop and evaluate prognostic cancer survival models across ten common cancer types based on a large patient sample. We compare the performance of different machine learning algorithms and assess the added value of genetic information in cancer prognosis. We also provide ways to improve model explainability, which is critical for model adoption in clinical practice. This study included data from 9977 patients with bladder, breast, colorectal, endometrial, glioma, leukaemia, lung, ovarian, prostate, and renal cancers. Genetic data collected through the 100,000 Genomes Project were linked with clinical and demographic data provided by the National Cancer Registration and Analysis Service, Hospital Episode Statistics and the Office for National Statistics. More than 500 prognostic features were assessed, and four machine learning algorithms were developed: Elastic Net Cox proportional hazards regression, random survival forest, gradient boosting survival and the DeepSurv neural network. Most models achieved good performance, with the C-index varying from 60% in bladder cancer to 80% in glioma and averaging 72% across all cancer types. The different machine learning methods achieved similar performance, with the DeepSurv model slightly underperforming the other methods. Adding genetic data improved performance in endometrial, glioma, ovarian and prostate cancers, showing its potential importance for cancer prognosis. Patient age, stage, grade, referral route, waiting times, pre-existing conditions, previous hospital utilisation, tumour mutational burden and mutations in the TP53 gene were among the most important features in cancer survival modelling.
By offering a comprehensive set of predictive models for cancer survival, this study fills a critical gap in our understanding of cancer prognosis and provides new tools for informing cancer treatment and consequently improving patient outcomes.
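
The performance figures in the abstract are reported as a C-index (concordance index), which measures how often a model ranks two patients' risks in the same order as their observed survival. The following is a minimal sketch of Harrell's C-index for right-censored data, written in plain Python for illustration only. It is not the study's evaluation code; the function name, the handling of tied times (skipped here) and the toy data are all assumptions made for this example.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when patient i also has the higher predicted risk; tied risks count 0.5.
    Pairs with tied times are skipped, which is a simplifying assumption.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy cohort (invented values, not data from the study).
toy_times = [5, 8, 12, 20, 30]                 # follow-up time, e.g. months
toy_events = [True, True, False, True, False]  # True = death observed
toy_risks = [0.9, 0.4, 0.6, 0.5, 0.1]          # higher = predicted worse prognosis

print(concordance_index(toy_times, toy_events, toy_risks))  # prints 0.75
```

In this toy cohort 6 of the 8 comparable pairs are ordered correctly, giving 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported range of 60% to 80% should be read.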
