Prediction of prognosis and survival of patients with gastric cancer by a weighted improved random forest model: an application of machine learning in medicine



Abstract

INTRODUCTION: Predicting the survival status of patients from their prognosis is essential, as it helps physicians evaluate treatment decisions. Random forest is a strong machine learning algorithm even without modification. We propose a new weighting method for random forest and apply it to gastric cancer patient data from the Surveillance, Epidemiology, and End Results (SEER) program. We also evaluate the generalization ability of the weighted random forest algorithm on 10 public medical datasets. Furthermore, for the same weighting scheme, we explore the difference between using out-of-bag (OOB) data and using the full training set as the weighting basis.

MATERIAL AND METHODS: The experiment included 110 697 cases of gastric cancer diagnosed between 1975 and 2016, obtained from the SEER database. In addition, 10 public medical datasets were used to evaluate the generalization ability of the weighted random forest algorithm.

RESULTS: On the SEER gastric cancer patient data, the weighted random forest algorithm improved accuracy by 0.79% compared with the original random forest. In AUC, macro-averaging increased by 2.32% and micro-averaging increased by 0.51% on average. Among the 10 public datasets, the random forest weighted by accuracy performed best on 6 datasets, with an average increase of 1.44% in accuracy and 1.2% in AUC.

CONCLUSIONS: Compared with the original random forest, the weighted random forest model shows a significant improvement in performance, and using all training data as the weighting basis works better than using OOB data.
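
The abstract does not give the weighting formula, so the following is only a minimal sketch of one plausible reading: each tree's vote is weighted by its classification accuracy, measured either on that tree's OOB samples or on the full training set. Everything here is an assumption for illustration, not the authors' implementation: the function names (`fit_stump`, `fit_weighted_forest`, `predict`), the `basis` parameter, the toy data, and the use of one-split decision stumps in place of full decision trees.

```python
import random
from collections import defaultdict


def fit_stump(X, y):
    """Fit a one-split decision stump (a stand-in for a full decision tree)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            # Majority label on each side; an empty right side reuses the left label.
            l_lab = max(sorted(set(left)), key=left.count)
            r_lab = max(sorted(set(right)), key=right.count) if right else l_lab
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    _, j, t, l_lab, r_lab = best
    return lambda row: l_lab if row[j] <= t else r_lab


def accuracy(tree, X, y, idx):
    """Fraction of the samples in idx that the tree classifies correctly."""
    return sum(tree(X[i]) == y[i] for i in idx) / len(idx) if idx else 0.0


def fit_weighted_forest(X, y, n_trees=25, basis="all", seed=0):
    """Bag stumps and attach an accuracy weight to each one.

    basis="oob": weight = accuracy on samples left out of the tree's bootstrap.
    basis="all": weight = accuracy on the full training set.
    """
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample indices
        tree = fit_stump([X[i] for i in boot], [y[i] for i in boot])
        if basis == "oob":
            in_bag = set(boot)
            idx = [i for i in range(n) if i not in in_bag]
        else:
            idx = list(range(n))
        forest.append((tree, accuracy(tree, X, y, idx)))
    return forest


def predict(forest, row):
    """Weighted vote: each tree adds its weight to the class it predicts."""
    votes = defaultdict(float)
    for tree, weight in forest:
        votes[tree(row)] += weight
    return max(sorted(votes), key=votes.get)


# Toy data (hypothetical): the first feature separates the two classes.
X = [[i, (i * 7) % 5] for i in range(20)]
y = [0 if i < 10 else 1 for i in range(20)]

forest_all = fit_weighted_forest(X, y, basis="all")
forest_oob = fit_weighted_forest(X, y, basis="oob")
```

Calling `predict(forest_all, row)` and `predict(forest_oob, row)` on the same rows gives a simple way to compare the two weighting bases. The original (unweighted) random forest corresponds to setting every weight to 1.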
