Deep Learning: A Heuristic Three-Stage Mechanism for Grid Searches to Optimize the Future Risk Prediction of Breast Cancer Metastasis Using EHR-Based Clinical Data

深度学习:一种基于启发式三阶段网格搜索机制,用于优化基于电子病历临床数据的乳腺癌转移未来风险预测

阅读:1

Abstract

Background: A grid search, at the cost of training and testing a large number of models, is an effective way to optimize the prediction performance of deep learning models. A challenging task concerning grid search is time management. Without a good time management scheme, a grid search can easily be set off as a "mission" that will not finish in our lifetime. In this study, we introduce a heuristic three-stage mechanism for managing the running time of low-budget grid searches with deep learning, sweet-spot grid search (SSGS) and randomized grid search (RGS) strategies for improving model prediction performance, in an application of predicting the 5-year, 10-year, and 15-year risk of breast cancer metastasis. Methods: We develop deep feedforward neural network (DFNN) models and optimize the prediction performance of these models through grid searches. We conduct eight cycles of grid searches in three stages, focusing on learning a reasonable range of values for each of the adjustable hyperparameters in Stage 1, learning the sweet-spot values of the set of hyperparameters and estimating the unit grid search time in Stage 2, and conducting multiple cycles of timed grid searches to refine model prediction performance with SSGS and RGS in Stage 3. We conduct various SHAP analyses to explain the prediction, including a unique type of SHAP analyses to interpret the contributions of the DFNN-model hyperparameters. Results: The grid searches we conducted improved the risk prediction of 5-year, 10-year, and 15-year breast cancer metastasis by 18.6%, 16.3%, and 17.3%, respectively, over the average performance of all corresponding models we trained using the RGS strategy. Conclusions: Grid search can greatly improve model prediction. Our result analyses not only demonstrate best model performance but also characterize grid searches from various aspects such as their capabilities of discovering decent models and the unit grid search time. The three-stage mechanism worked effectively. It not only made our low-budget grid searches feasible and manageable but also helped improve the model prediction performance of the DFNN models. Our SHAP analyses not only identified clinical risk factors important for the prediction of future risk of breast cancer metastasis, but also DFNN-model hyperparameters important to the prediction of performance scores.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。