Optimization of Tree-Based Machine Learning Models to Predict the Length of Hospital Stay Using Genetic Algorithm

Abstract

The length of hospital stay (LOS) is a significant indicator of the quality of patient care, hospital efficiency, and operational resilience. Given the importance of LOS in hospital resource management, this research aims to improve the accuracy of LOS prediction through hyperparameter optimization (HPO). Expert physicians were consulted and related studies were reviewed to determine the variables affecting LOS. The electronic medical records of 200 patients in the department of internal medicine of a hospital in Iran were randomly sampled. Because the performance of machine learning (ML) models can vary with the characteristics of the features, several models were applied and evaluated in this study: k-nearest neighbors (KNN), multivariate regression, decision tree (DT), random forest (RF), artificial neural network (ANN), and XGBoost. The genetic algorithm (GA) was applied to optimize the hyperparameters of the tree-based models. In addition, dummy coding, sometimes called one-hot encoding, was used to encode categorical features to increase prediction accuracy. Compared with the other algorithms, the XGBoost model optimized by GA (XGB_GA) achieved higher accuracy and better prediction performance. The mean and median absolute errors on the test dataset for this model were 1.54 and 1.14 days, respectively. The XGB_GA model reduced the mean absolute error by 37%, which is beneficial for the reliable design of a clinical decision support system.
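To make the GA-based hyperparameter search more concrete, the following is a minimal sketch in plain Python and not the authors' implementation. The abstract does not state which hyperparameters, ranges, or GA operators were used, so the search space, tournament selection, uniform crossover, mutation rate, and the `toy_cost` function are all assumptions chosen for illustration. In a real setting, `cost` would train a tree-based model with the candidate hyperparameters and return its cross-validated mean absolute error.

```python
import random
import statistics

# Hypothetical search space for a tree-based regressor; the abstract does not
# list the hyperparameters or ranges that were actually tuned.
SEARCH_SPACE = {
    "max_depth": list(range(2, 11)),
    "n_estimators": [50, 100, 200, 400],
    "learning_rate": [0.01, 0.03, 0.1, 0.3],
}


def absolute_error_summary(y_true, y_pred):
    """Return (mean, median) of the absolute errors, the two metrics reported."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return statistics.mean(errors), statistics.median(errors)


def random_individual(rng):
    """Draw one candidate hyperparameter setting uniformly from the space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene comes from either parent with equal chance."""
    return {name: rng.choice([parent_a[name], parent_b[name]])
            for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene from its allowed values with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def tournament(population, scores, rng, k=3):
    """Pick the lowest-cost individual among k randomly chosen contestants."""
    contestants = rng.sample(range(len(population)), k)
    return population[min(contestants, key=lambda i: scores[i])]


def genetic_search(cost, generations=20, pop_size=12, seed=0):
    """Minimise `cost(params)` and return (best_params, best_cost)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best, best_cost = None, float("inf")
    for _ in range(generations):
        scores = [cost(ind) for ind in population]
        for ind, score in zip(population, scores):
            if score < best_cost:
                best, best_cost = ind, score
        # Elitism: carry the best individual found so far into the next generation.
        next_gen = [best]
        while len(next_gen) < pop_size:
            child = crossover(tournament(population, scores, rng),
                              tournament(population, scores, rng), rng)
            next_gen.append(mutate(child, rng))
        population = next_gen
    return best, best_cost


def toy_cost(params):
    """Synthetic stand-in for a validation MAE (NOT real model performance)."""
    return (abs(params["max_depth"] - 6)
            + abs(params["n_estimators"] - 200) / 100
            + abs(params["learning_rate"] - 0.1) * 10)
```

Calling `genetic_search(toy_cost)` returns the lowest-cost setting seen during the search together with its cost. Because the best individual is always carried forward, running more generations with the same seed can only keep or lower that cost. `absolute_error_summary` shows how the mean and median absolute errors quoted above would be computed from predicted and observed LOS values in days.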
