Machine learning-based prediction of gastroesophageal junction cancer using electronic medical records



Abstract

Determining whether esophageal-related symptoms result from gastroesophageal junction cancer (GEJC) is challenging in clinical practice. This study aimed to develop and validate a tool to predict the likelihood of GEJC in patients with esophageal-related symptoms. The electronic medical record system was accessed to identify patients diagnosed with GEJC or gastroesophageal reflux disease (GERD) at our hospital between 2009 and 2023. Predictive variables included demographic characteristics, symptoms, and laboratory results. After propensity score matching, significant features of GEJC were screened using the least absolute shrinkage and selection operator (LASSO), Boruta, and logistic regression analysis. Patients were randomly divided into training and test cohorts in a 2:1 ratio. Four machine learning models were trained and validated to predict GEJC. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), residual analysis, calibration curves, and the Brier score. Additionally, Shapley Additive exPlanations (SHAP) analysis was used to explain the importance of different features. After matching, 401 GEJC patients were enrolled and compared with 401 GERD controls. Using the variables identified by LASSO, Boruta, and logistic regression analysis, we constructed four machine learning models: random forest, generalized linear model, extreme gradient boosting (XGBoost), and support vector machine. XGBoost exhibited better predictive performance than the other models, with an AUC of 0.907 in the test cohort. The calibration curve of the XGBoost model also demonstrated strong consistency between predicted and observed risk, with a Brier score of 0.088. Body mass index, hemoglobin, age, reflux, and dysphagia were found to significantly influence the model output. We developed a well-performing model for predicting GEJC using electronic medical records. Implementing this prediction tool in clinical practice may guide diagnostic strategies and inform appropriate interventions.
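The two headline metrics in the abstract, AUC and the Brier score, can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `brier_score`, the toy labels, and the toy probabilities are all hypothetical and chosen only to show how each metric is defined. AUC is computed here as the fraction of case/control pairs in which the case receives the higher predicted probability (ties count as half), and the Brier score as the mean squared difference between predicted probability and observed outcome.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    A tie between a case and a control counts as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p_case in pos:
        for p_control in neg:
            if p_case > p_control:
                wins += 1.0
            elif p_case == p_control:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not from the study):
# 1 = GEJC, 0 = GERD; the probabilities are made up for illustration.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(f"AUC = {auc:.3f}, Brier score = {brier:.3f}")
```

A higher AUC indicates better discrimination between cases and controls, while a lower Brier score indicates predicted probabilities that sit closer to the observed outcomes. The pairwise loop is quadratic in the number of patients, which is fine for illustration; a rank-based computation would likely be preferable on a full cohort.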
