Explainable Machine Learning Models for Colorectal Cancer Prediction Using Clinical Laboratory Data

利用临床实验室数据构建用于结直肠癌预测的可解释机器学习模型

阅读:1

Abstract

IntroductionEarly diagnosis of colorectal cancer (CRC) poses a significant clinical challenge. This study aims to develop machine learning (ML) models for CRC risk prediction using clinical laboratory data.MethodsThis retrospective, single-center study analyzed laboratory examination data from healthy controls (HC), polyp patients (Polyp), and CRC patients between 2013 and 2023. Five ML algorithms, including adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), decision tree (DT), logistic regression (LR), and random forest (RF), were employed to classify subjects into HC vs Polyp vs CRC, HC vs CRC, and Polyp vs CRC, respectively.ResultsThis study included 31 539 subjects: 11 793 HCs, 10 125 polyp patients, and 9621 CRC patients. The XGBoost model achieved the highest AUCs of 0.966 for differentiating HC from CRC and 0.881 for Polyp from CRC, outperforming carcino-embryonic antigen (CEA) and fecal occult blood testing (FOBT) tests. This model could also identify CEA-negative or FOBT-negative CRC patients. Incorporating stool miR-92a detection into the model further improved diagnostic performance. Shapley additive explanations (SHAP) plots indicated that FOBT, CEA, lymphocyte percentage (LYMPH%), and hematocrit (HCT) were the most significant features contributing to CRC diagnosis. Additionally, a computational tool for predicting CRC risk based on the optimal model was developed, designed for researchers with programming experience.ConclusionFive ML models for CRC diagnosis, based on ten routine laboratory test items, were developed, achieving higher diagnostic accuracies than traditional CRC biomarkers. The diagnostic capabilities of these ML models can be further enhanced by including stool miR-92a levels.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。