Development and validation of machine learning models based on blood routine tests and tumor markers in early screening of primary bronchogenic lung cancer

Abstract

BACKGROUND: Primary bronchogenic lung cancer (PBLC) poses a serious threat to human health, and its high mortality rate is largely attributed to the difficulty of reliable early detection. Early identification of PBLC is therefore essential for subsequent patient treatment. Machine learning (ML) models that use readily accessible data, such as routine blood tests and tumor markers, are a promising approach for enhancing early screening rates. This study aims to construct an ML prediction model based on the combined analysis of routine blood tests and tumor markers, and to establish an intelligent early screening platform for PBLC through systematic technology integration and development, so as to improve the early screening rate of PBLC.

METHODS: This study used samples collected from 2018 to 2023 from a PBLC group and a healthy control (HC) group (n=1,054). Data from The Affiliated Dazu's Hospital of Chongqing Medical University were used for model construction and internal validation (n=767), and data from the Chongqing Dazu District People's Hospital Medical Community were used for external validation (n=287). Feature selection with the least absolute shrinkage and selection operator (LASSO) algorithm retained 14 features, comprising routine blood test indicators and tumor markers. Ten ML models were then trained and compared on eight evaluation metrics, including accuracy, sensitivity, specificity, and area under the curve (AUC), to develop an early PBLC prediction tool.

RESULTS: Among the multiple ML models evaluated for early prediction of PBLC, the eXtreme Gradient Boosting (XGBoost) model achieved an AUC above 0.980 in both internal and external validation. Basophils, lymphocytes, and carcinoembryonic antigen (CEA) ranked highest in feature importance for early PBLC prediction, suggesting that indicators from routine blood tests and tumor markers jointly influence predictive performance and underscoring the practicality of integrating these two types of indicators in model development.

CONCLUSIONS: The ML models developed have substantial application value in the early screening of PBLC and can support the prompt detection and treatment of individuals with PBLC.
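To make the evaluation metrics named in the abstract concrete, here is a minimal sketch of how accuracy, sensitivity, specificity, and AUC could be computed for a binary PBLC-versus-HC classifier. It is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions, and AUC is assumed here to mean the area under the ROC curve. The abstract names only four of the eight metrics, so only those four appear.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the rank (Mann-Whitney) formulation: the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def screening_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold, plus the
    threshold-free AUC. Label 1 = PBLC, label 0 = healthy control."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of PBLC cases flagged
        "specificity": tn / (tn + fp),  # share of healthy controls cleared
        "auc": roc_auc(y_true, scores),
    }


# Toy data invented for illustration only; these are not study results.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
metrics = screening_metrics(y_true, scores)
print(metrics)
```

In a screening setting, sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds. This is one reason a single headline AUC is usually reported alongside threshold-specific metrics.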