Machine learning algorithms predict breast cancer incidence risk: a data-driven retrospective study based on biochemical biomarkers

机器学习算法预测乳腺癌发病风险:一项基于生化生物标志物的数据驱动型回顾性研究

阅读:2

Abstract

BACKGROUND: Current breast cancer prediction models typically rely on personal information and medical history, with limited inclusion of blood-based biomarkers. This study aimed to identify novel breast cancer risk factors using machine learning algorithms. By integrating both personal clinical factors and peripheral blood biochemical biomarkers, it sought to enhance the understanding of breast cancer risk. METHODS: Data were screened and normalized according to predefined inclusion and exclusion criteria. Logistic regression with forward selection and six other machine learning algorithms were employed to identify variables associated with breast cancer incidence. The performance of the models was evaluated using the area under the curve (AUC) through 5-fold cross-validation. RESULTS: The data were divided into a training cohort of 17,360 cases and a testing cohort of 8,551 cases. Logistic regression analysis revealed that breast cancer incidence was increased with age (odds ratio [OR]:1.136, 95% confidence interval [CI]: [1.130, 1.142], P < 0.001), gamma-glutamyl transferase (GGT) (OR: 1.002, 95% CI: [1.000, 1.004], P = 0.014), and alanine transaminase (ALT) (OR: 1.005, 95% CI: [1.001, 1.008], P = 0.008). Furthermore, the six machine learning algorithms consistently identified GGT and ALT as the most significant predictive features. The AUC values obtained from the six models after 5-fold cross-validation ranged from 0.779 to 0.862, with accuracy ranging from 0.780 to 0.841. CONCLUSIONS: Our study identified two biochemical biomarkers (GGT and ALT) as promising indicators for breast cancer prediction. Incorporating these findings into a tailored breast cancer risk prediction model is needed in our future research.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。