Optimized KNN with domain-informed features and LIME explainability for improved breast cancer classification



Authors: AlOmair, Omar; M Zakaria, Ossama; Alabdullatif, Mohammed; Khurshid, Zohaib

Journal: BMC Medical Informatics and Decision Making	Impact factor: 3.800
Year: 2026	Issue: 2026 Mar 16;26(1)
DOI: 10.1186/s12911-026-03427-y

Abstract

BACKGROUND: Breast cancer remains one of the leading causes of cancer-related mortality among women worldwide, with more than 2.3 million new cases and approximately 670,000 deaths reported globally in 2022. Early and accurate diagnosis significantly improves survival rates; however, conventional diagnostic approaches are often time-consuming and subject to inter-observer variability. Although machine learning techniques have demonstrated promising results, many existing studies lack systematic hyperparameter optimization and robust strategies to improve model generalization. This study aimed to develop an optimized and interpretable K-Nearest Neighbour (KNN) framework for breast cancer classification.

METHODS: The Breast Cancer Wisconsin (Diagnostic) Dataset (WDBC), comprising 569 samples with 32 features, was used for model development and evaluation. The proposed framework incorporated advanced preprocessing, biologically informed feature engineering, hybrid feature selection, and systematic hyperparameter tuning using GridSearchCV. An ensemble KNN model employing soft voting was introduced to enhance predictive stability and performance. Model interpretability was improved using the Local Interpretable Model-Agnostic Explanations (LIME) technique to identify feature contributions for malignant and benign classifications.

RESULTS: The optimized KNN model achieved an accuracy of 98.25%, while the ensemble KNN model reached 99.12% accuracy. The proposed framework demonstrated high predictive performance, improved classification stability, and enhanced interpretability through feature-level explanation analysis.

CONCLUSIONS: The findings demonstrate the methodological effectiveness of an optimized and ensemble-based KNN framework for breast cancer classification. While the results indicate strong benchmark performance on the WDBC dataset, the study primarily highlights methodological robustness rather than immediate clinical generalizability. Further validation on multi-center clinical datasets is required before practical deployment in decision-support systems.
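
The two modelling ideas named in the methods, hyperparameter tuning by cross-validated grid search and a soft-voting KNN ensemble, can be illustrated with a minimal plain-Python sketch. This is not the authors' pipeline: the abstract does not give the grid, the ensemble members, or the engineered features, so the ten toy points, the two features, the candidate values of `k`, and the helper names (`knn_proba`, `soft_vote_proba`, `loo_accuracy`, `grid_search`) are all assumptions made for illustration. Leave-one-out cross-validation stands in here for the role GridSearchCV plays in the study.

```python
import math

def knn_proba(train_X, train_y, x, k, weighted=False):
    """Estimate P(malignant) for x from its k nearest training points."""
    nearest = sorted(
        (math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y)
    )[:k]
    if weighted:
        # Inverse-distance weighting; the small constant avoids division by zero.
        weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    else:
        weights = [1.0] * len(nearest)
    positive = sum(w for w, (_, label) in zip(weights, nearest) if label == 1)
    return positive / sum(weights)

def soft_vote_proba(train_X, train_y, x, members):
    """Soft voting: average the class probabilities of several KNN members."""
    probs = [knn_proba(train_X, train_y, x, k, w) for k, w in members]
    return sum(probs) / len(probs)

def loo_accuracy(X, y, k, weighted):
    """Leave-one-out cross-validated accuracy for one (k, weighted) setting."""
    hits = 0
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        pred = int(knn_proba(rest_X, rest_y, X[i], k, weighted) >= 0.5)
        hits += pred == y[i]
    return hits / len(X)

def grid_search(X, y, ks, weightings):
    """Return the (k, weighted) pair with the best leave-one-out accuracy."""
    grid = [(k, w) for k in ks for w in weightings]
    return max(grid, key=lambda params: loo_accuracy(X, y, *params))

# Hypothetical, already-scaled toy data: 0 = benign, 1 = malignant.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (1.1, 1.1), (0.8, 0.9),
     (3.0, 3.0), (3.2, 2.8), (2.9, 3.3), (3.1, 3.1), (2.8, 2.9)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

best_k, best_weighted = grid_search(X, y, ks=[1, 3, 5, 9], weightings=[False, True])
members = [(1, False), (3, False), (5, True)]  # assumed ensemble members
print("best setting:", best_k, best_weighted)
print("P(malignant) near the malignant cluster:",
      soft_vote_proba(X, y, (3.05, 3.0), members))
```

On this toy set an unweighted `k=9` fails every held-out point, because each held-out sample is outnumbered by the opposite class, while the distance-weighted variant does not; that is the kind of difference a systematic grid search is meant to surface. In scikit-learn the corresponding building blocks would be `GridSearchCV`, `KNeighborsClassifier`, and `VotingClassifier(voting="soft")`.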

