A novel aggregated coefficient ranking based feature selection strategy for enhancing the diagnosis of breast cancer classification using machine learning

一种基于聚合系数排序的新型特征选择策略,用于增强机器学习在乳腺癌分类诊断中的应用

阅读:1

Abstract

Effective Breast cancer (BC) analysis is crucial for early prognosis, controlling cancer recurrence, timely medical intervention, and determining appropriate treatment procedures. Additionally, it plays a significant role in optimizing mortality rates among women with breast cancer and increasing the average lifespan of patients. This can be achieved by performing effective critical feature analysis of the BC by picking superlative features through significant ranking-based Feature Selection (FS). Various authors have developed strategies relying on single FS, but this approach may not yield excellent results and could lead to various consequences, including time and storage complexity issues, inaccurate results, poor decision-making, and difficult interpretation of models. Therefore, critical data analysis can facilitate the development of a robust ranking methodology for effective feature selection. To solve these problems, this paper suggests a new method called Aggregated Coefficient Ranking-based Feature Selection (ACRFS), which is based on tri chracteristic behavioral criteria. This strategy aims to significantly improve the ranking for an effective Attribute Subset Selection (ASSS). The proposed method utilized computational problem solvers such as chi-square, mutual information, correlation, and rank-dense methods. The work implemented the introduced methodology using Wisconsin-based breast cancer data and applied the Synthetic Minority Oversampling Technique (SMOTE) to the obtained data subset. Later, we employed models such as decision trees, support vector machines, k-nearest neighbors, random forests, stochastic gradient descent, and Gaussian naive bayes to determine the type of cancer. The classification metrics such as accuracy, precision, recall, F1 score, kappa score, and Matthews coefficient were utilized to evaluate the effectiveness of the suggested ACRFS approach. The proposed method has demonstrated superior outcomes with fewer features and a minimal time complexity.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。