Analyzing and comparing the effectiveness of malware detection: A study of machine learning approaches

分析和比较恶意软件检测的有效性:机器学习方法研究

阅读:1

Abstract

The Internet has become a vital source of knowledge and communication in recent times. Continuous technological advancements have changed the way businesses operate, and everyone today lives in the digital world of engineering. Because of the Internet of Things (IoT) and its applications, people's impressions of the information revolution have improved. Malware detection and categorization are becoming more of a problem in the cybersecurity world. As a result, strong security on the Internet could protect billions of internet users from harmful behavior. In malware detection and classification techniques, several types of deep learning models are used; however, they still have limitations. This study will explore malware detection and classification elements using modern machine learning (ML) approaches, including K-Nearest Neighbors (KNN), Extra Tree (ET), Random Forest (RF), Logistic Regression (LR), Decision Tree (DT), and neural network Multilayer Perceptron (nnMLP). The proposed study uses the publicly available dataset UNSWNB15. In our proposed work, we applied the feature encoding method to convert our dataset into purely numeric values. After that, we applied a feature selection method named Term Frequency-Inverse Document Frequency (TFIDF) based on entropy for the best feature selection. The dataset is then balanced and provided to the ML models for classification. The study concludes that Random Forest, out of all tested ML models, yielded the best accuracy of 97.68 %.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。