Obfuscated Malware Detection and Classification in Network Traffic Leveraging Hybrid Large Language Models and Synthetic Data

利用混合大型语言模型和合成数据对网络流量中的混淆恶意软件进行检测和分类

阅读:1

Abstract

Android malware detection remains a critical issue for mobile security. Cybercriminals target Android since it is the most popular smartphone operating system (OS). Malware detection, analysis, and classification have become diverse research areas. This paper presents a smart sensing model based on large language models (LLMs) for developing and classifying network traffic-based Android malware. The network traffic that constantly connects Android apps may contain harmful components that may damage these apps. However, one of the main challenges in developing smart sensing systems for malware analysis is the scarcity of traffic data due to privacy concerns. To overcome this, a two-step smart sensing model Syn-detect is proposed. The first step involves generating synthetic TCP malware traffic data with malicious content using GPT-2. These data are then preprocessed and used in the second step, which focuses on malware classification. This phase leverages a fine-tuned LLM, Bidirectional Encoder Representations from Transformers (BERT), with classification layers. BERT is responsible for tokenization, generating word embeddings, and classifying malware. The Syn-detect model was tested on two Android malware datasets: CIC-AndMal2017 and CIC-AAGM2017. The model achieved an accuracy of 99.8% on CIC-AndMal2017 and 99.3% on CIC-AAGM2017. The Matthew's Correlation Coefficient (MCC) values for the predictions were 99% for CIC-AndMal2017 and 98% for CIC-AAGM2017. These results demonstrate the strong performance of the Syn-detect smart sensing model. Compared to the latest research in Android malware classification, the model outperformed other approaches, delivering promising results.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。