Graph neural networks embedded with domain knowledge for cyber threat intelligence entity and relationship mining

Abstract

The escalating frequency and severity of cyber-attacks pose formidable challenges to the safeguarding of cyberspace. Named Entity Recognition (NER) is used to rapidly identify threat entities and their relationships in cyber threat intelligence, so that security researchers are promptly informed of emerging threats and can defend against and analyze them more efficiently. However, current models for identifying network threat entities and extracting their relationships suffer from limitations such as inadequate representation of textual semantic information, insufficient granularity in threat entity recognition, and error propagation in relationship extraction. To address these issues, this article proposes a novel model for Network Threat Entity Recognition and Relationship Extraction (CtiErRe). It also redefines seven network threat entities and two types of relationships between threat entities. Specifically, domain knowledge is first collected to build a domain knowledge graph, which is then embedded using graph convolutional networks (GCN) to enhance the feature representation of threat intelligence text. Next, the features from the domain knowledge graph embedding and those generated by the bidirectional encoder representations from transformers (BERT) model are fused using the LayerNorm algorithm. Finally, the fused features are processed by the GlobalPointer algorithm to generate both the threat entity type matrix and the threat entity relation type matrix, from which threat entities and their relationships are identified. To validate the proposed model, we conducted extensive experiments, and the results demonstrate its superiority over existing models. In the threat entity recognition task, the model reaches accuracy and F1 scores of 92.13% and 93.11%, respectively. In the relationship extraction task, it achieves accuracy and F1 scores of 91.45% and 92.45%, respectively.
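
The three stages named in the abstract (GCN embedding of the knowledge graph, LayerNorm fusion with text features, and GlobalPointer-style span scoring) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the hand-written `token_feats` stand in for BERT outputs, each token is assumed to be aligned with exactly one knowledge graph node, fusion is assumed to be LayerNorm over an element-wise sum, all weight matrices are identity matrices, and the rotary position encoding used by the original GlobalPointer is omitted. The helper names (`gcn_layer`, `fuse`, `span_scores`) are hypothetical.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops, then normalize symmetrically by node degree.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def layer_norm(vec, eps=1e-5):
    """Layer normalization of one vector (no learned scale or shift)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def fuse(token_vec, knowledge_vec):
    """Assumed fusion rule: LayerNorm over the element-wise sum."""
    return layer_norm([t + k for t, k in zip(token_vec, knowledge_vec)])

def span_scores(fused, w_q, w_k):
    """GlobalPointer-style scores for ONE type: s[i][j] = q_i . k_j.

    Spans with start > end are masked with -inf. A full model would hold
    one such matrix per entity type and per relation type.
    """
    q = matmul(fused, w_q)
    k = matmul(fused, w_k)
    n = len(fused)
    return [
        [sum(a * b for a, b in zip(q[i], k[j])) if i <= j else float("-inf")
         for j in range(n)]
        for i in range(n)
    ]

# Toy data (all hypothetical): a 3-node chain graph and 3 aligned tokens.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
node_feats = identity                      # one-hot node features
token_feats = [[0.5, -0.2, 0.1],           # stand-in for BERT token vectors
               [0.1, 0.3, -0.6],
               [-0.4, 0.8, 0.2]]

knowledge = gcn_layer(adj, node_feats, identity)
fused = [fuse(t, k) for t, k in zip(token_feats, knowledge)]
scores = span_scores(fused, identity, identity)

# Candidate spans are the (start, end) pairs whose score exceeds a threshold.
spans = [(i, j) for i in range(3) for j in range(3) if scores[i][j] > 0.0]
print(spans)
```

Scoring every (start, end) pair in a single matrix per type is what allows entity spans and relation pairs to be decoded from the same fused features rather than in a pipeline, which is the property the abstract relies on to limit error propagation.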
