KerSpecGen: Co-piloting formal Kernel specification synthesis with refined knowledge graphs and large language models

KerSpecGen:利用精细的知识图谱和大型语言模型协同驱动形式化内核规范合成

阅读:1

Abstract

Formal verification ensures software correctness but faces challenges in kernel specification writing, which is labor-intensive, expertise-dependent, and limited to specific targets. For complex microkernels like seL4, these issues significantly reduce the practicality of formal methods. To address this, we propose KerSpecGen, a Large Language Model (LLM) based framework for synthesizing formal kernel specifications, which leverages knowledge graphs to bridge requirement descriptions and specification properties. KerSpecGen's core design includes three key components: 1) constructing a refined knowledge graph to represent progressive specification relationships (with deduplication, property judgment, and source annotation to improve accuracy); 2) building a custom dataset to fine-tune a model that converts specification properties into code; 3) designing LLM-oriented synthesis templates to transform generated code into complete candidate specification programs. Evaluation on our KerSpecProperty benchmark (30 seL4 function modules, 624 properties) shows KerSpecGen outperforms the Few-shot method by an average of 7.06 percentage points in BLEU-4. Specifically, for the Llama-3.1-405B model, KerSpecGen achieves better specification quality than the Few-shot method in 23 out of 30 functional modules. To our knowledge, this is the first work to generate formal specifications for complex microkernels. KerSpecGen yields high-quality reference specifications. While requiring minor adjustments for direct execution, it substantially reduces the burden of writing specifications from scratch.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。