Published June 12, 2023 | Version 1
Dataset | Open

Dataset for: A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification

  • 1. The University of Manchester, UK
  • 2. Technology Innovation Institute, UAE

Description

We present a novel solution that combines Large Language Model (LLM) capabilities with formal verification strategies to falsify and automatically repair software vulnerabilities. First, we employ Bounded Model Checking (BMC) to locate the software vulnerability and derive a counterexample. Because they rest on mathematical proofs, counterexamples provide evidence that the system behaves incorrectly or contains a vulnerability, thereby preventing false positive alerts. The detected counterexample, together with the source code, is provided to the LLM engine. Our approach establishes a specialized prompt language for code debugging and generation, so that the LLM can understand the vulnerability's root cause and repair the code. Finally, we use BMC to verify the corrected version of the code generated by the LLM. As a proof of concept, we create ESBMC-AI based on the Efficient SMT-based Context-Bounded Model Checker (ESBMC) and a pre-trained Transformer model, specifically gpt-3.5-turbo, to detect and fix errors in C programs. We generated a dataset comprising 1,000 C code samples, each consisting of 20 to 50 lines of C code. Experimental results show that our proposed method achieved a success rate of up to 80% in repairing vulnerable code, encompassing buffer overflow, arithmetic overflow, and pointer dereference failures. To our knowledge, ESBMC-AI is the first proposal to integrate a Large Language Model (LLM) with software model checking. We advocate that this automated approach has the potential to be incorporated into the continuous integration and deployment (CI/CD) process of the software development lifecycle.
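To illustrate the class of defects targeted, the short C program below is a hypothetical sketch (it is not a sample taken from the uploaded dataset) of an off-by-one buffer overflow: BMC reports the out-of-bounds write together with a concrete counterexample, and the repair the LLM is asked to produce amounts to tightening the loop bound, as noted in the comments.

    #include <stdio.h>

    #define N 10

    int main(void) {
        int a[N];
        /* Off-by-one defect: valid indices are 0..N-1, but the loop also
           writes a[N], so bounded model checking reports an array-bounds
           violation together with a concrete counterexample (e.g. i == 10). */
        for (int i = 0; i <= N; i++)
            a[i] = i * i;
        /* The intended repair changes the loop condition to i < N; the
           patched program is then re-verified with BMC. */
        printf("%d\n", a[N - 1]);
        return 0;
    }

In the workflow described above, this counterexample and the original source are what gpt-3.5-turbo receives in the repair prompt, and the patched program is handed back to ESBMC for re-verification.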

 

The uploaded dataset contains 1,000 C code samples, each 20 to 50 lines long, generated with gpt-3.5-turbo. The upload also includes a statically compiled version of ESBMC with all its dependencies, a classifier script, and the output file.

Files

  • ESBMC-LLM.zip, 110.4 MB (md5:cff61e2a80d972394d4a3b4d8bfc4693)
  • 3.7 kB (md5:ad1b4dc329274c78f6dafbdd1b732717)
  • 9.2 kB (md5:ad887478717a5704baca8e726d65c39e)
  • 541 Bytes (md5:1e74bc6d09782f584f7a3bfdca7f495d)

Total size: 110.4 MB

Additional details

Related works

Is supplement to
Preprint: 10.48550/arXiv.2305.14752 (DOI)