R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation
Paper • 2504.04699 • Published
A fine-tuned code security model that detects vulnerabilities and generates fixes across multiple programming languages.
Built on Qwen2.5-Coder-7B-Instruct with QLoRA fine-tuning on 90K+ real-world vulnerability-fix pairs from CVE/CWE databases.
Given any code snippet, the model identifies vulnerabilities, names the relevant CWE category, and generates a fixed version of the code.
| Component | Details |
|---|---|
| Base Model | Qwen/Qwen2.5-Coder-7B-Instruct |
| Method | QLoRA (4-bit NF4 quantization) |
| LoRA Config | r=16, α=32, dropout=0.05 |
| Target Modules | q, k, v, o, gate, up, down projections |
| Training | SFT with assistant-only loss |
| Max Length | 2048 tokens |
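The LoRA setup in the table can be sanity-checked with simple arithmetic: a targeted projection of shape (d_out, d_in) gains r·(d_in + d_out) trainable parameters, and the low-rank update is scaled by α/r. A minimal sketch (the hidden size used here is illustrative, not read from Qwen2.5-Coder's actual config):

```python
# LoRA replaces W·x with W·x + (alpha/r) · B(A·x), where A is (r, d_in)
# and B is (d_out, r); only A and B are trained.

def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters LoRA adds to one projection."""
    return r * (d_in + d_out)

r, alpha = 16, 32        # values from the table above
scaling = alpha / r      # update scale: 2.0

d = 3584                 # illustrative hidden size, not the model's real dims
added = lora_params(d, d, r)
print(f"scaling={scaling}, params per {d}x{d} projection={added:,}")
```

Because only these small A/B matrices are trained (and the base weights stay 4-bit), the optimizer state fits alongside a 7B model on a single consumer GPU.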
Combined from 3 curated vulnerability datasets totaling ~90K samples:
| Dataset | Samples | Languages | Source |
|---|---|---|---|
| MegaVul | ~17K | C/C++ | 992 repos, 169 CWE types, 2006-2023 |
| TitanVul | ~38K | C, C++, Java, Python, JS | Aggregated from 7 sources, deduplicated |
| CleanVul | ~26K | Multi-language | LLM-filtered, vulnerability_score ≥ 1 |
| Safe samples | ~12K | Multi-language | Fixed code from TitanVul (negative examples) |
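One plausible way the three sources could be merged, deduplicated, and score-filtered is sketched below; the field names (`code`, `label`, `vulnerability_score`) and the hash-based dedup are assumptions for illustration, not the actual pipeline.

```python
import hashlib

def dedup_and_filter(samples, min_score=1):
    """Drop duplicate snippets and low-confidence vulnerable samples.

    Safe samples (label == 0) are kept regardless of score; vulnerable
    ones must meet a CleanVul-style vulnerability_score threshold.
    """
    seen, kept = set(), []
    for s in samples:
        key = hashlib.sha256(s["code"].encode()).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        if s["label"] == 1 and s.get("vulnerability_score", 0) < min_score:
            continue
        kept.append(s)
    return kept

data = [
    {"code": "strcpy(buf, input);", "label": 1, "vulnerability_score": 2},
    {"code": "strcpy(buf, input);", "label": 1, "vulnerability_score": 2},  # duplicate
    {"code": "gets(buf);", "label": 1, "vulnerability_score": 0},           # below threshold
    {"code": "strncpy(buf, input, 63);", "label": 0},                       # safe sample
]
print(len(dedup_and_filter(data)))  # 2
```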
CleanVul filtering: `vulnerability_score >= 1` (removes ~27% noise).

Install dependencies:

```bash
pip install transformers peft torch bitsandbytes accelerate
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the 4-bit quantized base model and apply the LoRA adapter
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "jacobmahon/zero-day-exploit-scanner-fixer")
tokenizer = AutoTokenizer.from_pretrained("jacobmahon/zero-day-exploit-scanner-fixer")

# Scan code
messages = [
    {"role": "system", "content": "You are a security expert. Analyze code for vulnerabilities and provide fixes."},
    {"role": "user", "content": "Analyze this C code for vulnerabilities:\n```c\nvoid process(char *input) {\n    char buf[64];\n    strcpy(buf, input);\n}\n```"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.3, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
```bash
# Scan a code string
python inference.py --code "char buf[10]; gets(buf);"

# Scan a file
python inference.py --file vulnerable.c

# Interactive mode
python inference.py --interactive
```
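The command-line interface above could be wired with `argparse` roughly as follows; this is a sketch of the interface only, not the repository's actual `inference.py`.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the usage shown above (interface sketch)."""
    p = argparse.ArgumentParser(description="Scan code for vulnerabilities.")
    group = p.add_mutually_exclusive_group(required=True)
    group.add_argument("--code", help="scan a code string")
    group.add_argument("--file", help="scan a source file")
    group.add_argument("--interactive", action="store_true",
                       help="read snippets from stdin in a loop")
    return p

args = build_parser().parse_args(["--code", "char buf[10]; gets(buf);"])
print(args.code)
```

A mutually exclusive group keeps the three input modes from being combined in one invocation.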
The model has been trained on 169+ CWE types including:
| Category | CWE Examples |
|---|---|
| Memory Safety | CWE-119 (Buffer Overflow), CWE-120 (Buffer Copy), CWE-416 (Use After Free), CWE-476 (NULL Pointer Deref) |
| Injection | CWE-79 (XSS), CWE-89 (SQL Injection), CWE-78 (OS Command Injection) |
| Authentication | CWE-287 (Improper Auth), CWE-306 (Missing Auth), CWE-798 (Hardcoded Credentials) |
| Cryptography | CWE-327 (Broken Crypto), CWE-330 (Insufficient Randomness) |
| Race Conditions | CWE-362 (Race Condition), CWE-367 (TOCTOU) |
| Input Validation | CWE-20 (Improper Input Validation), CWE-190 (Integer Overflow) |
| Access Control | CWE-862 (Missing Authorization), CWE-863 (Incorrect Authorization) |
| Information Disclosure | CWE-200 (Info Exposure), CWE-209 (Error Message Info Leak) |
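For contrast with the model-based approach, the memory-safety rows above can be crudely approximated by a pattern scan for classic unbounded-copy functions; the function list here is illustrative and covers a tiny fraction of what the model is trained on.

```python
import re

# Classic CWE-120/CWE-119 indicators: unbounded copy/input functions in C.
UNSAFE_CALLS = re.compile(r"\b(gets|strcpy|strcat|sprintf)\s*\(")

def naive_scan(code: str) -> list[str]:
    """Return unsafe C calls found in a snippet (heuristic only)."""
    return UNSAFE_CALLS.findall(code)

print(naive_scan("char buf[64];\nstrcpy(buf, input);"))  # ['strcpy']
```

Heuristics like this miss context (e.g. bounds checks before the call) and cannot propose fixes, which is the gap a fine-tuned model targets.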
Based on research from R2Vul (arXiv 2504.04699), linked at the top of this card.
```python
learning_rate = 2e-4                 # LoRA-optimized (10x base)
num_train_epochs = 3
per_device_train_batch_size = 2
gradient_accumulation_steps = 8      # effective batch = 16
max_length = 2048
lr_scheduler = "cosine"
warmup_steps = 100
optimizer = "adamw_torch"
quantization = "4-bit NF4 (double quant)"
lora_rank = 16
lora_alpha = 32
lora_dropout = 0.05
```
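The schedule above (linear warmup for 100 steps to a 2e-4 peak, then cosine decay) can be reproduced in a few lines; `total_steps` here is illustrative, since the real value depends on dataset size and batch settings.

```python
import math

def lr_at(step: int, peak_lr: float = 2e-4,
          warmup_steps: int = 100, total_steps: int = 1000) -> float:
    """Linear warmup to peak_lr, then cosine decay to 0 (schedule sketch)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at(50))    # mid-warmup (~1e-4)
print(lr_at(100))   # peak (2e-4)
print(lr_at(1000))  # end of training (~0)
```

Note the effective batch size of 16 comes from 2 samples per device × 8 gradient-accumulation steps.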
Recommended evaluation benchmarks:
To reproduce or fine-tune further:
```bash
# Install dependencies
pip install transformers trl torch datasets trackio accelerate peft bitsandbytes

# Run training (requires 24GB+ GPU)
python train.py
```
See `train.py` in this repository for the full training script.
Apache 2.0