---
license: cc-by-nc-4.0
language:
- sk
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
datasets:
- kinit/synthetic-queries-and-ml-instructions
extra_gated_prompt: |
  Access Request Terms:

  By requesting access to the SensAI model, you confirm that you:
  - will use the materials solely for research and non-commercial purposes;
  - will cite the SensAI project and respect the [CC-BY-NC-4.0 License](https://creativecommons.org/licenses/by-nc/4.0/);
  - will not attempt to extract, infer, or reconstruct data from the model or dataset;
  - will ensure that your downstream use complies with applicable laws, regulations, and ethical AI principles.
---

# Qwen3-0.6B QLoRA Adapter

This repository contains a QLoRA adapter for Qwen3-0.6B.

The model was fine-tuned on an instruction-following task ("extract metadata") using the synthetic dataset [synthetic-queries-and-ml-instructions](https://huggingface.co/datasets/kinit/synthetic-queries-and-ml-instructions).
## Description
- Base model: Qwen3-0.6B
- Adapter: QLoRA
- Task: extract metadata
- Quantization: 4-bit
- GPU: NVIDIA RTX 3090
- Dataset: synthetic-queries-and-ml-instructions
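
The dataset can be pulled from the Hub with the `datasets` library. A minimal sketch (the dataset may be gated, and the split layout is an assumption based on the evaluation section below):

```python
from datasets import load_dataset

# Dataset ID from this card's metadata; access may require accepting
# the gating terms on the Hub first.
ds = load_dataset("kinit/synthetic-queries-and-ml-instructions")
print(ds)  # inspect the available splits, e.g. a ~1.5K-row test split
```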
## LoRA Configuration

| Parameter | Value |
|---|---|
| r | 32 |
| lora_alpha | 16 |
| lora_dropout | 0.05 |
| bias | none |
| task_type | CAUSAL_LM |
| target_modules | q_proj, k_proj, v_proj, o_proj |
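
For reference, the table maps onto a `peft` `LoraConfig` along these lines (a sketch reconstructed from the values above, not the project's actual training script):

```python
from peft import LoraConfig

# LoRA configuration mirroring the table above.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```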
## Training Parameters
| Parameter | Value |
|---|---|
| Max sequence length | 4096 |
| Epochs | 2 |
| Learning rate | 7e-4 |
| Train batch size per device | 4 |
| Eval batch size per device | 8 |
| Gradient accumulation steps | 4 |
| Optimizer | paged_adamw_8bit |
| Scheduler | cosine |
| Warmup ratio | 0.03 |
| FP16 | True |
| Save & Eval steps | 100 |
| Early stopping patience | 3 |
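
These values correspond roughly to Hugging Face `TrainingArguments` plus an `EarlyStoppingCallback`. The sketch below is our reconstruction under that assumption; the output directory and the steps-based strategy fields are placeholders, not the published training script:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Reconstruction of the table above as TrainingArguments. The max
# sequence length (4096) is applied at tokenization time, not here.
training_args = TrainingArguments(
    output_dir="qwen3-0.6b-qlora",  # placeholder
    num_train_epochs=2,
    learning_rate=7e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    fp16=True,
    save_steps=100,
    eval_steps=100,
    eval_strategy="steps",
    save_strategy="steps",
    load_best_model_at_end=True,  # required for early stopping
)

# Early stopping with patience 3, as listed in the table.
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
```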
## Evaluation Results

We evaluated the base Qwen3-0.6B model and our fine-tuned model on the test split (1.5K rows) of the dataset mentioned in the description.
| Metric | Qwen3-0.6B | Qwen3-0.6B + QLoRA (Fine-tuned) |
|---|---|---|
| Invalidly parsed (%) | 47.8 | 0.27 |
| Complete accuracy (%) | 0.47 | 80.6 |
| Missing attributes (%) | 41.73 | 7.93 |
| Extra attributes (%) | 32.27 | 6.53 |
| Incorrect attributes (%) | 41.4 | 5.4 |
- Invalidly parsed: the percentage of examples where the model output had an invalid or missing JSON format.
- Complete accuracy: the percentage of examples where all attributes in the output matched the ground-truth attributes.
- Missing attributes: the percentage of examples where the model output is missing at least one attribute that is present in the ground-truth example.
- Extra attributes: the percentage of examples where the model output contains attributes that are not present in the ground-truth example.
- Incorrect attributes: the percentage of examples where the model output has incorrect attribute values compared to the ground-truth example.
The percentages for missing, extra and incorrect attributes may exceed 100% in total, since a single example can fall into multiple categories simultaneously. For instance, a model output could omit a required attribute (missing) while also adding an irrelevant one (extra).
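
To make the definitions concrete, one way to assign an example to these categories is sketched below. Our actual evaluation script is not published here; the flat JSON attribute layout and the function name are assumptions for illustration:

```python
import json

def categorize(output_str: str, truth: dict) -> set[str]:
    """Assign a single example to the metric categories defined above.

    `output_str` is the raw model output, `truth` the ground-truth
    attribute dict. Illustrative sketch, not the project's eval script.
    """
    try:
        pred = json.loads(output_str)
    except (json.JSONDecodeError, TypeError):
        return {"invalidly_parsed"}

    categories = set()
    missing = truth.keys() - pred.keys()    # in truth, absent in output
    extra = pred.keys() - truth.keys()      # in output, absent in truth
    shared = truth.keys() & pred.keys()
    incorrect = {k for k in shared if pred[k] != truth[k]}

    if missing:
        categories.add("missing_attributes")
    if extra:
        categories.add("extra_attributes")
    if incorrect:
        categories.add("incorrect_attributes")
    if not categories:
        categories.add("complete_accuracy")
    return categories
```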
## How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_name = "Qwen/Qwen3-0.6B"
ft_model_name = "kinit/qwen3-0.6B-extract-ml-instructions"

tokenizer = AutoTokenizer.from_pretrained(ft_model_name)

# Load the base model with the 4-bit quantization set up first
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quant_config,
    device_map="auto"
)

# Load the LoRA adapter weights into the base model
model = PeftModel.from_pretrained(base_model, ft_model_name).eval()

# Preprocess the input
# (Slovak: "I want to classify car license plate numbers with a CNN
# architecture using the 'LicencePlates_ImageDataset' dataset.")
prompt = "Chcem realizovať klasifikáciu ŠPZ čísel áut pomocou CNN architektúry za pomoci datasetu 'LicencePlates_ImageDataset'."
messages = [
    {"role": "user", "content": f"User query: {prompt}"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Perform inference; do_sample=True is needed for temperature to take effect
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1_024,
    do_sample=True,
    temperature=0.25
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Model response containing a tool call that needs to be parsed next;
# the tool-call arguments represent the extracted metadata
model_response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(model_response)
```
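
The printed response should contain a tool call whose arguments are the extracted metadata. Qwen-family models wrap tool calls in `<tool_call> ... </tool_call>` tags around a JSON object, so a minimal parsing step might look like the sketch below; adjust the pattern to the adapter's actual output format:

```python
import json
import re

# Depending on tokenizer settings, the <tool_call> tags may or may not
# survive decoding, so fall back to parsing the raw response as JSON.
match = re.search(r"<tool_call>\s*(.*?)\s*</tool_call>", model_response, re.DOTALL)
payload = match.group(1) if match else model_response.strip()

try:
    tool_call = json.loads(payload)
    # The tool-call arguments hold the extracted metadata.
    print(tool_call.get("arguments", tool_call))
except json.JSONDecodeError:
    print("Could not parse a JSON tool call from the response.")
```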
## Additional Information

This work was supported by a Výskumná Agentúra grant within the project SensAI - Morálna citlivosť a ľudské práva pre spracovanie jazykov s obmedzenými zdrojmi (Moral Sensitivity and Human Rights for Low-Resource Language Processing; Grant No. 09I01-03-V04-00100/2025/VA).
## License & Attribution

This model was created within the SensAI project and is released under the [CC-BY-NC-4.0 License](https://creativecommons.org/licenses/by-nc/4.0/). It is a derivative of the Qwen3-0.6B model, which is licensed under the Apache License 2.0.
## Access Request Terms
Access to this repository is restricted. Please review and agree to the following terms before requesting access.