license: cc-by-nc-4.0
language:
  - sk
base_model:
  - Qwen/Qwen3-0.6B
pipeline_tag: text-generation
datasets:
  - kinit/synthetic-queries-and-ml-instructions
extra_gated_prompt: |
  Access Request Terms:
    By requesting access to the SensAI model, you confirm that you:
      - will use the materials solely for research and non-commercial purposes;
      - will cite the SensAI project and respect the [CC-BY-NC-4.0 License](https://creativecommons.org/licenses/by-nc/4.0/);
      - will not attempt to extract, infer, or reconstruct data from the model or dataset;
      - will ensure that your downstream use complies with applicable laws, regulations, and ethical AI principles.

Qwen3-0.6B QLoRA Adapter

This repository contains a QLoRA adapter for Qwen3-0.6B.

The model was fine-tuned for an instruction-following task ("extract metadata") on the synthetic dataset synthetic-queries-and-ml-instructions.

Description

LoRA Configuration

| Parameter | Value |
|---|---|
| r | 32 |
| lora_alpha | 16 |
| lora_dropout | 0.05 |
| bias | none |
| task_type | CAUSAL_LM |
| target_modules | q_proj, k_proj, v_proj, o_proj |
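
As a rough illustration, the values in this table map onto keyword arguments of `peft.LoraConfig`. The sketch below is an assumption about how the configuration could be expressed, not the authors' training script. It also computes the standard LoRA scaling factor `lora_alpha / r`, and only builds the `LoraConfig` object if `peft` is installed.

```python
# Values copied from the LoRA configuration table.
lora_kwargs = {
    "r": 32,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(f"LoRA scaling factor: {scaling}")  # 16 / 32 = 0.5

# Building the config object is optional and needs the peft package.
try:
    from peft import LoraConfig

    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    lora_config = None  # peft not installed; the dictionary above still documents the setup
```

With `lora_alpha` at half of `r`, the adapter update is scaled by 0.5 before being added to the frozen base weights.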

Training Parameters

| Parameter | Value |
|---|---|
| Max sequence length | 4096 |
| Epochs | 2 |
| Learning rate | 7e-4 |
| Train batch size per device | 4 |
| Eval batch size per device | 8 |
| Gradient accumulation steps | 4 |
| Optimizer | paged_adamw_8bit |
| Scheduler | cosine |
| Warmup ratio | 0.03 |
| FP16 | True |
| Save & Eval steps | 100 |
| Early stopping patience | 3 |
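
To make the interaction of these settings concrete, the following sketch derives the effective batch size and an approximate step schedule. The number of devices and the training-set size are not stated in this card, so `num_devices=1` and `num_train_examples=10_000` are purely hypothetical inputs. The step arithmetic is a simplification of what a trainer does, and early stopping may end training sooner.

```python
import math

# Values copied from the training parameters table.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
EPOCHS = 2
WARMUP_RATIO = 0.03


def effective_batch_size(num_devices: int) -> int:
    """Examples consumed per optimizer step across all devices."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def schedule(num_train_examples: int, num_devices: int = 1) -> dict:
    """Approximate optimizer-step counts for a given training-set size."""
    batch = effective_batch_size(num_devices)
    steps_per_epoch = math.ceil(num_train_examples / batch)
    total_steps = steps_per_epoch * EPOCHS
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    return {
        "effective_batch_size": batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Hypothetical example: 10,000 training rows on a single device.
print(schedule(num_train_examples=10_000))
```

On a single device each optimizer step therefore sees 4 × 4 = 16 examples, and with evaluation every 100 steps the early-stopping patience of 3 corresponds to 300 optimizer steps without improvement.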

Evaluation Results

We evaluated the base Qwen3-0.6B model and our fine-tuned model on the test split (1.5K rows) of the synthetic-queries-and-ml-instructions dataset.

| Metric | Qwen3-0.6B | Qwen3-0.6B + QLoRA (Fine-tuned) |
|---|---|---|
| Invalidly parsed (%) | 47.8 | 0.27 |
| Complete accuracy (%) | 0.47 | 80.6 |
| Missing attributes (%) | 41.73 | 7.93 |
| Extra attributes (%) | 32.27 | 6.53 |
| Incorrect attributes (%) | 41.4 | 5.4 |

  • Invalidly parsed: the percentage of examples in which the model output was not valid JSON or contained no JSON at all.
  • Complete accuracy: the percentage of examples in which all attributes in the output matched the ground-truth attributes.
  • Missing attributes: the percentage of examples in which the model output lacks at least one attribute that is present in the ground truth.
  • Extra attributes: the percentage of examples in which the model output contains at least one attribute that is not present in the ground truth.
  • Incorrect attributes: the percentage of examples in which at least one attribute in the model output has a value that differs from the ground truth.

The percentages for missing, extra and incorrect attributes may exceed 100% in total, since a single example can fall into multiple categories simultaneously. For instance, a model output could omit a required attribute (missing) while also adding an irrelevant one (extra).
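
The metric definitions above can be sketched in code as follows. This is an illustrative reading of the definitions, not the evaluation script used for the reported numbers. In particular, it assumes the output is a flat JSON object, that attribute values are compared by exact equality, and that invalid outputs are counted only under "invalid". The attribute names `task`, `dataset`, and `model` are hypothetical.

```python
import json


def score_example(raw_output: str, truth: dict) -> dict:
    """Return per-example boolean flags for the five metrics."""
    try:
        predicted = json.loads(raw_output)
    except json.JSONDecodeError:
        predicted = None
    if not isinstance(predicted, dict):
        # Unparseable output: count it as invalid only (an assumption).
        return {"invalid": True, "complete": False, "missing": False,
                "extra": False, "incorrect": False}

    missing = any(key not in predicted for key in truth)
    extra = any(key not in truth for key in predicted)
    incorrect = any(predicted[key] != truth[key] for key in truth if key in predicted)
    return {"invalid": False, "complete": not (missing or extra or incorrect),
            "missing": missing, "extra": extra, "incorrect": incorrect}


def aggregate(outputs: list, truths: list) -> dict:
    """Percentage of examples for which each flag is set."""
    flags = [score_example(out, truth) for out, truth in zip(outputs, truths)]
    return {name: 100.0 * sum(f[name] for f in flags) / len(flags) for name in flags[0]}


# Hypothetical ground truth and three model outputs.
truth = {"task": "classification", "dataset": "LicencePlates_ImageDataset"}
outputs = [
    '{"task": "classification", "dataset": "LicencePlates_ImageDataset"}',  # complete
    '{"task": "classification", "model": "CNN"}',  # missing "dataset", extra "model"
    "not a JSON object",  # invalidly parsed
]
print(aggregate(outputs, [truth] * 3))
```

The second output sets both the missing and the extra flag, which is why the three error percentages can add up to more than 100%.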

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch


base_model_name = "Qwen/Qwen3-0.6B" 
ft_model_name = "kinit/qwen3-0.6B-extract-ml-instructions"

tokenizer = AutoTokenizer.from_pretrained(ft_model_name)

# Load base model with specific quantization set up first
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quant_config,
    device_map="auto"
)
# Load LoRA adapter weights into the base model
model = PeftModel.from_pretrained(base_model, ft_model_name).eval()


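# Preprocess input
# The example query is in Slovak. In English: "I want to classify car licence plate
# numbers using a CNN architecture with the help of the 'LicencePlates_ImageDataset' dataset."
prompt = "Chcem realizovať klasifikáciu ŠPZ čísel áut pomocou CNN architektúry za pomoci datasetu 'LicencePlates_ImageDataset'."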
messages = [
    {"role": "user", "content": f"User query: {prompt}"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Perform inference
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1_024,
    temperature=0.25
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

# Model response containing a tool call that needs to be parsed next
# Tool call arguments represent the extracted metadata
model_response = tokenizer.decode(output_ids, skip_special_tokens=True)

print(model_response)
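
The comments above note that the response contains a tool call whose arguments hold the extracted metadata. A minimal parsing sketch follows. It assumes the Qwen-style format in which the call is a JSON object wrapped in `<tool_call>` and `</tool_call>` tags with `name` and `arguments` keys. The helper `extract_metadata`, the tool name, and the attribute names in the sample string are hypothetical and may differ from the actual output of the model.

```python
import json
import re

# Assumed Qwen-style wrapper: <tool_call>{"name": ..., "arguments": {...}}</tool_call>
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_metadata(model_response: str):
    """Return the tool-call arguments as a dict, or None if parsing fails."""
    match = TOOL_CALL_PATTERN.search(model_response)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict):
        return None
    arguments = call.get("arguments")
    return arguments if isinstance(arguments, dict) else None


# Hypothetical response text, used only to exercise the parser.
sample_response = (
    "<tool_call>\n"
    '{"name": "extract_metadata", "arguments": '
    '{"task": "classification", "dataset": "LicencePlates_ImageDataset"}}\n'
    "</tool_call>"
)
print(extract_metadata(sample_response))
```

Returning `None` on any parsing failure keeps invalid outputs easy to count separately from outputs that parse but contain wrong attributes.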

Additional Information

This work was supported by a grant from the Výskumná Agentúra (Research Agency) within the project SensAI - Morálna citlivosť a ľudské práva pre spracovanie jazykov s obmedzenými zdrojmi (Moral Sensitivity and Human Rights for Low-Resource Language Processing), Grant No. 09I01-03-V04-00100/2025/VA.

License & Attribution

This model was created within the SensAI project and is released under the CC-BY-NC-4.0 License. It is a derivative of the Qwen3-0.6B model, which is licensed under the Apache License 2.0.

Access Request Terms

Access to this repository is restricted. Please review and agree to the following terms before requesting access.