---
license: cc-by-nc-4.0
language:
- sk
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
datasets:
- kinit/synthetic-queries-and-ml-instructions
extra_gated_prompt: |
  Access Request Terms:

  By requesting access to the SensAI model, you confirm that you:
  - will use the materials solely for research and non-commercial purposes;
  - will cite the SensAI project and respect the [CC-BY-NC-4.0 License](https://creativecommons.org/licenses/by-nc/4.0/);
  - will not attempt to extract, infer, or reconstruct data from the model or dataset;
  - will ensure that your downstream use complies with applicable laws, regulations, and ethical AI principles.
---

# Qwen3-0.6B QLoRA Adapter

This repository contains a QLoRA adapter for [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B). The model was fine-tuned for an instruction-following task ("extract metadata") on the synthetic dataset [synthetic-queries-and-ml-instructions](https://huggingface.co/datasets/kinit/synthetic-queries-and-ml-instructions).

## Description

- Base model: Qwen3-0.6B
- Adapter: QLoRA
- Task: extract metadata
- Quantization: 4-bit
- GPU: NVIDIA RTX 3090
- Dataset: [synthetic-queries-and-ml-instructions](https://huggingface.co/datasets/kinit/synthetic-queries-and-ml-instructions)

## LoRA Configuration

| Parameter        | Value                          |
|------------------|--------------------------------|
| `r`              | 32                             |
| `lora_alpha`     | 16                             |
| `lora_dropout`   | 0.05                           |
| `bias`           | none                           |
| `task_type`      | CAUSAL_LM                      |
| `target_modules` | q_proj, k_proj, v_proj, o_proj |

## Training Parameters

| Parameter                    | Value            |
|------------------------------|------------------|
| Max sequence length          | 4096             |
| Epochs                       | 2                |
| Learning rate                | 7e-4             |
| Train batch size per device  | 4                |
| Eval batch size per device   | 8                |
| Gradient accumulation steps  | 4                |
| Optimizer                    | paged_adamw_8bit |
| Scheduler                    | cosine           |
| Warmup ratio                 | 0.03             |
| FP16                         | True             |
| Save & Eval steps            | 100              |
| Early stopping patience      | 3                |

A sketch of how these two tables might translate into code is shown after the evaluation results below.

## Evaluation Results

We evaluated the base Qwen3-0.6B model and our fine-tuned model on the test split (1.5K rows) of the dataset mentioned in the description.

| Metric                   | Qwen3-0.6B | Qwen3-0.6B + QLoRA (Fine-tuned) |
|--------------------------|------------|---------------------------------|
| Invalidly parsed (%)     | 47.8       | 0.27                            |
| Complete accuracy (%)    | 0.47       | 80.6                            |
| Missing attributes (%)   | 41.73      | 7.93                            |
| Extra attributes (%)     | 32.27      | 6.53                            |
| Incorrect attributes (%) | 41.4       | 5.4                             |

- Invalidly parsed: The percentage of examples where the model output had an invalid or missing JSON format
- Complete accuracy: The percentage of examples where all attributes in the output matched the ground-truth attributes
- Missing attributes: The percentage of examples where the model output is missing at least one attribute present in the ground-truth example
- Extra attributes: The percentage of examples where the model output contains attributes that are not present in the ground-truth example
- Incorrect attributes: The percentage of examples where the model output contains at least one attribute whose value does not match the ground-truth example

The percentages for missing, extra, and incorrect attributes may exceed 100% in total, since a single example can fall into multiple categories simultaneously. For instance, a model output could omit a required attribute (missing) while also adding an irrelevant one (extra).
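To make the evaluation categories above concrete, here is a minimal sketch of how a single example might be classified. The comparison logic (exact key and value matching on parsed attribute dictionaries) is an assumption for illustration, not the project's actual evaluation script.

```python
# Hypothetical per-example categorisation mirroring the metric definitions
# above; `pred` is the parsed model output (None if JSON parsing failed)
# and `gold` is the ground-truth attribute dictionary.
def categorise(pred: dict | None, gold: dict) -> set[str]:
    if pred is None:
        return {"invalidly_parsed"}
    cats = set()
    if set(gold) - set(pred):
        cats.add("missing_attributes")    # a gold key is absent
    if set(pred) - set(gold):
        cats.add("extra_attributes")      # a predicted key is not in gold
    if any(pred[k] != gold[k] for k in pred.keys() & gold.keys()):
        cats.add("incorrect_attributes")  # shared key, mismatched value
    if not cats:
        cats.add("complete_accuracy")
    return cats
```

Counting each label over the 1.5K test examples and dividing by the total would reproduce percentages in the form reported in the table; because the function returns a set, one example can contribute to several categories at once.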
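As referenced after the Training Parameters table, the following is a hedged reconstruction of the fine-tuning setup from the two configuration tables. It assumes TRL's `SFTTrainer`; the dataset split names, output directory, and exact argument names (which vary across transformers/TRL versions) are assumptions, not the project's actual training script.

```python
# Hedged reconstruction of the fine-tuning setup from the tables above.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    EarlyStoppingCallback,
)
from trl import SFTConfig, SFTTrainer

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # FP16 training per the table
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B", quantization_config=quant_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

# Values mirror the LoRA Configuration table
peft_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Values mirror the Training Parameters table
args = SFTConfig(
    output_dir="qwen3-0.6b-extract-ml-instructions",  # illustrative
    max_seq_length=4096,           # `max_length` in newer TRL releases
    num_train_epochs=2,
    learning_rate=7e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    fp16=True,
    save_steps=100,
    eval_strategy="steps",
    eval_steps=100,
    load_best_model_at_end=True,   # required for early stopping
)

# Assumes the dataset is in a format SFTTrainer can consume directly
dataset = load_dataset("kinit/synthetic-queries-and-ml-instructions")
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # split name is an assumption
    processing_class=tokenizer,          # `tokenizer=...` in older TRL
    peft_config=peft_config,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```

This sketch is only relevant if you want to reproduce or adapt the training; the inference example below loads the published adapter directly.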
## How to use

```python
import json
import re

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model_name = "Qwen/Qwen3-0.6B"
ft_model_name = "kinit/qwen3-0.6B-extract-ml-instructions"

tokenizer = AutoTokenizer.from_pretrained(ft_model_name)

# Load the base model with the matching quantization set up first
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quant_config,
    device_map="auto",
)

# Load the LoRA adapter weights into the base model
model = PeftModel.from_pretrained(base_model, ft_model_name).eval()

# Preprocess the input
# Slovak: "I want to classify car licence plate numbers with a CNN
# architecture using the 'LicencePlates_ImageDataset' dataset."
prompt = "Chcem realizovať klasifikáciu ŠPZ čísel áut pomocou CNN architektúry za pomoci datasetu 'LicencePlates_ImageDataset'."
messages = [
    {"role": "user", "content": f"User query: {prompt}"},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Perform inference
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1_024,
    temperature=0.25,
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Model response containing a tool call whose arguments represent the
# extracted metadata
model_response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(model_response)

# Parse the tool call from the response. NOTE: this assumes the Qwen3-style
# <tool_call>{...}</tool_call> JSON wrapper; adjust the pattern if the
# fine-tuned model emits a different format.
match = re.search(r"<tool_call>\s*(\{.*\})\s*</tool_call>", model_response, re.DOTALL)
if match:
    metadata = json.loads(match.group(1))
    print(metadata)
```

## Additional Information

This work was supported by a Výskumná Agentúra (Research Agency) grant within the project SensAI - Morálna citlivosť a ľudské práva pre spracovanie jazykov s obmedzenými zdrojmi (Moral Sensitivity and Human Rights for Low-Resource Language Processing; Grant No. 09I01-03-V04-00100/2025/VA).

### License & Attribution

This model was created within the SensAI project and is released under the [CC-BY-NC-4.0 License](https://creativecommons.org/licenses/by-nc/4.0/). It is a derivative of the Qwen3-0.6B model, which is licensed under the [Apache License 2.0](https://huggingface.co/Qwen/Qwen3-0.6B/blob/main/LICENSE).

### Access Request Terms

Access to this repository is restricted. Please review and agree to the Access Request Terms before requesting access.