Qwen2.5-1.5B-Instruct-KAI
Introduction
Llama3.2-1B-Instruct-KAI, Llama3.2-3B-Instruct-KAI, Qwen2.5-0.5B-Instruct-KAI, Qwen2.5-1.5B-Instruct-KAI, and Qwen2.5-3B-Instruct-KAI are a collection of models fine-tuned from the open Qwen2.5 and Llama3.2 instruct models. They are optimized for Vietnamese language understanding and generation tasks such as reading comprehension, information extraction, question answering, and summarization.
Quickstart
The snippet below shows how to load one of the models with the transformers library and run a chat-style prompt, for example for question answering or summarization.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "kiki-ailab/Qwen2.5-3B-Instruct-KAI"

# Load the model and tokenizer; device_map="auto" places the weights on GPU if available.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Xin chào!"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template and append the generation prompt.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so that only the newly generated answer is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
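For the examples below, it can be convenient to wrap the generation steps in a small helper that reuses the model and tokenizer loaded above. This is a minimal sketch; the function name generate_response and its defaults are our own choices, not part of the released code.

def generate_response(prompt: str, max_new_tokens: int = 512) -> str:
    """Run one chat turn through the model and return the decoded answer."""
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt.
    answer_ids = output_ids[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(answer_ids, skip_special_tokens=True)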
Examples
Example 1:
prompt = """Dฦฐแปi ฤรขy lร mแปt sแป tร i liแปu / vฤn bแบฃn:
<DOC id="doc-1">
Theo mแปt nghiรชn cแปฉu gแบงn ฤรขy, biแบฟn ฤแปi khรญ hแบญu ฤรฃ lร m gia tฤng tแบงn suแบฅt vร cฦฐแปng ฤแป cแปงa cรกc hiแปn tฦฐแปฃng thแปi tiแบฟt cแปฑc ฤoan, bao gแปm bรฃo, hแบกn hรกn vร lลฉ lแปฅt. Cรกc khu vแปฑc ven biแปn ฤรดng Nam ร cรณ nguy cฦก cao nhแบฅt do nฦฐแปc biแปn dรขng vร hiแปn tฦฐแปฃng xรขm nhแบญp mแบทn.
</DOC>
<DOC id="doc-2">
Mแปt bรกo cรกo tแปซ Ngรขn hร ng Thแบฟ giแปi cho thแบฅy rแบฑng biแบฟn ฤแปi khรญ hแบญu sแบฝ แบฃnh hฦฐแปng nghiรชm trแปng ฤแบฟn sแบฃn xuแบฅt nรดng nghiแปp, ฤแบทc biแปt lร แป cรกc nฦฐแปc ฤang phรกt triแปn, nฦกi nแปn kinh tแบฟ phแปฅ thuแปc lแปn vร o nรดng nghiแปp. Cแปฅ thแป, nฤng suแบฅt cรขy trแปng cรณ thแป giแบฃm tแปซ 10% ฤแบฟn 25% trong 30 nฤm tแปi.
</DOC>
<DOC id="doc-3">
Mแปt sรกng kiแบฟn quแปc tแบฟ ฤรฃ ฤฦฐแปฃc khแปi ฤแปng nhแบฑm giแบฃm thiแปu tรกc ฤแปng cแปงa biแบฟn ฤแปi khรญ hแบญu thรดng qua viแปc thรบc ฤแบฉy sแปญ dแปฅng nฤng lฦฐแปฃng tรกi tแบกo vร giแบฃm phรกt thแบฃi carbon. Cรกc nฦฐแปc phรกt triแปn ฤรฃ cam kแบฟt hแป trแปฃ tร i chรญnh cho cรกc quแปc gia dแป
bแป tแปn thฦฐฦกng nhแบฅt, nhฦฐng viแปc triแปn khai vแบซn gแบทp nhiแปu thรกch thแปฉc.
</DOC>
TASK: Hรฃy trแบฃ lแปi cรขu hแปi "Biแบฟn ฤแปi khรญ hแบญu แบฃnh hฦฐแปng nhฦฐ thแบฟ nร o ฤแบฟn nรดng nghiแปp แป cรกc nฦฐแปc ฤang phรกt triแปn?"
INSTRUCTION:
1. Cรขu trแบฃ lแปi khรดng quรก 50 tแปซ.
2. Trรญch dแบซn rรต rร ng tร i liแปu nร o chแปฉa thรดng tin liรชn quan, theo format: [doc-k]"""
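The document-grounded format in this example (numbered <DOC id="doc-k"> blocks, a TASK line, and citation instructions) can also be assembled programmatically. The sketch below is illustrative only; the helper name build_doc_prompt is ours, and it assumes the generate_response helper from the Quickstart.

def build_doc_prompt(documents: list[str], question: str, max_words: int = 50) -> str:
    """Assemble a multi-document QA prompt in the <DOC id="doc-k"> format shown above."""
    doc_blocks = "\n".join(
        f'<DOC id="doc-{i}">\n{doc}\n</DOC>' for i, doc in enumerate(documents, start=1)
    )
    return (
        f"Dưới đây là một số tài liệu / văn bản:\n{doc_blocks}\n"
        f'TASK: Hãy trả lời câu hỏi "{question}"\n'
        "INSTRUCTION:\n"
        f"1. Câu trả lời không quá {max_words} từ.\n"
        "2. Trích dẫn rõ ràng tài liệu nào chứa thông tin liên quan, theo format: [doc-k]"
    )

# Example usage with the helper defined in the Quickstart section:
# answer = generate_response(build_doc_prompt(docs, "Biến đổi khí hậu ảnh hưởng như thế nào ...?"))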
Example 2:
prompt = """Trแบฃ lแปi cรขu hแปi dแปฑa vร o nแปi dung ฤoแบกn vฤn sau:
====
Bรฃo Milton bแบฏt ฤแบงu ฤแป bแป vร o Siesta Key, bang Florida, Mแปน, vแปi sแปฉc giรณ 193 km/h, tฦฐฦกng ฤฦฐฦกng cแบฅp 3 trong thang ฤo bรฃo 5 cแบฅp, vร o khoแบฃng 20h30 ngร y 9/10 (7h30 sรกng 10/10 giแป Hร Nแปi). Sau vร i tiแบฟng cร n quรฉt qua Florida, bรฃo Milton hแบก xuแปng cแบฅp 2 vร tiแบฟp tแปฅc hแบก xuแปng cแบฅp 1 vร o rแบกng sรกng 10/10.
ฤรขy lร cฦกn bรฃo thแปฉ nฤm แป Mแปน vร cฦกn bรฃo thแปฉ ba tแบฅn cรดng bang Florida trong nฤm nay. Trฦฐแปc khi bรฃo Milton ฤแป bแป, Thแปng ฤแปc Florida Ron DeSantis cho biแบฟt รญt nhแบฅt 19 cฦกn lแปc xoรกy ฤรฃ xuแบฅt hiแปn แป Florida vร 116 cแบฃnh bรกo lแปc xoรกy ฤฦฐแปฃc ban bแป khแบฏp bang.
Mฦฐa lแปn xแบฃy ra แป cรกc khu vแปฑc, nhแบฅt lร thร nh phแป St. Petersburg khi hแปฉng chแปu "trแบญn mฦฐa nghรฌn nฤm cรณ mแปt", vแปi lฦฐแปฃng mฦฐa trรบt xuแปng thร nh phแป trong ba giแป tฦฐฦกng ฤฦฐฦกng ba thรกng trong nฤm. Cรกc thร nh phแป McKay Creek, Clearwater Beach vร Temple Terrace cลฉng ghi nhแบญn lฦฐแปฃng mฦฐa lแปn, lแบงn lฦฐแปฃt lร 371 mm, 355 mm vร 344 mm.
====
Yรชu cแบงu cรขu trแบฃ lแปi hoแบทc lร ฤฦฐแปฃc trรญch ra tแปซ ฤoแบกn vฤn, hoแบทc lร 'NO ANSWER' nแบฟu nแปi dung ฤoแบกn vฤn khรดng liรชn quan ฤแบฟn cรขu hแปi.
Cรขu hแปi: Bรฃo Milton mแบกnh nhฦฐ thแบฟ nร o ? Diแป
n ra แป ฤรขu ?
Cรขu trแบฃ lแปi:"""
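Since this prompt constrains the answer to either a span from the passage or the literal string 'NO ANSWER', the model output can be post-processed mechanically. A minimal sketch, assuming the generate_response helper from the Quickstart; the answer_question name is illustrative.

def answer_question(passage: str, question: str) -> str | None:
    """Ask an extractive question; return None when the model reports NO ANSWER."""
    prompt = (
        "Trả lời câu hỏi dựa vào nội dung đoạn văn sau:\n"
        f"====\n{passage}\n====\n"
        "Yêu cầu câu trả lời hoặc là được trích ra từ đoạn văn, hoặc là 'NO ANSWER' "
        "nếu nội dung đoạn văn không liên quan đến câu hỏi.\n"
        f"Câu hỏi: {question}\n"
        "Câu trả lời:"
    )
    answer = generate_response(prompt).strip()
    return None if answer.upper().startswith("NO ANSWER") else answer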
Example 3:
prompt = """Cho vฤn bแบฃn dฦฐแปi ฤรขy:
====
Bรฃo Milton bแบฏt ฤแบงu ฤแป bแป vร o Siesta Key, bang Florida, Mแปน, vแปi sแปฉc giรณ 193 km/h, tฦฐฦกng ฤฦฐฦกng cแบฅp 3 trong thang ฤo bรฃo 5 cแบฅp, vร o khoแบฃng 20h30 ngร y 9/10 (7h30 sรกng 10/10 giแป Hร Nแปi). Sau vร i tiแบฟng cร n quรฉt qua Florida, bรฃo Milton hแบก xuแปng cแบฅp 2 vร tiแบฟp tแปฅc hแบก xuแปng cแบฅp 1 vร o rแบกng sรกng 10/10.
ฤรขy lร cฦกn bรฃo thแปฉ nฤm แป Mแปน vร cฦกn bรฃo thแปฉ ba tแบฅn cรดng bang Florida trong nฤm nay. Trฦฐแปc khi bรฃo Milton ฤแป bแป, Thแปng ฤแปc Florida Ron DeSantis cho biแบฟt รญt nhแบฅt 19 cฦกn lแปc xoรกy ฤรฃ xuแบฅt hiแปn แป Florida vร 116 cแบฃnh bรกo lแปc xoรกy ฤฦฐแปฃc ban bแป khแบฏp bang.
Mฦฐa lแปn xแบฃy ra แป cรกc khu vแปฑc, nhแบฅt lร thร nh phแป St. Petersburg khi hแปฉng chแปu "trแบญn mฦฐa nghรฌn nฤm cรณ mแปt", vแปi lฦฐแปฃng mฦฐa trรบt xuแปng thร nh phแป trong ba giแป tฦฐฦกng ฤฦฐฦกng ba thรกng trong nฤm. Cรกc thร nh phแป McKay Creek, Clearwater Beach vร Temple Terrace cลฉng ghi nhแบญn lฦฐแปฃng mฦฐa lแปn, lแบงn lฦฐแปฃt lร 371 mm, 355 mm vร 344 mm.
====
TASK: ฤแบทt tiรชu ฤแป vร tรณm tแบฏt bร i bรกo trรชn thร nh 1-2 cรขu."""
Benchmarks
VMLU
We evaluate the base and fine-tuned models on the VMLU benchmark (https://vmlu.ai) together with the ViSquad, ViDrop, and ViDialog test sets. Numbers in parentheses are the improvements over the corresponding base models.
| Model | VMLU | ViSquad | ViDrop | ViDialog |
|---|---|---|---|---|
| Llama3.2-1B-Instruct | 37.6 | 70.1 | 29.6 | 33.9 |
| Llama3.2-3B-Instruct | 47.6 | 90.3 | 63.5 | 50.8 |
| Qwen2.5-0.5B-Instruct | 39.1 | 62.5 | 31.5 | 28.0 |
| Qwen2.5-1.5B-Instruct | 48.6 | 86.7 | 54.5 | 39.8 |
| Qwen2.5-3B-Instruct | 52.9 | 88.3 | 72.4 | 54.4 |
| *Our fine-tuned models* | | | | |
| Llama3.2-1B-Instruct-KAI | 50.5 (+12.9) | 88.4 (+18.3) | 71.1 (+41.5) | 50.9 (+17.0) |
| Llama3.2-3B-Instruct-KAI | 58.1 (+10.5) | 93.5 (+3.2) | 81.4 (+17.9) | 67.3 (+16.5) |
| Qwen2.5-0.5B-Instruct-KAI | 49.7 (+10.6) | 87.3 (+24.8) | 62.3 (+30.8) | 39.0 (+11.0) |
| Qwen2.5-1.5B-Instruct-KAI | 57.5 (+8.9) | 93.3 (+6.6) | 76.0 (+21.5) | 54.6 (+14.8) |
| Qwen2.5-3B-Instruct-KAI | 63.5 (+10.6) | 94.2 (+5.9) | 80.9 (+8.5) | 68.5 (+14.1) |
Evaluation on ArenaHard (CohereForAI)
We follow the evaluation method outlined in https://github.com/lmarena/arena-hard-auto to compare our fine-tuned models against other models on the ArenaHard benchmark.
- Baseline model: Qwen/Qwen2.5-7B-Instruct
- Judge: Qwen/Qwen2.5-72B-Instruct
| # | Model | Size (B) | Win (%) | Tie (%) | Lose (%) |
|---|---|---|---|---|---|
| 1 | deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 14 | 59.5 | 4.6 | 35.9 |
| 2 | CohereForAI/aya-expanse-8b | 8 | 55.0 | 4.6 | 40.4 |
| 3 | Qwen/Qwen2.5-14B-Instruct | 14 | 48.7 | 9.1 | 42.2 |
| 4 | kiki-ailab/Qwen2.5-3B-Instruct-KAI | 3 | 38.7 | 4.7 | 56.6 |
| 5 | meta-llama/Llama3.1-8B-Instruct | 8 | 38.6 | 4.9 | 56.5 |
| 6 | CohereForAI/c4ai-command-r7b-12-2024 | 7 | 35.1 | 3.3 | 61.6 |
| 7 | kiki-ailab/Llama3.2-3B-Instruct-KAI | 3 | 35.0 | 4.3 | 60.7 |
| 8 | arcee-ai/Arcee-VyLinh | 3 | 34.8 | 5.4 | 59.8 |
| 9 | kiki-ailab/Qwen2.5-1.5B-Instruct-KAI | 1.5 | 28.9 | 3.9 | 67.2 |
| 10 | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 7 | 23.2 | 2.8 | 74.0 |
| 11 | meta-llama/Llama-3.2-3B-Instruct | 3 | 21.2 | 4.4 | 74.4 |
| 12 | Qwen/Qwen2.5-3B-Instruct | 3 | 18.6 | 5.8 | 75.6 |
| 13 | zaloai/Llama3.2-1B-Instruct-ZAI | 1 | 17.4 | 3.7 | 78.9 |
| 14 | Viet-Mistral/Vistral-7B-Chat | 7 | 17.2 | 3.2 | 79.6 |
| 15 | kiki-ailab/Qwen2.5-0.5B-Instruct-KAI | 0.5 | 10.9 | 2.0 | 87.1 |
| 16 | meta-llama/Llama-3.2-1B-Instruct | 1 | 6.5 | 1.6 | 91.9 |
| 17 | Qwen/Qwen2.5-1.5B-Instruct | 1.5 | 6.4 | 3.0 | 90.6 |
| 18 | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 1.5 | 3.0 | 1.5 | 95.5 |
| 19 | vinai/PhoGPT-4B-Chat | 4 | 1.2 | 2.7 | 96.1 |
| 20 | Qwen/Qwen2.5-0.5B-Instruct | 0.5 | 1.0 | 1.7 | 97.3 |
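For reference, the win/tie/lose columns above are percentages of pairwise judge verdicts against the baseline model and can be tallied roughly as follows. This is a simplified sketch, not the arena-hard-auto implementation; the judgments input and verdict labels are illustrative assumptions.

from collections import Counter

def summarize_verdicts(judgments: list[str]) -> dict[str, float]:
    """Convert per-question judge verdicts ("win" / "tie" / "lose" for the candidate
    model vs. the baseline) into the percentage columns shown above."""
    counts = Counter(judgments)
    total = sum(counts.values())
    return {label: round(100 * counts[label] / total, 1) for label in ("win", "tie", "lose")}

# e.g. summarize_verdicts(["win", "lose", "tie", "win"]) -> {"win": 50.0, "tie": 25.0, "lose": 25.0}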
Disclaimer
- The models may still hallucinate on culturally specific content.
- They focus primarily on Vietnamese language understanding and generation.
- They may not perform optimally in specialized technical domains.
Feedback
We welcome any feedback on these public models. Please send your comments to contact@kilm.ai.