train_codealpacapy_456_1765209082

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4487
  • Num Input Tokens Seen: 24973864
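
Since the framework versions below list PEFT and the repository is published as an adapter, using the model most likely means attaching the adapter to the base model. The following is a minimal sketch, assuming the adapter repo id rbelanec/train_codealpacapy_456_1765209082 and access to the gated base model; it is not an official usage example.

```python
# Minimal sketch: attach the PEFT adapter to the base model.
# Assumes the adapter repo id "rbelanec/train_codealpacapy_456_1765209082"
# and that you have access to the gated Meta-Llama-3 base checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_456_1765209082"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Ask for a small code snippet, in line with the codealpacapy training data.
prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```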

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
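
For reference, a TrainingArguments configuration that mirrors these values might look roughly like the sketch below. The dataset preparation, data collator, and LoRA/PEFT configuration are omitted, and output_dir is an illustrative assumption rather than the actual training setup.

```python
# Rough sketch of a TrainingArguments setup matching the hyperparameters above.
# Dataset loading, the data collator, and the LoRA/PEFT config are omitted;
# output_dir is an illustrative assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_456_1765209082",
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```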

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.509         | 1.0   | 1908  | 0.4619          | 1246832           |
| 0.5189        | 2.0   | 3816  | 0.4605          | 2497936           |
| 0.4566        | 3.0   | 5724  | 0.4579          | 3743760           |
| 0.4249        | 4.0   | 7632  | 0.4500          | 4991472           |
| 1.0288        | 5.0   | 9540  | 0.4567          | 6239608           |
| 0.4128        | 6.0   | 11448 | 0.4487          | 7485248           |
| 0.6169        | 7.0   | 13356 | 0.4506          | 8733024           |
| 0.6135        | 8.0   | 15264 | 0.4515          | 9983720           |
| 0.3322        | 9.0   | 17172 | 0.4538          | 11229792          |
| 0.37          | 10.0  | 19080 | 0.4565          | 12476552          |
| 0.3672        | 11.0  | 20988 | 0.4624          | 13725560          |
| 0.416         | 12.0  | 22896 | 0.4656          | 14977976          |
| 0.3909        | 13.0  | 24804 | 0.4721          | 16225896          |
| 0.4212        | 14.0  | 26712 | 0.4780          | 17477224          |
| 0.3323        | 15.0  | 28620 | 0.4943          | 18726216          |
| 0.2485        | 16.0  | 30528 | 0.5011          | 19973408          |
| 0.334         | 17.0  | 32436 | 0.5096          | 21226656          |
| 0.327         | 18.0  | 34344 | 0.5130          | 22472696          |
| 0.3437        | 19.0  | 36252 | 0.5145          | 23722376          |
| 0.5016        | 20.0  | 38160 | 0.5146          | 24973864          |
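
Note that the reported evaluation loss of 0.4487 corresponds to the epoch-6 checkpoint; validation loss rises in later epochs, reaching 0.5146 at epoch 20.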

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1