Finetune `nvidia/nemotron-ocr-v1` recognition model
#7 · opened by johnlockejrr
Is it possible to finetune the `nvidia/nemotron-ocr-v1` recognition model on new data/languages? Is there any training code released? Thank you!
OK, I wrote a finetuning script from scratch.
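The script is too long to paste here, but the behavior visible in the log below (saving on each new best CER, `--patience 8`) is just a standard PyTorch loop with best-CER checkpointing and patience-based early stopping. A minimal sketch of that pattern, not the actual script; `evaluate_cer` is a placeholder for whatever your decoding + CER routine is, and the loss access assumes the model's forward pass returns it:

```python
import torch

def finetune(model, train_loader, val_loader, evaluate_cer, *,
             epochs=50, lr=1e-4, weight_decay=1e-4, patience=8,
             best_path="checkpoints_hebrew_fixed/best.pt"):
    """Keep the checkpoint with the lowest validation CER and stop
    once `patience` epochs pass without improvement."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr,
                                  weight_decay=weight_decay)
    best_cer, stale = float("inf"), 0
    for epoch in range(1, epochs + 1):
        model.train()
        for batch in train_loader:
            optimizer.zero_grad()
            loss = model(**batch).loss  # assumes the forward pass returns a loss
            loss.backward()
            optimizer.step()
        val_cer = evaluate_cer(model, val_loader)  # caller-supplied metric
        if val_cer < best_cer:  # the "New best CER ... saving" lines below
            best_cer, stale = val_cer, 0
            torch.save(model.state_dict(), best_path)
        else:
            stale += 1
            if stale >= patience:  # matches --patience 8 in the run below
                break
    return best_cer
```

Training log so far: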
```
(nemotron-ocr-v1) incognito@DESKTOP-H1BS9PO:~/nemotron-ocr-v1$ python train_recognizer_detector_fixed.py --train_file heb_synth_pangoline-xml_train.json --val_file heb_synth_pangoline-xml_val.json --image_dir heb_synth_pangoline-xml --model_dir checkpoints --output_dir checkpoints_hebrew_fixed --epochs 50 --learning_rate 1e-4 --weight_decay 1e-4 --log_dir runs/hebrew_fixed --patience 8
INFO:__main__:Loaded 63633 pages from heb_synth_pangoline-xml_train.json
INFO:__main__:Loaded 7071 pages from heb_synth_pangoline-xml_val.json
Epoch 1 [train]: 100%|█████████████████████████████████████████████| 63633/63633 [3:00:23<00:00, 5.88it/s, loss=0.8506]
Epoch 1 [val]: 100%|████████████████████████████████████████| 7071/7071 [23:10<00:00, 5.08it/s, cer=0.5669, wer=0.8741]
INFO:__main__:Epoch 1: train_loss=1.3973, val_loss=1.9529, val_cer=0.5858, val_wer=0.8783
INFO:__main__:New best CER 0.5858 → saving best model
Epoch 2 [train]: 100%|█████████████████████████████████████████████| 63633/63633 [3:03:52<00:00, 5.77it/s, loss=0.8970]
Epoch 2 [val]: 100%|████████████████████████████████████████| 7071/7071 [23:21<00:00, 5.05it/s, cer=0.5641, wer=0.8813]
INFO:__main__:Epoch 2: train_loss=0.7032, val_loss=2.0183, val_cer=0.5598, val_wer=0.8109
INFO:__main__:New best CER 0.5598 → saving best model
Epoch 3 [train]: 100%|█████████████████████████████████████████████| 63633/63633 [3:04:01<00:00, 5.76it/s, loss=0.1227]
Epoch 3 [val]: 100%|████████████████████████████████████████| 7071/7071 [23:20<00:00, 5.05it/s, cer=0.5555, wer=0.8669]
INFO:__main__:Epoch 3: train_loss=0.4435, val_loss=2.3166, val_cer=0.5485, val_wer=0.7858
INFO:__main__:New best CER 0.5485 → saving best model
Epoch 4 [train]: 100%|█████████████████████████████████████████████| 63633/63633 [3:04:13<00:00, 5.76it/s, loss=0.3979]
Epoch 4 [val]: 100%|████████████████████████████████████████| 7071/7071 [23:24<00:00, 5.03it/s, cer=0.5583, wer=0.8489]
INFO:__main__:Epoch 4: train_loss=0.3155, val_loss=2.4229, val_cer=0.5412, val_wer=0.7653
INFO:__main__:New best CER 0.5412 → saving best model
Epoch 5 [train]: 100%|█████████████████████████████████████████████| 63633/63633 [3:07:39<00:00, 5.65it/s, loss=0.1290]
Epoch 5 [val]: 100%|████████████████████████████████████████| 7071/7071 [23:54<00:00, 4.93it/s, cer=0.5512, wer=0.8525]
INFO:__main__:Epoch 5: train_loss=0.2401, val_loss=1.9812, val_cer=0.5375, val_wer=0.7490
INFO:__main__:New best CER 0.5375 → saving best model
...
Epoch 12 [train]: 100%|████████████████████████████████████████████| 63633/63633 [3:09:34<00:00, 5.59it/s, loss=0.0143]
Epoch 12 [val]: 100%|███████████████████████████████████████| 7071/7071 [24:22<00:00, 4.84it/s, cer=0.4681, wer=0.6547]
INFO:__main__:Epoch 12: train_loss=0.0805, val_loss=1.8426, val_cer=0.5065, val_wer=0.6866
INFO:__main__:New best CER 0.5065 → saving best model
Epoch 13 [train]:  29%|█████████████                                | 18437/63633 [55:19<2:18:51, 5.42it/s, loss=0.1313]
```
Still training.
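For reference, the `val_cer`/`val_wer` numbers above are character and word error rates over the validation lines. If you want to compute the same metrics on your own outputs, one easy option (an assumption on my side; the script may compute edit distance by hand) is the `jiwer` package:

```python
import jiwer

# Reference (ground-truth) and hypothesis (model output) transcriptions
# for a batch of validation lines. Hypothetical Hebrew example strings.
references = ["שלום עולם", "טקסט לדוגמה"]
hypotheses = ["שלום עולט", "טקסט לדוגמה"]

# jiwer aggregates edit distance over the whole list, so these are
# corpus-level rates rather than an average of per-line rates.
cer = jiwer.cer(references, hypotheses)
wer = jiwer.wer(references, hypotheses)
print(f"CER={cer:.4f}, WER={wer:.4f}")
```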