Chandra-ocr inference
Hello everyone, I am currently testing chandra-ocr as described in this demo: https://huggingface.co/spaces/prithivMLmods/Multimodal-OCR3/blob/main/app.py. I am really impressed by the model's quality on ID documents in Bulgarian, which is a difficult if not impossible task for many OCR models (I ran the same tests with DeepSeek-OCR, for example).

The problem is that I am testing on an L4 VM, and the best time I have achieved on a single image is 27 s, which will not work for my project. I stumbled upon the GGUF versions, but I couldn't get them working with llama.cpp; I think the main issue is that Qwen3-VL is not supported yet. Has anyone managed to run them, and if so, how? Also, should I expect a lower quantization to run faster, or will it only lower VRAM usage? I really hope I can reduce the inference time, because this is the best OCR model I have tried so far, and I have tried many. Thanks :)
Hello @gaspachoto,
Llama.cpp supports Qwen3-VL. Please check your version, upgrade, and try again.
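For reference, here is a minimal sketch of building a recent llama.cpp with CUDA and running a vision GGUF through the multimodal CLI. The model and mmproj filenames below are placeholders; substitute whichever chandra-ocr quantization you downloaded, along with its matching mmproj file:

```bash
# Build a recent llama.cpp with CUDA enabled; older binaries predate
# Qwen3-VL support and will fail to load these GGUFs.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Confirm which build you are actually running.
./build/bin/llama-cli --version

# Run the model with the multimodal CLI. The .gguf filenames are
# placeholders for the chandra-ocr weights and their mmproj file.
./build/bin/llama-mtmd-cli \
  -m chandra-ocr-Q8_0.gguf \
  --mmproj mmproj-chandra-ocr-f16.gguf \
  --image id_card.png \
  -ngl 99 \
  -p "Transcribe all text in this image."
```

On the quantization question: a lower-bit quant mainly shrinks VRAM, though since token generation is usually memory-bandwidth bound it can also speed up decoding somewhat. It does little for the image-encoding stage, so don't expect quantization alone to close the gap from 27 s.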
Thank you.
