embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead Image-Text-to-Text • 2B • Updated 5 days ago • 1.09k • 7
FlashHead benchmarks for Llama 3.2, Gemma 3, and Qwen3 are now on embedl/Edge-Inference-Benchmarks! These are some of the models used in the FlashHead paper, now easier to explore and compare interactively.

Jetson AGX Thor (tok/s, batch=1):
- Llama-3.2-1B: 77 → 285 (FlashHead+W4A16, 3.7x)
- Llama-3.2-3B: 34 → 112 (3.3x)
- Gemma-3-1B: 79 → 153 (1.9x)
- Qwen3-1.7B: 49 → 189 (3.8x)
- Qwen3-0.6B: 140 → 177 (1.3x)

Accuracy matches baseline on MMLU-Pro, IFEval, BBH, TruthfulQA, GSM8K.
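The reported speedup factors follow directly from the before/after throughput numbers in the post. A minimal sketch that recomputes them from that table (the dict names are illustrative, not part of any benchmark API; small differences from the quoted factors can arise from rounding of the published tok/s values):

```python
# Throughput (tok/s, batch=1) on Jetson AGX Thor, copied from the post above.
baseline = {
    "Llama-3.2-1B": 77,
    "Llama-3.2-3B": 34,
    "Gemma-3-1B": 79,
    "Qwen3-1.7B": 49,
    "Qwen3-0.6B": 140,
}
flashhead = {
    "Llama-3.2-1B": 285,
    "Llama-3.2-3B": 112,
    "Gemma-3-1B": 153,
    "Qwen3-1.7B": 189,
    "Qwen3-0.6B": 177,
}

# Speedup = FlashHead tok/s divided by baseline tok/s, rounded to one decimal.
speedups = {m: round(flashhead[m] / baseline[m], 1) for m in baseline}
for model, s in speedups.items():
    print(f"{model}: {s}x")
```

Running this reproduces the roughly 1.3x to 3.8x range quoted in the post, with the largest gains on the models where the output head dominates decode time.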