Models with that certain something. Non-exhaustive list, no particular order.
Barney Greenway
McG-221
AI & ML interests
LLMs on Apple Silicon
Recent Activity
reacted to nightmedia's post with 🚀 7 minutes ago
Qwen3.5 Performance Metrics
With the 3.5 architecture, many of the old quantization methods no longer work as they used to. I noticed this when benchmarking Deckard (qx) quants: by mistake I ran a q8 that scored better, which only happens when the qx is bad, and it was. Enhancing layers just because they look interesting doesn't work anymore, so until I have a clear understanding of the architecture, I will publish mxfp4 and mxfp8 quants of the 3.5 models, which seem very stable and performant.
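For context on what mxfp4/mxfp8 do: these are block-scaled ("microscaling") formats, where a small run of elements shares one power-of-two scale and each element is stored in a tiny low-precision type. The sketch below is a simplified NumPy illustration of that shared-scale idea, using a symmetric integer grid as a stand-in for the real FP4/FP8 element encodings; it is not the actual MXFP spec or the MLX implementation.

```python
import numpy as np

def mx_quantize(x, block=32, bits=4):
    """Block-scaled quantization: each run of `block` values shares one
    power-of-two scale; elements land on a small symmetric integer grid.
    (Real MXFP4/MXFP8 elements are tiny floats, not integers -- this is a
    simplified stand-in for the shared-scale idea.)"""
    x = np.asarray(x, dtype=np.float64)
    pad = (-x.size) % block
    blocks = np.pad(x, (0, pad)).reshape(-1, block)
    maxabs = np.max(np.abs(blocks), axis=1, keepdims=True)
    maxabs[maxabs == 0] = 1.0
    qmax = 2 ** (bits - 1) - 1                      # e.g. +/-7 at 4 bits
    scale = 2.0 ** np.ceil(np.log2(maxabs / qmax))  # shared per-block scale
    q = np.clip(np.round(blocks / scale), -qmax, qmax)
    return q, scale, pad

def mx_dequantize(q, scale, pad):
    out = (q * scale).reshape(-1)
    return out[: out.size - pad] if pad else out

rng = np.random.default_rng(0)
x = rng.normal(size=256)
errs = {}
for bits in (8, 4):
    q, scale, pad = mx_quantize(x, bits=bits)
    errs[bits] = float(np.mean(np.abs(mx_dequantize(q, scale, pad) - x)))
    print(f"{bits}-bit block quant, mean abs error: {errs[bits]:.4f}")
```

Dropping from 8 to 4 element bits shrinks the grid from ±127 to ±7 steps, so the reconstruction error grows by roughly 16× in this toy setup, which is the memory-for-accuracy trade visible in the mxfp4-vs-mxfp8 rows below.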
I will start posting the metrics I gather from the series here, starting with the smallest model. Where I have numbers from previous or similar models, I will post them for comparison.
Qwen3.5-0.8B
```brainwaves
quant  arc    arc/e  boolq  hswag  obkqa  piqa   wino
mxfp8  0.351  0.501  0.733  0.462  0.348  0.682  0.573
mxfp4  0.339  0.489  0.738  0.433  0.330  0.672  0.553

Old model performance: Qwen3-0.6B
quant  arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16   0.298  0.354  0.378  0.415  0.344  0.649  0.534
q8-hi  0.296  0.355  0.378  0.416  0.348  0.652  0.529
q8     0.299  0.354  0.378  0.414  0.346  0.650  0.535
q6-hi  0.301  0.356  0.378  0.415  0.350  0.651  0.541
q6     0.300  0.367  0.378  0.416  0.344  0.647  0.524
mxfp4  0.286  0.364  0.609  0.404  0.316  0.626  0.531

quant  perplexity     peak memory
mxfp8  6.611 ± 0.049  7.65 GB
mxfp4  7.455 ± 0.057  6.33 GB
```
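For readers wondering where the "ppl ± err" shape of the perplexity rows comes from: perplexity is the exponential of the mean per-token negative log-likelihood, and the error bar is typically the standard error of that mean propagated through exp() via the delta method. A minimal sketch, assuming that convention (the post doesn't say which tool produced the numbers):

```python
import math
import statistics

def perplexity_with_error(nlls):
    """Perplexity and an error bar from per-token negative
    log-likelihoods (in nats). The error bar is the standard error of
    the mean NLL, pushed through exp() via the delta method."""
    mean_nll = statistics.fmean(nlls)
    sem = statistics.stdev(nlls) / math.sqrt(len(nlls))
    ppl = math.exp(mean_nll)
    return ppl, ppl * sem

# Synthetic per-token NLLs, just to show the shape of the output.
nlls = [1.7, 2.1, 1.9, 2.3, 1.8, 2.0, 1.95, 2.05]
ppl, err = perplexity_with_error(nlls)
print(f"perplexity: {ppl:.3f} ± {err:.3f}")
```

This is also why a lower mean NLL (better model fit) shows up multiplicatively in perplexity: the mxfp8-vs-mxfp4 gap above is small in log space but visible once exponentiated.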
Detailed metrics by model
https://huggingface.co/nightmedia/Qwen3.5-0.8B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-2B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-4B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-9B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-27B-Text
https://huggingface.co/nightmedia/Qwen3.5-122B-A10B-Text-mxfp4-mlx
More metrics coming soon.
I am running these on my Mac, an M4 Max with 128 GB RAM, so performance numbers like tokens/second reflect this machine.
This post will be updated with every model that gets tested. The larger models take hours, and the 27B a couple of days, so it will be a long process.
-G

updated a model 9 minutes ago
McG-221/Sketch-Cydonia-mlx-8Bit

published a model 9 minutes ago
McG-221/Sketch-Cydonia-mlx-8Bit