Barney Greenway

McG-221

AI & ML interests

LLMs on Apple Silicon

Recent Activity

reacted to nightmedia's post with 🚀 7 minutes ago
Qwen3.5 Performance Metrics

With the 3.5 architecture, a lot of the old quanting methods don't work as before. I noticed this when benchmarking Deckard(qx) quants and, by mistake, ran a q8 that scored better. That only happens if the qx is bad, and it was: enhancing layers just because they look interesting doesn't work anymore. Until I get a clear understanding of the architecture, I will publish mxfp4 and mxfp8 quants of the 3.5 models, which seem very stable and perform well.

I will start posting here the metrics I gather from the series, starting with the smallest. If I have numbers from previous or similar models, I will post them for comparison.

Qwen3.5-0.8B

```brainwaves
quant    arc   arc/e boolq hswag obkqa piqa  wino
mxfp8    0.351,0.501,0.733,0.462,0.348,0.682,0.573
mxfp4    0.339,0.489,0.738,0.433,0.330,0.672,0.553

Old model performance
Qwen3-0.6B
bf16     0.298,0.354,0.378,0.415,0.344,0.649,0.534
q8-hi    0.296,0.355,0.378,0.416,0.348,0.652,0.529
q8       0.299,0.354,0.378,0.414,0.346,0.650,0.535
q6-hi    0.301,0.356,0.378,0.415,0.350,0.651,0.541
q6       0.300,0.367,0.378,0.416,0.344,0.647,0.524
mxfp4    0.286,0.364,0.609,0.404,0.316,0.626,0.531

Quant    Perplexity      Peak memory
mxfp8    6.611 ± 0.049   7.65 GB
mxfp4    7.455 ± 0.057   6.33 GB
```

Detailed metrics by model:

https://huggingface.co/nightmedia/Qwen3.5-0.8B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-2B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-4B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-9B-mxfp8-mlx
https://huggingface.co/nightmedia/Qwen3.5-27B-Text
https://huggingface.co/nightmedia/Qwen3.5-122B-A10B-Text-mxfp4-mlx

More metrics coming soon.

I am running these on my Mac, an M4 Max with 128 GB RAM. Some performance numbers, like tokens/second, reflect the performance on my box.

This post will be updated with every model that gets tested. The larger models take hours, and the 27B takes a couple of days, so it will be a long process.

-G
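One way to read the table above at a glance is to collapse each row into a single number and to look at per-task differences between two quants. The sketch below does that with the scores copied from the post. The unweighted mean is an assumption made here for illustration, not a metric the post's author reports, and the names `SCORES`, `mean_score`, and `per_task_delta` are hypothetical helpers.

```python
# Column order as given in the post's table header.
TASKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

# Scores copied from the post (a subset of rows, for brevity).
SCORES = {
    "Qwen3.5-0.8B mxfp8": [0.351, 0.501, 0.733, 0.462, 0.348, 0.682, 0.573],
    "Qwen3.5-0.8B mxfp4": [0.339, 0.489, 0.738, 0.433, 0.330, 0.672, 0.553],
    "Qwen3-0.6B bf16": [0.298, 0.354, 0.378, 0.415, 0.344, 0.649, 0.534],
    "Qwen3-0.6B mxfp4": [0.286, 0.364, 0.609, 0.404, 0.316, 0.626, 0.531],
}


def mean_score(scores):
    """Unweighted mean across tasks (an illustrative summary, not an official metric)."""
    return sum(scores) / len(scores)


def per_task_delta(a, b):
    """Per-task difference a - b, rounded to the table's three decimals."""
    return {task: round(x - y, 3) for task, x, y in zip(TASKS, a, b)}


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(f"{name:22s} mean={mean_score(scores):.3f}")
    delta = per_task_delta(SCORES["Qwen3.5-0.8B mxfp8"], SCORES["Qwen3.5-0.8B mxfp4"])
    print("mxfp8 - mxfp4:", delta)
```

A simple mean hides task-level differences (for example, mxfp4 is slightly ahead of mxfp8 on boolq in the table), so the per-task deltas are worth reading alongside it.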
updated a model 9 minutes ago
McG-221/Sketch-Cydonia-mlx-8Bit
published a model 9 minutes ago
McG-221/Sketch-Cydonia-mlx-8Bit

Organizations

MLX Community