perf: maddubs kernel + nrc=4 multi-row for Q1_0_g128 (3.5-3.75 t/s) 570ff77 verified OpenTransformer committed about 5 hours ago
perf: optimized AVX2 kernel + COM6-inspired matmul dispatch (0.2 -> 3.43 t/s) 165bcc5 verified OpenTransformer committed 10 days ago
perf: optimized AVX2 kernel + COM6-inspired matmul dispatch (0.2 -> 3.43 t/s) 8f4b822 verified OpenTransformer committed 10 days ago
Q1_0_g128 CPU kernel fix + AVX2 SIMD (fork of PrismML-Eng/llama.cpp) 9d7f7ca verified OpenTransformer committed 10 days ago
Q1_0_g128 CPU kernel fix + AVX2 SIMD (fork of PrismML-Eng/llama.cpp) 79d8a89 verified OpenTransformer committed 10 days ago
Q1_0_g128 CPU kernel fix + AVX2 SIMD (fork of PrismML-Eng/llama.cpp) c7a53cd verified OpenTransformer committed 10 days ago
Q1_0_g128 CPU kernel fix + AVX2 SIMD (fork of PrismML-Eng/llama.cpp) 03ba2cd verified OpenTransformer committed 10 days ago