Collection of Quantized Models for MoE
Krishna Teja Chitty-Venkata
AI & ML interests
LLM Optimization, Neural Architecture Search, Quantization, Pruning
Recent Activity
Updated a model 1 day ago: inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_6.5-bits
Updated a model 1 day ago: inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_6.25-bits
Updated a model 1 day ago: inference-optimization/Meta-Llama-3.1-8B-Instruct-NVFP4-FP8-Dynamic_6.0-bits