inference-optimization/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head (8B parameters, updated Dec 11, 2025)