mit-han-lab/svdq-int4-flux.1-depth-dev

FLUX.1-Depth-dev

Lmxyy committed on Feb 11

Commit 269aa86 · verified · 1 Parent(s): b7ae6b5

Update README.md

Files changed (1)

README.md +1 -1

@@ -37,7 +37,7 @@ library_name: diffusers
   <a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
 </div>
-![teaser](https://github.com/mit-han-lab/nunchaku/blob/main/app/flux.1/depth_canny/assets/demo.jpg)
+![teaser](https://raw.githubusercontent.com/mit-han-lab/nunchaku/refs/heads/main/app/flux.1/depth_canny/assets/demo.jpg)
 SVDQuant is a post-training quantization technique for 4-bit weights and activations that well maintains visual fidelity. On 12B FLUX.1-dev, it achieves 3.6× memory reduction compared to the BF16 model. By eliminating CPU offloading, it offers 8.7× speedup over the 16-bit model when on a 16GB laptop 4090 GPU, 3× faster than the NF4 W4A16 baseline. On PixArt-∑, it demonstrates significantly superior visual quality over other W4A4 or even W4A8 baselines. "E2E" means the end-to-end latency including the text encoder and VAE decoder.
 ## Method
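
The one-line change above swaps a `github.com/.../blob/...` page URL for a `raw.githubusercontent.com/.../refs/heads/...` URL, presumably so the teaser image renders directly in the model card. A minimal sketch of the same rewrite follows. The helper name `blob_to_raw` is hypothetical, and it assumes the ref segment after `blob/` is a branch name with no slashes (as with `main` here).

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_raw(url: str) -> str:
    """Rewrite a github.com 'blob' page URL into a raw.githubusercontent.com URL.

    Hypothetical helper for illustration; assumes the ref after 'blob/'
    is a single-segment branch name such as 'main'.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url  # not a blob URL; leave it untouched
    owner, repo, _blob, branch, *path = segments
    raw_path = "/".join([owner, repo, "refs", "heads", branch, *path])
    return urlunsplit(("https", "raw.githubusercontent.com", "/" + raw_path, "", ""))


old = "https://github.com/mit-han-lab/nunchaku/blob/main/app/flux.1/depth_canny/assets/demo.jpg"
print(blob_to_raw(old))
```

Applied to the removed line's URL, this should produce the URL on the added line. URLs that do not match the blob pattern are returned unchanged.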