Attention computation
#4 opened by Serpient
While trying it out, I noticed the model supports three attention implementations: flex attention, SDPA, and eager. It seems eager is supposed to be the default option? But when I generate using the demo, the default is actually SDPA, and when I set `config._attn_implementation` to `"eager"`, the generation output becomes gibberish.
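For reference, here is a minimal sketch of how I'm switching implementations (the model name is a placeholder for the demo checkpoint; in recent `transformers` versions, passing `attn_implementation` to `from_pretrained` is the documented way to select one, rather than mutating the private `config._attn_implementation` attribute):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "..."  # placeholder for the demo checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    attn_implementation="eager",  # also tried "sdpa" (the observed default)
)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

With `"sdpa"` the output looks fine; with `"eager"` the same prompt produces gibberish.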
Thanks for the feedback. We'll look into it.