Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers • Paper • 2601.17367 • Published 25 days ago • 33
baidu/ERNIE-4.5-VL-28B-A3B-Thinking • Image-Text-to-Text • 30B • Updated about 24 hours ago • 1.08k • 520