MAVIS: A Benchmark for Multimodal Source Attribution in Long-form Visual Question Answering • Paper • 2511.12142 • Published Nov 15, 2025
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 345k • 1.57k
Read-only Prompt Optimization for Vision-Language Few-shot Learning • Paper • 2308.14960 • Published Aug 29, 2023 • 3
Is a Peeled Apple Still Red? Evaluating LLMs' Ability for Conceptual Combination with Property Type • Paper • 2502.06086 • Published Feb 10, 2025 • 1