Multimodal - a mapuna Collection

Multimodal

updated 12 days ago

ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems

Paper • 2503.20756 • Published Mar 26, 2025 • 7

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Paper • 2505.09568 • Published May 14, 2025 • 99

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25, 2025 • 217

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22, 2025 • 152

CaptionQA: Is Your Caption as Useful as the Image Itself?

Paper • 2511.21025 • Published Nov 26, 2025 • 28

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published Feb 12 • 61

PyVision-RL: Forging Open Agentic Vision Models via RL

Paper • 2602.20739 • Published Feb 24 • 31

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Paper • 2603.09206 • Published 22 days ago • 52

Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding

Paper • 2603.13366 • Published 22 days ago • 94