SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning Paper • 2512.24330 • Published 24 days ago • 35
sensenova/SenseNova-SI-1.2-InternVL3-8B Image-Text-to-Text • 8B • Updated Dec 10, 2025 • 3.84k • 10
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 64
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published Nov 17, 2025 • 47
NEO1_0 Collection From Pixels to Words -- Towards Native Vision-Language Primitives at Scale • 7 items • Updated Oct 17, 2025 • 7
SenseNova-SI Collection Scaling Spatial Intelligence with Multimodal Foundation Models • 10 items • Updated 12 days ago • 15
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 186
Running on CPU Upgrade Featured 2.91k The Smol Training Playbook 📚 2.91k The secrets to building world-class LLMs
Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning Paper • 2510.11027 • Published Oct 13, 2025 • 22
VR-Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning Paper • 2510.10518 • Published Oct 12, 2025 • 19
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 166
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16, 2025 • 67
CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving Paper • 2510.07944 • Published Oct 9, 2025 • 25
InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue Paper • 2510.13747 • Published Oct 15, 2025 • 30
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints Paper • 2510.08565 • Published Oct 9, 2025 • 20
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6, 2025 • 119
BaseReward: A Strong Baseline for Multimodal Reward Model Paper • 2509.16127 • Published Sep 19, 2025 • 21