FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow Paper • 2603.19598 • Published 8 days ago • 32
Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models Paper • 2603.18002 • Published 10 days ago • 13
The Trinity of Consistency as a Defining Principle for General World Models Paper • 2602.23152 • Published 30 days ago • 200
DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models Paper • 2512.15713 • Published Dec 17, 2025 • 18
Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO Paper • 2602.06422 • Published Feb 6 • 46
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 110
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Paper • 2512.13586 • Published Dec 15, 2025 • 93
HOIverse: A Synthetic Scene Graph Dataset With Human Object Interactions Paper • 2506.19639 • Published Jun 24, 2025 • 2
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 55
Robix: A Unified Model for Robot Interaction, Reasoning and Planning Paper • 2509.01106 • Published Sep 1, 2025 • 52