Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 11 days ago • 8
Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models Paper • 2603.22782 • Published 7 days ago • 8
MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies Paper • 2603.24649 • Published 5 days ago • 20
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 4 days ago • 38
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published 4 days ago • 110
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published 4 days ago • 133
Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition Paper • 2603.13904 • Published 16 days ago • 3
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes Paper • 2603.25562 • Published 4 days ago • 5
BioVITA: Biological Dataset, Model, and Benchmark for Visual-Textual-Acoustic Alignment Paper • 2603.23883 • Published 6 days ago • 4
AVO: Agentic Variation Operators for Autonomous Evolutionary Search Paper • 2603.24517 • Published 5 days ago • 6
Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models Paper • 2603.24844 • Published 5 days ago • 7
MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution Paper • 2603.18718 • Published 12 days ago • 8
MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models Paper • 2603.25744 • Published 4 days ago • 9
AVControl: Efficient Framework for Training Audio-Visual Controls Paper • 2603.24793 • Published 5 days ago • 21
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Paper • 2603.25319 • Published 4 days ago • 30
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published 25 days ago • 39