FlowSlider: Training-Free Continuous Image Editing via Fidelity-Steering Decomposition • Paper • 2604.02088 • Published 3 days ago • 5
UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving • Paper • 2604.02190 • Published 3 days ago • 17
GPA: Learning GUI Process Automation from Demonstrations • Paper • 2604.01676 • Published 3 days ago • 9
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook • Paper • 2604.02029 • Published 3 days ago • 117
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark • Paper • 2603.26017 • Published 9 days ago • 29
HippoCamp: Benchmarking Contextual Agents on Personal Computers • Paper • 2604.01221 • Published 4 days ago • 25
Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding • Paper • 2604.00528 • Published 4 days ago • 7
Embarrassingly Simple Self-Distillation Improves Code Generation • Paper • 2604.01193 • Published 4 days ago • 24
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers • Paper • 2603.24414 • Published 11 days ago • 175
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation • Paper • 2603.29029 • Published 5 days ago • 13
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization • Paper • 2603.29664 • Published 5 days ago • 44
Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells • Paper • 2603.25240 • Published 10 days ago • 75
Learn2Fold: Structured Origami Generation with World Model Planning • Paper • 2603.29585 • Published Feb 2 • 15
LongCat-Next: Lexicalizing Modalities as Discrete Tokens • Paper • 2603.27538 • Published 7 days ago • 132
On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers • Paper • 2603.28762 • Published 6 days ago • 24
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis • Paper • 2603.29620 • Published 5 days ago • 45