ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents Paper • 2603.18815 • Published 2 days ago • 6
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 2 days ago • 41
AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents Paper • 2603.16496 • Published 4 days ago • 11
Look Before Acting: Enhancing Vision Foundation Representations for Vision-Language-Action Models Paper • 2603.15618 • Published 5 days ago • 20
Alignment Makes Language Models Normative, Not Descriptive Paper • 2603.17218 • Published 4 days ago • 39
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 4 days ago • 115
Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange Paper • 2603.14312 • Published 6 days ago • 5
Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents Paper • 2603.12634 • Published 9 days ago • 6
Attention Sinks Are Provably Necessary in Softmax Transformers: Evidence from Trigger-Conditional Tasks Paper • 2603.11487 • Published 10 days ago • 2