TRIP-Bench: A Benchmark for Long-Horizon Interactive Agents in Real-World Scenarios Paper • 2602.01675 • Published 5 days ago • 9
Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction Paper • 2601.05107 • Published 30 days ago • 24
RECAST: Expanding the Boundaries of LLMs' Complex Instruction Following with Multi-Constraint Data Paper • 2505.19030 • Published May 25, 2025 • 1