Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization Paper • 2604.09574 • Published Feb 24 • 27
StepORLM: A Self-Evolving Framework With Generative Process Supervision For Operations Research Language Models Paper • 2509.22558 • Published Sep 26, 2025 • 4
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering Paper • 2604.08224 • Published 8 days ago • 49
PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval Paper • 2603.01493 • Published Mar 2 • 20