WildSci: Advancing Scientific Reasoning from In-the-Wild Literature • Paper • 2601.05567 • Published Jan 9
Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing • Paper • 2602.04837 • Published Feb 4 • 8
Procedural Generation of Algorithm Discovery Tasks in Machine Learning • Paper • 2603.17863 • Published 9 days ago • 4
What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity • Paper • 2511.15593 • Published Nov 19, 2025 • 59
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance • Paper • 2511.13254 • Published Nov 17, 2025 • 139
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models • Paper • 2504.13367 • Published Apr 17, 2025 • 26
Retrieval Head Mechanistically Explains Long-Context Factuality • Paper • 2404.15574 • Published Apr 24, 2024 • 3
MAF: Multi-Aspect Feedback for Improving Reasoning in Large Language Models • Paper • 2310.12426 • Published Oct 19, 2023 • 1
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies • Paper • 2308.03188 • Published Aug 6, 2023 • 2