Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Text Generation • 28B • Updated 1 day ago • 260 • 24
ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning Paper • 2602.21534 • Published 4 days ago • 22
PyVision-RL: Forging Open Agentic Vision Models via RL Paper • 2602.20739 • Published 5 days ago • 28
EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots Paper • 2602.18071 • Published 9 days ago • 22
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI 10 days ago • 469
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published 18 days ago • 52
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models Paper • 2602.12036 • Published 17 days ago • 98
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models Paper • 2602.10224 • Published 19 days ago • 19
Article The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+ 26 days ago • 50
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation Paper • 2601.17737 • Published Jan 25 • 55
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective Jan 27 • 59