One Model to Critique Them All: Rewarding Agentic Tool-Use via Efficient Reasoning
- One Model to Critique Them All: Rewarding Agentic Tool-Use via Efficient Reasoning
  Paper • 2510.26167 • Published • 1
- RioLee/ToolRM-Qwen3-4B-Thinking-2507
  Text Generation • 4B • Updated • 7
- RioLee/ToolPref-Pairwise-30K
  Viewer • Updated • 60k • 66 • 2
- RioLee/TRBench-BFCL
  Viewer • Updated • 11.9k • 23 • 1