Paper: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs
Toloka
company
Verified
AI & ML interests
Human-expert data for frontier reasoning, safety and agentic AI
Organization Card
Hey, this is Toloka!
models 4
toloka/prompts_reward_model
Text Classification • 82.1M • Updated • 18
toloka/gpt2-large-supervised-prompt-writing
Text Generation • 0.8B • Updated • 1.01k
toloka/gpt2-large-rl-prompt-writing
Text Generation • 0.8B • Updated • 7 • 3
toloka/t5-large-for-text-aggregation
Summarization • Updated • 8 • 7
datasets 13
toloka/HomER
Viewer • Updated • 63 • 23
toloka/mu-math
Viewer • Updated • 1.08k • 13 • 24
toloka/u-math
Viewer • Updated • 1.1k • 207 • 26
toloka/vist
Viewer • Updated • 39.3k • 151
toloka/VOX-DUB
Viewer • Updated • 7.58k • 146 • 11
toloka/JEEM
Viewer • Updated • 2.2k • 43 • 14
toloka/beemo
Viewer • Updated • 2.19k • 310 • 19
toloka/CLESC
Viewer • Updated • 500 • 7 • 2
toloka/VoxDIY-RusNews
Updated • 60 • 3
toloka/CrowdSpeech
Updated • 75 • 5