DPO
RLLab/allenai-Dolci-Instruct-DPO-Length-Filtered Viewer • Updated Mar 1 • 146k • 15
RLLab/olmo-3-7b-it-sft Text Generation • 7B • Updated Dec 18, 2025 • 19
allenai/Dolci-Instruct-SFT-No-Tools Viewer • Updated Jan 5 • 1.92M • 196 • 4
RLLab/gemma-3-4b-text-sft Text Generation • 4B • Updated Feb 28 • 7
RL-Dataset
open-r1/DAPO-Math-17k-Processed Viewer • Updated Nov 10, 2025 • 34.8k • 5.34k • 62
DigitalLearningGmbH/MATH-lighteval Viewer • Updated Jan 15, 2025 • 25k • 31.4k • 65
POLARIS-Project/Polaris-Dataset-53K Viewer • Updated Jun 18, 2025 • 53.3k • 967 • 36
RLLab/math-rl Viewer • Updated Nov 25, 2025 • 57.5k • 10