Yale-ROSE/Qwen3-4B-dimacs_cube-sft_gpt-oss-120b-dpo_gpt-oss-120b_reasoning_grpo-v2
Text Generation • 4B
Automated Reasoning, Reinforcement Learning, Neuro-Symbolic AI