lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step100 Text Generation • 196k • Updated 2 days ago • 176
lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step50 Text Generation • 196k • Updated 2 days ago • 135
lihaoxin2020/qwen3-4b-refiner-gpt54-instance-rubric-gpt54-grpo-step50 Text Generation • 196k • Updated 4 days ago • 313
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-resume-step100 Text Generation • 196k • Updated 9 days ago • 189
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-step50 Text Generation • 196k • Updated 11 days ago • 202
lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step100 Text Generation • 196k • Updated 11 days ago • 150
lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step50 Text Generation • 196k • Updated 11 days ago • 12 • 1
lihaoxin2020/qwen-insturct-synthetic_1-sft-sciriff-grpo Text Generation • 8B • Updated Mar 31, 2025 • 5