agurung/flawed-fictions-qwen3-4b-lengthpenalty-litereason • Reinforcement Learning • 4B • Updated 3 days ago • 84
agurung/flawed-fictions-gemma-3-4b-lengthpenalty • Reinforcement Learning • 4B • Updated 16 days ago • 63
agurung/flawed-fictions-qwen3-4b-lengthpenalty • Reinforcement Learning • 4B • Updated 17 days ago • 61
agurung/flawed-fictions-qwen25-7b-lengthpenalty-litereason • Reinforcement Learning • 8B • Updated 20 days ago • 77
agurung/flawed-fictions-qwen25-7b-lengthpenalty • Reinforcement Learning • 8B • Updated 21 days ago • 196
agurung/v4_savebestearly_sft_qwen7B_25percent_lr_1e3_bptt_offset • Text Generation • 8B • Updated Feb 5 • 1
agurung/v4_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset • Text Generation • 8B • Updated Feb 5 • 1
agurung/v1ff_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset • Text Generation • 8B • Updated Feb 5 • 4
agurung/v2ff_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset • Text Generation • 8B • Updated Feb 5 • 5
agurung/v3ff_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset_newprompt • Text Generation • 8B • Updated Feb 5 • 1