Jiawei Liu

ganler

https://jw-liu.xyz/

AI & ML interests

Simplifying the making of great software.

Recent Activity

published a dataset 7 months ago

purpcode/ctxdistill-verified-ablation-Qwen2.5-14B-Instruct-1M-73k

upvoted a paper 4 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 39

upvoted an article 5 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7, 2025 • 276

published 3 datasets 7 months ago

updated 2 datasets 7 months ago

purpcode/ctxdistill-verified-Qwen2.5-14B-Instruct-1M-57k

Viewer • Updated Aug 9, 2025 • 57.7k • 41

purpcode/ctxdistill-verified-Qwen2.5-32B-Instruct-55k

Viewer • Updated Aug 9, 2025 • 55.6k • 29

updated a Space 7 months ago

README

🦀

updated a collection 7 months ago

Paper

Collection

1 item • Updated Aug 5, 2025

updated a dataset 7 months ago

purpcode/ctxdistill-verified-ablation-Qwen2.5-14B-Instruct-1M-73k

Viewer • Updated Aug 5, 2025 • 74k • 8

updated a collection 7 months ago

PurpCode Models

Collection

4 items • Updated Aug 5, 2025

published a Space 7 months ago

README

🦀

published 2 models 7 months ago

purpcode/purpcode-14b-rule-sft

Text Generation • 15B • Updated Jul 31, 2025 • 3

purpcode/purpcode-32b-rule-sft

Text Generation • 33B • Updated Jul 31, 2025 • 4

updated 2 models 7 months ago

purpcode/purpcode-32b-rule-sft

Text Generation • 33B • Updated Jul 31, 2025 • 4

purpcode/purpcode-14b-rule-sft

Text Generation • 15B • Updated Jul 31, 2025 • 3

published a model 7 months ago

purpcode/purpcode-32b-rl

Text Generation • 33B • Updated Jul 31, 2025 • 24
