Zhihan Liu

ZHLiu627

AI & ML interests

LLMs

Recent Activity

upvoted a paper about 1 month ago

Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

upvoted a paper 2 months ago

Agent Learning via Early Experience

upvoted a paper 4 months ago

Scaling Agent Learning via Experience Synthesis

Organizations

None yet

ZHLiu627's datasets (20)

ZHLiu627/updated_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212__self_correction_iter1_v1

Viewer • Updated Feb 27, 2025 • 29.3k • 5

ZHLiu627/dataset_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212_2_global_step_70filtered_v1_v1

Viewer • Updated Feb 27, 2025 • 29.3k • 5 • 1

ZHLiu627/dataset_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212_2_global_step_70filtered_v1

Viewer • Updated Feb 22, 2025 • 29.3k • 5

ZHLiu627/updated-code-qwen7-edufiltered

Viewer • Updated Feb 21, 2025 • 43k • 4

ZHLiu627/updated-code-qwen7-edu

Viewer • Updated Feb 21, 2025 • 75.6k • 10

ZHLiu627/updated_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212__self_correction_iter1_v2filtered

Viewer • Updated Feb 19, 2025 • 28.9k • 3

ZHLiu627/updated_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212__self_correction_iter1_v2

Viewer • Updated Feb 19, 2025 • 29.3k • 5

ZHLiu627/dataset_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212_2_global_step_70filteredd

Viewer • Updated Feb 19, 2025 • 29.3k • 3

ZHLiu627/updated_qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212__self_correction_iter1_v1filtered

Viewer • Updated Feb 19, 2025 • 29.1k • 3

ZHLiu627/qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212__self_correction_iter1_v2

Viewer • Updated Feb 18, 2025 • 29.3k • 3

ZHLiu627/qwen2.5_code_1.5b_grpo_iter0_full_data_miao_0212__self_correction_iter1_v1

Viewer • Updated Feb 18, 2025 • 29.3k • 3

ZHLiu627/code-opc2-edu

Viewer • Updated Feb 8, 2025 • 118k • 4

ZHLiu627/ultrafeedback_binarized_with_response_full

Viewer • Updated Mar 8, 2024 • 61.1k • 6

ZHLiu627/ultrafeedback_binarized_with_response_full_part2

Viewer • Updated Mar 8, 2024 • 21.1k • 4

ZHLiu627/ultrafeedback_binarized_with_response_full_part1

Viewer • Updated Mar 8, 2024 • 20k • 6 • 1

ZHLiu627/ultrafeedback_binarized_with_response_full_part0

Viewer • Updated Mar 7, 2024 • 20k • 4
