
Yiping Wang

ypwang61

https://ypwang61.github.io/

AI & ML interests

machine learning

Organizations

None yet

Recent Activity

upvoted a paper 8 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 9 days ago • 83

upvoted a paper 14 days ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published 15 days ago • 54

upvoted a paper 28 days ago

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published 29 days ago • 13

upvoted an article about 2 months ago

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

Article • Jul 18

upvoted 2 papers 2 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1 • 89

EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees

Paper • 2503.08893 • Published Mar 11 • 6

upvoted a collection 3 months ago

RecA

Collection

Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning! • 8 items • Updated Sep 22 • 13

upvoted 3 papers 3 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 75

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 116

upvoted 2 papers 4 months ago

Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21 • 88

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Paper • 2507.19457 • Published Jul 25 • 28

upvoted an article 5 months ago

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Article • Jul 10

upvoted a collection 6 months ago

Spurious Rewards

Collection

Spurious Rewards: Rethinking Training Signals in RLVR • 14 items • Updated Jun 13 • 2

upvoted a paper 6 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 263

upvoted 3 papers 7 months ago

MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation

Paper • 2505.17613 • Published May 23 • 8

Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models

Paper • 2505.17015 • Published May 22 • 9

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20 • 62

upvoted a collection 7 months ago

One-Shot RLVR

Collection

Collections of models and papers for works: "Reinforcement Learning for Reasoning in Large Language Models with One Training Example" • 24 items • Updated Nov 3 • 1

upvoted a paper 7 months ago

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29 • 53
