floyed shen

floyed

AI & ML interests

None yet

Recent Activity

commented on a paper 5 days ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

upvoted a paper 9 days ago

dLLM: Simple Diffusion Language Modeling

upvoted a paper 11 days ago

Endless Terminals: Scaling RL Environments for Terminal Agents

Organizations

commented on a paper 5 days ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published 28 days ago • 216

commented on a paper 15 days ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published 28 days ago • 216

New activity in Beijing-AISI/panda-bench 10 months ago

Upload benchmarks.zip

#2 opened 10 months ago by