Extending Reinforcement Learning for LLMs with Flow Environment
SII-Jhao Zhang
JingHaoZ
AI & ML interests
Large Reasoning Model, Unified Understanding and Generation in MLLM
Recent Activity
upvoted a paper about 3 hours ago
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization
upvoted a paper about 2 months ago
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing
updated a dataset 5 months ago
JingHaoZ/RLFR-Dataset-LM