VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Derek Zhe Hu
zhehuderek
AI & ML interests
NLP, Multimodality
Recent Activity
- updated a model 1 day ago: zhehuderek/qwen25_vl_7b_guru_run3_step90
- published a model 1 day ago: zhehuderek/qwen25_vl_7b_guru_run3_step90
- updated a model 2 days ago: zhehuderek/qwen2.5-3b-divser-arggen-step1500
YesBut
A collection for visual humor understanding and comparative reasoning.
- zhehuderek/YESBUT_Benchmark
  Viewer • Updated • 348 • 36 • 1
- zhehuderek/YESBUT_Benchmark_V2
  Viewer • Updated • 1.26k • 67 • 1
- Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
  Paper • 2405.19088 • Published
- When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
  Paper • 2503.23137 • Published • 2
Praxis-VLM
VLM with textual-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
Models (8)
- zhehuderek/qwen25_vl_7b_guru_run3_step90
  Image-to-Text • 8B • Updated • 14
- zhehuderek/qwen2.5-3b-divser-arggen-step1500
  Text Generation • 3B • Updated • 15
- zhehuderek/qwen2_5_vl_7b_GEOQA_8K_step90_hf
  Image-to-Text • 8B • Updated • 6
- zhehuderek/praxis_vlm_7b_decisionmaking
  Image-to-Text • 8B • Updated • 44
- zhehuderek/praxis_vlm_3b_decisionmaking
  Image-to-Text • 4B • Updated • 3
- zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf
  Image-to-Text • 4B • Updated • 5
- zhehuderek/llama-2-7b-chinese
  Text Generation • 7B • Updated • 7
- zhehuderek/llama-3.1-8b-chinese-sft
  Text Generation • 8B • Updated • 5
Datasets (11)
- zhehuderek/processed_guru-RL-92k
  Viewer • Updated • 72.3k • 10
- zhehuderek/VIVA_Plus_Benchmark
  Viewer • Updated • 6.37k • 66
- zhehuderek/OpenThoughts3-1.2M-processed
  Viewer • Updated • 39.6k • 19
- zhehuderek/humor_understanding_combined
  Viewer • Updated • 4.89k • 27 • 1
- zhehuderek/humor_understanding_nyt
  Viewer • Updated • 2.69k • 15
- zhehuderek/comparative_reasoning_mllm_compbench
  Viewer • Updated • 21.8k • 12
- zhehuderek/humor_understanding_deepeval
  Viewer • Updated • 2.96k • 19
- zhehuderek/textual_decisionmaking_data
  Viewer • Updated • 11k • 23 • 1
- zhehuderek/YESBUT_Benchmark_V2
  Viewer • Updated • 1.26k • 67 • 1
- zhehuderek/YESBUT_Benchmark
  Viewer • Updated • 348 • 36 • 1