Update README.md
README.md CHANGED
@@ -19,7 +19,7 @@ base_model:
 `Unified-Reward-Think-7b` is the first unified multimodal CoT reward model, capable of multi-dimensional, step-by-step long-chain reasoning for both visual understanding and generation reward tasks.
 
 For further details, please refer to the following resources:
-
+- 📰 Paper: https://arxiv.org/pdf/2505.03318
 - 💪 Project Page: https://codegoat24.github.io/UnifiedReward/think
 - 🤗 Model Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-models-67c3008148c3a380d15ac63a
 - 🤗 Dataset Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-training-data-67c300d4fd5eff00fa7f1ede
@@ -112,10 +112,10 @@ print(text_outputs[0])
 ## Citation
 
 ```
-@article{UnifiedReward,
+@article{UnifiedReward-Think,
 title={Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning},
 author={Wang, Yibin and Li, Zhimin and Zang, Yuhang and Wang, Chunyu and Lu, Qinglin and Jin, Cheng and Wang, Jiaqi},
-journal={arXiv preprint arXiv:},
+journal={arXiv preprint arXiv:2505.03318},
 year={2025}
}
```
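For readers landing on this commit without the full card: the second hunk's header shows the README already ships a transformers inference snippet ending in `print(text_outputs[0])`. As a rough illustration of the reward task described above, here is a minimal sketch of asking the model to score one generated image against its caption. It assumes the checkpoint is compatible with transformers' LLaVA-OneVision classes; the model id, image path, and caption are illustrative assumptions, and the card's own snippet is authoritative.

```python
# Minimal sketch, not the card's own snippet: assumes the checkpoint loads
# through transformers' LLaVA-OneVision classes. The model id, image path,
# and caption below are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "CodeGoat24/UnifiedReward-Think-7b"  # assumed HF repo id
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# A generation-reward style query: ask for step-by-step reasoning before a
# final judgment, matching the model's chain-of-thought framing.
conversation = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": (
            "Evaluate how well this image matches the caption "
            "'a red bicycle leaning against a brick wall'. "
            "Reason step by step, then give a final score."
        )},
    ],
}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
image = Image.open("candidate.png")  # hypothetical generated image
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

# Long-chain reasoning needs headroom in max_new_tokens.
output_ids = model.generate(**inputs, max_new_tokens=1024)
answer = processor.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```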