Improve model card for VisionReasoner-7B (Seg-Zero framework)
#2 opened by nielsr (HF Staff)
This PR significantly improves the model card for Ricky06662/VisionReasoner-7B by:
- Updating the `pipeline_tag` to `image-segmentation` to accurately reflect the model's core task of generating pixel-level masks, enhancing its discoverability on the Hugging Face Hub.
- Clarifying paper references by highlighting its foundation in the primary paper, "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement", while also linking to "VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning".
- Correcting code and project page links to the official `Seg-Zero` GitHub repository: https://github.com/dvlab-research/Seg-Zero.
- Expanding the model description with detailed insights from the paper abstract and GitHub README, covering its decoupled architecture, reinforcement learning training, zero-shot generalization capabilities, and performance highlights.
- Adding additional datasets (`Ricky06662/refCOCOg_9k_840`, `Ricky06662/VisionReasoner_multi_object_7k_840`) to the metadata for better completeness.
- Embedding key visuals (overview, pipeline, examples) directly from the GitHub README to provide a richer visual context.
- Including a comprehensive citation section with BibTeX entries for both associated papers.
These changes ensure the model card is more informative, accurate, and discoverable for researchers and users.
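For readers unfamiliar with model card metadata, the two metadata changes listed above can be sketched as YAML front matter. This is only an illustration: the description does not show the resulting file, so the sketch assumes the standard `pipeline_tag` and `datasets` keys and omits any other fields the card may carry. The helper `render_front_matter` is a hypothetical name introduced here, not part of any library.

```python
# Hypothetical reconstruction of the metadata fields named in this PR.
# The values come from the PR description; any other fields in the real
# model card are intentionally left out.
metadata = {
    "pipeline_tag": "image-segmentation",
    "datasets": [
        "Ricky06662/refCOCOg_9k_840",
        "Ricky06662/VisionReasoner_multi_object_7k_840",
    ],
}


def render_front_matter(meta):
    """Render a flat dict (strings and lists of strings) as YAML front matter."""
    lines = ["---"]
    for key, value in meta.items():
        if isinstance(value, list):
            # A list becomes a key followed by one "- item" line per entry.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


front_matter = render_front_matter(metadata)
print(front_matter)
```

The rendered block would sit at the very top of the model card's README, above the prose description.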
Ricky06662 changed pull request status to merged