OpenGVLab

community

https://github.com/opengvlab

AI & ML interests

Computer Vision

Papers

RIVER: A Real-Time Interaction Benchmark for Video LLMs

InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision

prithivMLmods posted an update 3 days ago

The Map-Anything v1 (Universal Feed-Forward Metric 3D Reconstruction) demo is now available on Hugging Face Spaces. Built with Gradio and integrated with Rerun, it supports multi-image and video-based 3D reconstruction, depth and normal map estimation, and interactive measurements.

🤗 Demo: prithivMLmods/Map-Anything-v1
🤗 Model: facebook/map-anything-v1
🤗 Hf-Papers: MapAnything: Universal Feed-Forward Metric 3D Reconstruction (2509.13414)

prithivMLmods posted an update 6 days ago

Introducing QIE-Bbox-Studio! 🔥🤗

The QIE-Bbox-Studio demo is now live, more precise and packed with more options. Users can manipulate images with object removal, design addition, and even moving objects from one place to another, all with fast 4-step inference.

🤗 Demo: prithivMLmods/QIE-Bbox-Studio
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/QIE-Bbox-Studio

🚀 Models [LoRA]:

● QIE-2511-Object-Mover-Bbox: prithivMLmods/QIE-2511-Object-Mover-Bbox
● QIE-2511-Object-Remover-Bbox-v3: prithivMLmods/QIE-2511-Object-Remover-Bbox-v3
● QIE-2511-Outfit-Design-Layout: prithivMLmods/QIE-2511-Outfit-Design-Layout
● QIE-2509-Object-Remover-Bbox-v3: prithivMLmods/QIE-2509-Object-Remover-Bbox-v3
● QIE-2509-Object-Mover-Bbox: prithivMLmods/QIE-2509-Object-Mover-Bbox

🚀 Collection:

● Qwen Image Edit [Layout Bbox]: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.

Nymbo posted an update 8 days ago

We should really have a release date range slider on the /models page. Tired of "trending/most downloaded" being the best way to sort and still seeing models from 2023 on the first page just because they're embedded in enterprise pipelines and get downloaded repeatedly. "Recently Created/Recently Updated" don't solve the discovery problem considering the amount of noise to sift through.

Slight caveat: Trending actually does have some recency bias, but it's not strong/precise enough.

3 replies

prithivMLmods posted an update 9 days ago

QIE-2509-Object-Remover-Bbox-v3 is a more stable version of the Qwen Image Edit visual grounding–based object removal model. The app was previously featured in HF Spaces of the Week and is now updated with the latest Bbox-v3 LoRA adapter.

🤗 Demo: prithivMLmods/QIE-Object-Remover-Bbox
🤗 LoRA: prithivMLmods/QIE-2509-Object-Remover-Bbox-v3
🤗 Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.

2 replies

wzk1015 authored a paper 11 days ago

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Paper • 2603.12264 • Published 11 days ago • 14

yangxue authored a paper 11 days ago

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Paper • 2603.12264 • Published 11 days ago • 14

yangxue submitted a paper to Daily Papers 11 days ago

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Paper • 2603.12264 • Published 11 days ago • 14

Xrenya in OpenGVLab/InternVideo2-Stage2_1B-224p-f4 12 days ago

Error when using model

#2 opened about 2 months ago by wardaslab

wzk1015 authored a paper 13 days ago

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published 13 days ago • 47

cuierfei authored a paper 13 days ago

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published 13 days ago • 47

Rayment authored a paper 13 days ago

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published 13 days ago • 47

ganlinyang authored a paper 13 days ago

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published 13 days ago • 47

prithivMLmods posted an update 17 days ago

The Qwen3.5 Multimodal Understanding Demo, powered by Qwen3.5-2B, is now available on HF Spaces! It is a lightweight model designed for fast image and video reasoning. Built with Gradio, the demo showcases Image QA, Video QA, object detection, and 2D point tracking, along with real-time token streaming.

🤗 Demo: prithivMLmods/Qwen-3.5-HF-Demo
✅ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
🔗 Qwen3.5-2B: Qwen/Qwen3.5-2B

To learn more, visit the app page or the respective model pages.