cursor_slides_internvl2 / README-HF.md

	# Image Description with Qwen2-VL-7B

	This Hugging Face Space uses the Qwen2-VL-7B vision-language model to generate detailed descriptions of images.

	## About

	Upload any image and get:
	- A basic description
	- A detailed analysis
	- A technical assessment

	The app runs Qwen2-VL-7B with 4-bit quantization, which lowers GPU memory use during inference.

	## Usage

	1. Upload an image or use one of the example images
	2. Click "Analyze Image"
	3. View the three types of descriptions generated by the model
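
The flow above could map onto a small helper that runs one prompt per description type. This is a minimal sketch, not the Space's actual code: the prompt wording, the `PROMPTS` table, `describe_image`, and the `generate` callable (a stand-in for whatever wraps the Qwen2-VL-7B call) are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical prompts, one per description type listed in this README.
PROMPTS: Dict[str, str] = {
    "basic": "Describe this image in one or two sentences.",
    "detailed": "Describe this image in detail, covering objects, layout, and any visible text.",
    "technical": "Assess the technical qualities of this image, such as composition, lighting, and sharpness.",
}


def describe_image(image: object, generate: Callable[[object, str], str]) -> Dict[str, str]:
    """Run every prompt against one image and collect the replies by type.

    `generate(image, prompt)` is a placeholder for the real model call.
    """
    return {kind: generate(image, prompt).strip() for kind, prompt in PROMPTS.items()}


if __name__ == "__main__":
    # Stub model so the sketch runs without a GPU or model weights.
    def fake_generate(image: object, prompt: str) -> str:
        return f"[stub reply to: {prompt}]"

    for kind, text in describe_image("example.jpg", fake_generate).items():
        print(f"{kind}: {text}")
```

Passing the model call in as a function keeps the prompt logic testable on its own; in the real app the callable would wrap the quantized model and its processor.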

	## Examples

	The Space includes sample images in the `data_temp` folder that you can use to test the model.

	## Technical Details

	- Model: Qwen2-VL-7B
	- Framework: Gradio UI + Flask API backend
	- Quantization: 4-bit for efficient inference
	- GPU: A10G recommended
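
To see why 4-bit quantization matters on a single GPU, a rough back-of-the-envelope estimate of weight memory helps. The sketch below assumes a nominal 7 billion parameters (the exact count for this model may differ) and counts weights only; activations, the KV cache, quantization metadata, and any layers left unquantized add to the real footprint. The function name and constants are illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NOMINAL_PARAMS = 7e9   # assumption: nominal "7B" parameter count
A10G_MEMORY_GB = 24    # an A10G GPU has 24 GB of memory

fp16_gb = weight_memory_gb(NOMINAL_PARAMS, 16)
int4_gb = weight_memory_gb(NOMINAL_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{int4_gb:.1f} GB")
print(f"A10G memory left after 4-bit weights: ~{A10G_MEMORY_GB - int4_gb:.1f} GB")
```

Under these assumptions the weights shrink from roughly 14 GB to roughly 3.5 GB, which leaves most of an A10G's memory free for image tokens and generation.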

	## Credits

	- [Qwen2-VL-7B model](https://huggingface.co/Qwen/Qwen2-VL-7B) by the Qwen team