Extract and process text from images with custom prompts
Domain-Enhanced Universal Vision-Language Models