Video Understanding, Audio-Visual, Multimodal LLMs, Video Captioning, Instruction Tuning, Dataset Curation, Qwen-based, Open-source, Fully-Open-MLLMs
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions