ASID-Caption

Video Understanding, Audio-Visual, Multimodal LLMs, Video Captioning, Instruction Tuning, Dataset Curation, Qwen-based, Open-source, Fully-Open-MLLMs

lyhisme submitted a paper 22 days ago

lyhisme updated a model about 1 month ago

Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

AudioVisual-Caption's datasets: 1