SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 202
Vision Language Models Quantization Collection Vision Language Models (VLMs) quantized by Neural Magic • 20 items • Updated Mar 4, 2025 • 6
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both 1K and 21K pretrained models. • 13 items • Updated 13 days ago • 34
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated 13 days ago • 23
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Mar 12, 2025 • 480