Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 298
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published Feb 20, 2025 • 91
The Ultra-Scale Playbook • 3.69k The ultimate guide to training LLM on large GPU Clusters
Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google +1 Feb 19, 2025 • 75
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Paper • 2407.15841 • Published Jul 22, 2024 • 40
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 134 • 13