The Smol Training Playbook 📚 Featured • The secrets to building world-class LLMs • 2.98k
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms Paper • 2511.04217 • Published Nov 6, 2025 • 17 • 4
Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published Oct 16, 2025 • 42
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 180
EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24, 2025 • 47
Video models are zero-shot learners and reasoners Paper • 2509.20328 • Published Sep 24, 2025 • 100 • 5
Lost in Embeddings: Information Loss in Vision-Language Models Paper • 2509.11986 • Published Sep 15, 2025 • 29
Color Me Correctly: Bridging Perceptual Color Spaces and Text Embeddings for Improved Diffusion Generation Paper • 2509.10058 • Published Sep 12, 2025 • 12
How to Choose the Best Open Source LLM for Your Project in 2025 Article • Published Sep 9, 2025 • 75