The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 208 Explore synthetic data experiments as an interactive bookshelf
The Smol Training Playbook 📚 3.06k The secrets to building world-class LLMs
Article Ulysses Sequence Parallelism: Training with Million-Token Contexts 16 days ago • 23
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems 📝 69 Who needs 1T parameters? Olympiad proofs with a 4B model
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 122
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA +3 May 24, 2023 • 176
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 123
📝 Research & Long-Form Blog Posts Collection In-depth technical articles and research pieces published by Hugging Face • 11 items • Updated Feb 16 • 21
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published Dec 23, 2025 • 41