QED-Nano: Teaching a Tiny Model to Prove Hard Theorems
Who needs 1T parameters? Olympiad proofs with a 4B model