Niels Horn
nilq
AI & ML interests
Natural language understanding, synthetic emotional speech, mechanistic interpretability.
Dynamics of Transformer Language Model Features
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
  Paper • 2203.05482 • Published • 7
- Diverse Weight Averaging for Out-of-Distribution Generalization
  Paper • 2205.09739 • Published • 1
- Fusing finetuned models for better pretraining
  Paper • 2204.03044 • Published • 6
- Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
  Paper • 2309.07311 • Published • 4
Toy Models to Study
models 16
nilq/baby-python-mistral-1L-tiny-TinyStories-ft
Text Generation • 35.1M • Updated
• 6 • 1
nilq/baby-python-mistral-1L-tiny-lua-ft
Text Generation • 35.1M • Updated
• 10
nilq/baby-python-1L-mistral-lua-stories-slerp
Text Generation • 35.1M • Updated
• 9
nilq/baby-python-mistral-1L-tiny-base
Text Generation • 35.1M • Updated
• 9
nilq/lua-stories-slerp-mistral-1L-tiny
Text Generation • 35.1M • Updated
• 5
nilq/lua-stories-slerp-mistral-2L-tiny
Text Generation • 37.5M • Updated
• 2
nilq/mistral-2L-tiny
Text Generation • 37.5M • Updated
• 4
nilq/lua-stories-linear-mistral-1L-tiny
Text Generation • 35.1M • Updated
• 7
nilq/python-mistral-1L-mini
Text Generation • 4.13M • Updated
• 8
nilq/mistral-1L-tiny
Text Generation • 35.1M • Updated
• 25 • 6
datasets 9
nilq/baby-python-and-tiny-stories-and-lua
• Updated
• 12.3M • 8
nilq/baby-python-and-lua
• Updated
• 12.3M • 19 • 1
nilq/baby-python-and-tiny-stories
• Updated
• 13.9M • 12
nilq/python-and-tiny-stories
Updated
• 6
nilq/baby-python
• Updated
• 11.7M • 17 • 1
nilq/small-lua-stack
• Updated
• 559k • 41 • 2
nilq/small-python-stack
• Updated
• 2.59M • 129
nilq/babylm-100M
• Updated
• 12.7M • 37
nilq/babylm-10M
• Updated
• 3.14M • 159 • 1