Pretrained models from the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
Zayd Muhammad Kawakibi Zuhri
zaydzuhri
AI & ML interests
I really like watching loss go down
Recent Activity
updated a model about 1 hour ago: zaydzuhri/top-1B-ratio090-4096-batch8x2-steps200000-20251209-082436
published a model 1 day ago: zaydzuhri/top-1B-ratio090-4096-batch8x2-steps200000-20251209-082436
updated a model 3 days ago: zaydzuhri/top-1B-ratio090-4096-batch8x2-steps200000-20251208-112022
Organizations
None yet