A self-distillation-based training method for long-context reasoning in a single LLM, without reinforcement learning
Purbesh Mitra
purbeshmitra
AI & ML interests
Emergent reasoning in AI systems
Recent Activity
Updated a dataset 5 days ago: purbeshmitra/ssb_teacher_data
Updated a model 5 days ago: purbeshmitra/semantic-soft-bootstrapping
Updated a collection 5 days ago: Semantic Soft Bootstrapping