- UNDO: Understanding Distillation as Optimization
  Paper • 2504.02521 • Published
- One Model to Train them All: Hierarchical Self-Distillation for Enhanced Early Layer Embeddings
  Paper • 2503.03008 • Published • 1
- Understanding Self-Distillation in the Presence of Label Noise
  Paper • 2301.13304 • Published
- How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
  Paper • 2407.03475 • Published
Keira Chen
KeiraYC
Recent Activity
Updated a collection 24 days ago: Self-distillation