MIT introduces Self-Distillation Fine-Tuning to reduce catastrophic forgetting; it uses student-teacher demonstrations and needs 2.5x compute.
LLMs tend to lose prior skills when fine-tuned for new tasks. A new self-distillation approach aims to reduce that regression and simplify model management.
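To make the idea concrete, here is a minimal, illustrative sketch of a self-distillation fine-tuning loop, assuming a Hugging Face causal LM: the current model first rewrites each reference answer in its own words (the teacher pass), and the same model is then fine-tuned on those self-generated targets (the student pass). The model name, prompt template, and hyperparameters are placeholders, not details of MIT's method; the extra teacher generation step is one reason an approach like this costs more compute than plain fine-tuning.

```python
# Illustrative sketch of self-distillation fine-tuning (not MIT's implementation).
# The base model restates each reference answer in its own words, keeping the
# fine-tuning targets close to its own output distribution, which is the intuition
# behind reduced catastrophic forgetting.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy task data: (instruction, reference answer) pairs.
task_data = [
    ("Translate to French: good morning", "bonjour"),
    ("Translate to French: thank you", "merci"),
]

# 1) Teacher pass: the *current* model restates each reference answer,
#    producing targets drawn from its own distribution.
def self_distill_target(instruction: str, reference: str) -> str:
    prompt = (
        f"Instruction: {instruction}\n"
        f"Reference answer: {reference}\n"
        f"Answer in your own words:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=32,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Drop the prompt tokens, keep only the newly generated continuation.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip()

distilled = [(inst, self_distill_target(inst, ref)) for inst, ref in task_data]

# 2) Student pass: standard supervised fine-tuning, but on the self-generated
#    targets instead of the raw reference answers.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for epoch in range(2):
    for inst, target in distilled:
        text = f"Instruction: {inst}\nAnswer: {target}{tokenizer.eos_token}"
        batch = tokenizer(text, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```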