AI Dynamics

Global AI News Aggregator

Self-Distillation Fine-Tuning: On-Policy Continual Learning from Expert Demonstrations

2026 is the year of continual learning, and we are getting some amazing papers toward that. This paper introduces Self-Distillation Fine-Tuning (SDFT): on-policy continual learning from expert demonstrations, with no explicit reward inference or engineering. The trick here is:

→ View original post on X — @askalphaxiv