Okay, distillation is broken (as in overpowered). NVIDIA's width pruning + distillation shows just how effective distillation is at healing networks after they have been modified. This unlocks a ton of applications: better quantization, model merging, abliteration, etc. We might finally see some real model editing.
Distillation Unlocks New Applications for Model Optimization
