Model Distillation Performance Trade-offs in Large Language Models
But yeah, in terms of model performance you'll probably take a hit. Same thing happened when they distilled ChatGPT's GPT-3.5 and, later, GPT-4.