Many people say that Llama 3.1 is distilled, but this is incorrect: it was trained on synthetic data, not distilled. Here's a plot from kindacognizant (https://reddit.com/r/LocalLLaMA/comments/1ed58iu/llama31_models_are_fake_distillations_this_should/…) showing the difference in probability distributions. Yeah, it's not even distilled, which means there's still more performance to tap into.
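The distinction the plot is getting at: in true knowledge distillation, the student is trained against the teacher's full next-token probability distribution (soft labels), whereas synthetic-data training only keeps the text the teacher sampled, and the student sees ordinary hard labels. Here's a minimal sketch of the two loss functions in PyTorch; all tensor names (`teacher_logits`, `sampled_ids`, etc.) and the toy shapes are illustrative, not anything from Meta's actual training setup.

```python
# Sketch only: contrasts distillation loss vs. synthetic-data loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Distillation: match the teacher's full probability
    distribution over the vocabulary via KL divergence."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

def synthetic_data_loss(student_logits, sampled_ids):
    """Synthetic data: the teacher's distribution is discarded;
    only its sampled tokens survive, as hard cross-entropy labels."""
    return F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        sampled_ids.view(-1),
    )

# Toy shapes: batch of 2 sequences, length 8, vocab of 32000.
student_logits = torch.randn(2, 8, 32000)
teacher_logits = torch.randn(2, 8, 32000)
sampled_ids = torch.randint(0, 32000, (2, 8))

print(distillation_loss(student_logits, teacher_logits))
print(synthetic_data_loss(student_logits, sampled_ids))
```

That gap is why the quote says there's "still more performance to tap into": hard labels throw away the per-token uncertainty the teacher could have provided.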