Llama 3 Instruction Filtering: Quality and Complexity Evaluation

Llama 3's paper is full of cool insights. Here's how they filtered out bad instruction samples:
– Quality is evaluated by a reward model and an LLM-as-a-judge.
– Complexity is scored with a combination of Instag (https://arxiv.org/abs/2308.07074) and an LLM-as-a-judge.
– It also looks like the resulting models are merged.
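The filtering described above can be sketched as a two-gate loop: keep a sample only if it clears both a quality threshold and a complexity threshold. This is a minimal, hypothetical sketch — the judge functions below are toy stand-ins for the reward model / LLM-as-a-judge and Instag tagging, and the score scales and thresholds are illustrative assumptions, not values from the paper.

```python
def judge_quality(sample: dict) -> float:
    # Stand-in for the reward model + LLM-as-a-judge quality score.
    # Toy heuristic: longer responses score higher, capped at 1.0.
    return min(len(sample["response"]) / 200.0, 1.0)

def judge_complexity(sample: dict) -> float:
    # Stand-in for Instag-style intent tagging + LLM-as-a-judge.
    # Toy heuristic: count distinct words in the instruction as "tags".
    tags = set(sample["instruction"].lower().split())
    return min(len(tags) / 20.0, 1.0)

def filter_samples(samples, quality_min=0.5, complexity_min=0.3):
    # Keep only samples that pass BOTH the quality and complexity gates.
    return [
        s for s in samples
        if judge_quality(s) >= quality_min
        and judge_complexity(s) >= complexity_min
    ]
```

In practice each judge would be an LLM or reward-model call returning a calibrated score; the gating logic stays the same.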