2/3 For fairness: the DeBERTa-1.5B model was likely finetuned on the training data, whereas Llama 2 was used via few-shot prompting. If so, the comparison highlights once more that finetuning custom LLMs remains worthwhile.
Fine-tuning Custom LLMs vs Few-Shot Prompting Performance Comparison