I think this is more about the task than the models. This is also the same for flan palm 62b and even 540b (except for BBH IIRC). That said, cot + self consistency is usually better than direct prompting! I go over this a little in the blogpost.
Chain-of-Thought and Self-Consistency Beat Direct Prompting
By
–
Leave a Reply