In our error analysis, we found that most models continued to struggle with:
– Final Answer Missing Information
– Hallucinated Information
This indicates a widespread issue with precise instruction following, especially in prompts requiring detailed information and intermediary
Models Struggle with Instruction Following and Hallucinated Information
