I think that's exactly what's happening, but for cost reasons rather than accuracy reasons. Long-context needle-in-the-haystack issues have improved a lot over the last year, since all major LLMs now have a dedicated long-context finetuning stage in pre/post-training.
Long-context needle-in-the-haystack improvements driven by cost optimization