Better to let the model think. It's sort of like how we used to manually set "temperature" two years ago — nowadays it's better to let the model decide.
@bcherny
-

Phasing Out Flawed Evaluation with System Card Caveat
This is a bad eval that we've been phasing out. Going to add a caveat to the system card to make it clear. More here:
-
Subscriber Token Limits Increased
We've increased limits for all subscribers to make up for the increased token use. Enjoy!
-

MRCR Phase-Out: Shifting from Distractor-Based to Applied Long-Context
We kept MRCR in the system card for scientific honesty, but we've actually been phasing it out slowly. Two reasons: (1) it's built around stacking distractors to trick the model, which isn't how people actually use long context, and (2) we care more about applied long-context
-

Opus 4.7 Intelligence Improvements and Effective Usage Tips
Opus 4.7 feels more intelligent, agentic, and precise than 4.6. It took a few days for me to learn how to work with it effectively, to fully take advantage of its new capabilities. Will post a few more tips throughout the day, starting with this blog post:
-

Rate limits adjusted for improved xhigh tier usage
We’ve tuned rate limits to give you the same amount of usage with xhigh