Yep, the API is fully supported, as well as overages for Claude logins. This was always the case in our terms and docs, but since it was recommended in some other products’ 3p docs, a lot of folks didn’t realize it’s not allowed. Hoping the new way is less footgunny for people.
@bcherny
-
Scaling High-Throughput AI Inference Infrastructure: Challenges
By
–
Sometimes I take for granted how quickly we can ship great product, vs. how hard it is to tune a super-high-throughput inference + API stack. The scale makes the latter really hard. We’re working around the clock to make it better.
-
Default Settings and Token Usage in AI Systems
By
–
Everyone gets the same default, and it’s sticky when you change it. The only setting that isn’t sticky across sessions is effort=max, because it can use a lot of tokens.
-
Claude Code Effort Levels Impact Model Performance Differently
By
–
This is false. We serve exactly the same models to all users. What the person in the post might be experiencing is a lower effort level vs. what the enterprise set. Claude Code users can change this anytime by running /effort. Low effort = fewer tokens and lower intelligence.
-
Subscription Optimization for AI Usage Patterns at Scale
By
–
It's not about tokens; it's about our subscriptions being optimized for specific usage patterns. There are lots of tradeoffs in building at such large scale, and one of them is optimizing systems for certain use cases and not others.
-
Improving Prompt Cache Efficiency for API Users
By
–
I actually put up a few PRs to improve prompt cache efficiency, to benefit folks using it through the API/overages.
-
Open Source Contributions Improve Prompt Cache Efficiency
By
–
We're big fans of open source. I actually just put up a few PRs to improve prompt cache efficiency for OpenClaw specifically. This is more about engineering constraints. Our systems are highly optimized for one kind of workload, and to serve as many people as possible with the
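For context on why cache efficiency matters here: prompt caches typically key on an exact prefix match, so appending to the end of a prompt reuses the cache while editing anything early invalidates it. A minimal sketch of that idea (the helper name is hypothetical, not from the actual PRs):

```python
def cache_hit_prefix_len(cached_prompt: str, new_prompt: str) -> int:
    """Length of the shared prefix a prompt cache could reuse."""
    n = 0
    for a, b in zip(cached_prompt, new_prompt):
        if a != b:
            break
        n += 1
    return n

base = "SYSTEM: rules...\nUSER: hi\n"

# Appending new turns preserves the entire cached prefix...
assert cache_hit_prefix_len(base, base + "ASSISTANT: hello\n") == len(base)

# ...but changing anything early (e.g. a timestamp in the system
# prompt) invalidates everything after the first differing character.
edited = base.replace("rules", "Rules")
assert cache_hit_prefix_len(base, edited) < len(base)
```

This is why cache-efficiency work in a client tends to focus on keeping the early parts of the prompt byte-stable across requests.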
-
Engineering Tradeoffs: Subscription Model Optimization Strategy
By
–
I know it sucks. Fundamentally engineering is about tradeoffs, and one of the things we do to serve a lot of customers is optimize the way subscriptions work to serve as many people as possible with the best model. Third party services are not optimized in this way, so it's
-
Constant CPU/RSS for Transcript Processing
By
–
It should feel significantly better. CPU and RSS are now constant, rather than growing O(transcript length).
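The general pattern behind a fix like this is to fold each new transcript entry into fixed-size state instead of re-scanning the full history on every update. A minimal sketch of that idea (function and field names are illustrative, not the actual implementation):

```python
def summarize_transcript(entries):
    """Fold transcript entries into a fixed-size summary.

    CPU and memory per update stay constant because each entry is
    visited once and then discarded, instead of re-scanning the whole
    history (O(transcript length)) on every refresh.
    """
    total_entries = 0
    total_chars = 0
    for entry in entries:  # accepts any iterator; no list is retained
        total_entries += 1
        total_chars += len(entry)
    return {"entries": total_entries, "chars": total_chars}
```

Because it accepts any iterator, the same function can consume a streamed log file line by line without ever holding the full transcript in memory.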