AI Dynamics

Global AI News Aggregator

Request for Public Benchmarks to Evaluate Claude Code System Prompts

@AnthropicAI Are there any public benchmarks I can run when developing a system prompt for Claude Code? I'm an ML person so it's fine if it's a repo with a bunch of steps etc. It's hard to know what's working just by feel, because tasks are always different.

→ View original post on X (@honnibal)
