AI Dynamics

Global AI News Aggregator

Reconsider AI Evaluation Metrics for Code Generation Capabilities

if you consider autonomously writing 800LOC of C code from very simple instructions as "didn't work" you may wish to reconsider the nonlinearity of your evals. be well.

Original post on X by @swyx