AI Dynamics

Global AI News Aggregator

LLM Judge Rejects Functional Fix for Code Aesthetics

(3/5) An example: in instance psf__requests-1724, the gold fix is 2 lines, while our agent’s functional fix was 8 lines. The LLM judge rejected the correct 8-line fix as "messy" and "redundant," and chose a clean but **non-functional** fix instead. The full patch is in the blog.
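To make the failure mode concrete, here is a minimal sketch of two ways to pick among candidate patches. It is not the setup described in the post: `Candidate`, `style_score`, `pick_by_style`, and `pick_tests_first` are hypothetical names, the style score is a toy stand-in for an aesthetics-only judge, and the line count for the "clean" candidate is an illustrative value, not a figure from the post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    lines_changed: int
    passes_tests: bool  # hypothetical: outcome of running the task's tests


def style_score(c: Candidate) -> float:
    # Toy stand-in for a judge that only rewards short, tidy diffs.
    return 1.0 / c.lines_changed


def pick_by_style(cands: List[Candidate]) -> Candidate:
    # Ranks on appearance alone; never checks whether the patch works.
    return max(cands, key=style_score)


def pick_tests_first(cands: List[Candidate]) -> Optional[Candidate]:
    # Keep only patches that pass the tests, then break ties on style.
    working = [c for c in cands if c.passes_tests]
    if not working:
        return None
    return max(working, key=style_score)


candidates = [
    Candidate("agent_fix", lines_changed=8, passes_tests=True),
    # lines_changed here is an assumed value for illustration only.
    Candidate("clean_fix", lines_changed=3, passes_tests=False),
]

print(pick_by_style(candidates).name)
print(pick_tests_first(candidates).name)
```

Under these assumptions, the style-only selector prefers the shorter, broken patch, while the tests-first selector keeps the longer patch that actually works.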

→ View original post on X (@ai21labs)
