


Sonnet-4.6 takes top place on all my evals: EQ-Bench, Creative writing, Longform writing & Judgemark. Opus 4.6 within margin of error. GLM-5 and Qwen3.5-397B nipping at their heels.
→ View original post on X — @maximelabonne, 2026-02-18 21:33 UTC



