I think 4.5 was the first really notable one. Benchmark-wise, it sat between OpenAI o3 and Claude 4 Opus.
I have a brief section about it in my Big LLM Architecture Comparison article: https://magazine.sebastianraschka.com/i/168650848/11-glm-45
…
GLM-4.5: Notable benchmark performance between o3 and Claude