@scaling01 Before your pivot to Star Wars memes, I remember you used to be interested in LLMs! I just built a logic-reasoning / problem-solving benchmark where frontier models one-shot the solutions, but open-weight models really struggle:
Logic Reasoning Benchmark: Frontier vs Open Weight Models
