BullshitBench: GPT-5.5 and 5.5-Pro update! They did NOT do well: GPT-5.5 is at about the same level as GPT-5.4 (ranked around 30th-35th, 45% pushback). GPT-5.5-Pro did WORSE, with only about 35% pushback. I must say the Pro result kind of shocked me. This is actually interesting, what this tells
GPT-5.5 and Pro Models Underperform on BullshitBench
