BullshitBench: sorry to say, but DeepSeek v4 did really badly, landing towards the bottom of the table at both high and low reasoning settings.