For comparison, when I give the same prompt to ChatGPT-4o but replace the last sentence with “Think step-by-step,” it returns 0 correct answers in 10 samples: “salubrious,” “abstemious” (x4), “illegible,” “repetitive,” “luminously,” “abstinential,” and “inexpensive.”
Prompting ChatGPT-4o with chain-of-thought reduces correctness