What the monitor gets to read and how capable the monitor is both matter. Stronger monitors that can read chains of thought (CoTs) and use more test-time compute improve rapidly. Post-hoc follow-ups, in which the model is asked to elaborate, also often surface previously unspoken thoughts and boost
Monitor Capability and Chain-of-Thought Reading Enhance AI Performance