“Collaborative Battleship”: MIT and Harvard developed a collaborative version of Battleship to test whether AI agents are as good at asking questions as they are at answering them. They found that many language models struggle with critical thinking, but that Monte Carlo inference strategies can help even tiny
