Notably, this appears to be the first evidence that LLMs can improve their performance at a task through self-play. In this case, Llama-2 7B greatly improved its ability to play the game Adversarial Taboo by training through adversarial competition with itself.
Llama-2 Improves Game Performance Through Adversarial Self-Play
