The Multi-LLM AB-MCTS combination of three current frontier models (o4-mini, Gemini-2.5-Pro, and DeepSeek-R1-0528) achieves strong performance on the ARC-AGI-2 benchmark, outperforming each individual model by a large margin. The AB-MCTS implementation is available on GitHub: https://github.com/SakanaAI/treequest
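A minimal, self-contained sketch of the idea behind Multi-LLM AB-MCTS's model allocation: Thompson sampling over per-model Beta posteriors decides which model to query next, shifting calls toward models that succeed more often. The model names are from the article, but the mock scorers, success threshold, and score means below are illustrative assumptions, not the actual AB-MCTS implementation or the TreeQuest API.

```python
import random

# Hypothetical stand-ins for the three LLMs; each "model" is a callable
# returning a score in [0, 1] for its candidate answer (assumption: the
# real system scores candidates with a task-specific evaluator).
def make_mock_model(mean):
    return lambda: min(1.0, max(0.0, random.gauss(mean, 0.1)))

models = {
    "o4-mini": make_mock_model(0.55),
    "Gemini-2.5-Pro": make_mock_model(0.60),
    "DeepSeek-R1-0528": make_mock_model(0.50),
}

# One Beta(alpha, beta) posterior per model over its success probability.
posterior = {name: [1.0, 1.0] for name in models}  # [alpha, beta]

random.seed(0)
best = 0.0
for _ in range(200):
    # Thompson sampling: draw one sample per posterior, query the argmax.
    name = max(posterior, key=lambda n: random.betavariate(*posterior[n]))
    score = models[name]()
    best = max(best, score)
    # Count a score above the (illustrative) threshold as a success.
    if score >= 0.5:
        posterior[name][0] += 1
    else:
        posterior[name][1] += 1

calls = {n: int(posterior[n][0] + posterior[n][1] - 2) for n in models}
print(f"best score: {best:.2f}")
print("calls per model:", calls)
```

Running the sketch shows the stronger mock model accumulating the most calls over time, which is the adaptive-allocation behavior the multi-model search relies on.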
…
Multi-LLM AB-MCTS Combination Outperforms on ARC-AGI-2 Benchmark