Ai2 launched MoNaCo, a new eval that tests how well models stitch together evidence across dozens of sources. It includes 1,315 multi-step questions that require retrieval, filtering, and aggregation across text and tables, drawing on 40+ distinct documents per query.
Ai2 Launches MoNaCo: Multi-Step Evidence Synthesis Benchmark
