AB-MCTS Search Capability Evaluation with High Pass@k Metrics


1 July 2025, 14:00

Thanks @fchollet. Indeed, the experiments used a large Pass@k which allowed us to focus on evaluating the search capability of AB-MCTS, rather than the official evaluation criteria based on k=2. We also used tasks in the public eval. Hopefully we’ll get down to Pass@2 someday! 🙂

Source: original post on X by @hardmaru, 1 July 2025
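
To make the distinction between a large Pass@k and Pass@2 concrete, here is a minimal sketch in Python. It assumes the common reading of Pass@k, in which a task counts as solved if at least one of its first k attempts is correct. The function name `pass_at_k`, the task names, and the attempt outcomes are all hypothetical illustrations; they are not taken from the AB-MCTS experiments or from any official evaluation harness.

```python
from typing import Dict, List


def pass_at_k(attempts: Dict[str, List[bool]], k: int) -> float:
    """Fraction of tasks with at least one correct answer among the first k attempts.

    `attempts` maps a task name to an ordered list of attempt outcomes
    (True = correct). This is an assumed, simplified definition for illustration.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("attempts must contain at least one task")
    solved = sum(1 for results in attempts.values() if any(results[:k]))
    return solved / len(attempts)


# Hypothetical outcomes for four tasks, four attempts each (not real results).
attempts = {
    "task_a": [True, False, False, False],
    "task_b": [False, False, True, False],
    "task_c": [False, False, False, False],
    "task_d": [False, True, False, False],
}

pass_at_2 = pass_at_k(attempts, 2)  # only the first two attempts count
pass_at_4 = pass_at_k(attempts, 4)  # a larger attempt budget

print(f"Pass@2 = {pass_at_2:.2f}, Pass@4 = {pass_at_4:.2f}")
```

In this made-up data, task_b is only solved on its third attempt, so it counts toward Pass@4 but not Pass@2. That is the kind of gap the post refers to: a large k rewards a search procedure for eventually finding a correct answer, while k=2 also requires picking it within two submissions.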
