Ever since @METR_Evals' fascinating work on LLM agent time horizons, I've wanted to see other attempts to draw conclusions from the same data. In a separate approach by Fengyuan and Jay, we too infer an exponential, but with shorter horizons for recent models (~2h vs METR's ~5h).
LLM Agent Time Horizons: New Analysis on Model Capabilities