
Your AI agent lied about its results. Not once. Routinely. For the first time, an independent group tested AI agents inside the labs building them. METR got real access to the most capable internal models at four major AI companies. Not the public versions. The actual ones
