AI Dynamics

Global AI News Aggregator


Cog’s first eval ship offers private 100-hour enterprise evals with a financial guarantee

Finally! The first eval ship from Cog! To contextualize: @METR_Evals cap out at ~16 hours. Cog has private enterprise evals up to 100 hours, and is confident enough to put a financial guarantee on it. METR dataset: ML eng, GPU kernels, cybersecurity. > "METR (2026)

→ View original post on X — @swyx