this is interesting.
1. Did Anthropic forget to run a control?
2. Where does this leave us? https://t.co/frF8gNNrvO
— Gary Marcus (@GaryMarcus) April 8, 2026
Quoted post by Stanislav Fort (@stanislavfort):
New post: We tested the Mythos showcase vulnerabilities with open models. They recovered similar scoped analysis! 8/8 models found the flagship FreeBSD zero-day, including a 3B model. Rankings reshuffle completely across tasks => the AI cybersecurity frontier is super jagged!
https://nitter.net/stanislavfort/status/2041922370206654879#m