New post: We tested the Mythos showcase vulnerabilities with open models.
They recovered similar scoped analysis! 8/8 models found the flagship FreeBSD zero-day, including a 3B model.
Rankings reshuffle completely across tasks => the AI cybersecurity frontier is super jagged! pic.twitter.com/6DxKN2xJUw
Stanislav Fort (@stanislavfort), April 8, 2026