AI Dynamics

Global AI News Aggregator

Research: Humans Outsmart Automated LLM Defenses

Humans, noted virtuosi of adversarial yap, remain #1 at trolling LLMs! New research from @scale_AI's SEAL team shows human red teamers achieve 70%+ success rates against LLM defenses that stump automated attacks, exploiting their susceptibility to multi-turn jailbreaks.

→ View original post on X: @goodside