Nice new read on tokenization!
You've probably heard of the SolidGoldMagikarp token, which breaks GPT-2-family models: the string appeared in the tokenizer's training corpus, so it got its own token, but it was largely absent from the LLM's later training data, leaving that token's embedding effectively untrained. This paper digs in with far more depth and detail, across many more models, discovering a less
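The mechanism can be illustrated with a toy sketch (all data here is hypothetical, not from the paper): build a vocabulary from one corpus, then check which vocabulary entries never occur in the separate corpus the model is "trained" on. Any such token would carry an embedding the model never learned.

```python
from collections import Counter

# Toy corpora (hypothetical): the glitch string is present when the
# tokenizer vocabulary is built, but filtered out of the model's data.
tokenizer_corpus = ["hello world", "SolidGoldMagikarp hello", "world world"]
model_corpus = ["hello world", "world hello"]

# Build a word-level vocab from the tokenizer's corpus.
vocab = {}
for text in tokenizer_corpus:
    for word in text.split():
        vocab.setdefault(word, len(vocab))

# Count how often each vocab entry appears in the model's training text.
counts = Counter(w for text in model_corpus for w in text.split())

# Tokens in the vocab that the model never saw during training.
untrained = [w for w in vocab if counts[w] == 0]
print(untrained)  # -> ['SolidGoldMagikarp']
```

In a real BPE tokenizer the same mismatch arises at the subword level, but the effect is identical: the token exists, the model has no idea what it means.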
Deep dive into tokenization vulnerabilities across multiple language models