AI Dynamics

Global AI News Aggregator

False Friends: Multilingual Tokenizer Overlap Improves Transfer Learning

In a last-minute change of events, I won’t be attending #EMNLP2025 in person. Still, I’m excited to share our poster for our paper, False Friends! nitter.net/JulieKallini/status/19…

Julie Kallini ✨ (@JulieKallini): New paper! 🌈 In English, pie = 🥧. In Spanish, pie = 🦶. Multilingual tokenizers often share such overlapping tokens between languages. Do these “False Friends” hurt or help multilingual LMs? We find that overlap consistently improves transfer, even when it seems misleading. 🧵

Quoted post: https://nitter.net/JulieKallini/status/1972689319388172669#m

→ View original post on X — @jurafsky, 2025-11-03 18:28 UTC
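
To make the idea of overlapping tokens concrete, here is a minimal sketch of how shared tokens between two languages could be counted. It is not the paper's method: it assumes toy whitespace tokenization instead of a real subword tokenizer, and the corpora and the helper names `vocab` and `overlap` are hypothetical, chosen only for illustration.

```python
def vocab(corpus):
    """Collect the set of lowercase whitespace-separated tokens in a corpus."""
    return {tok for line in corpus for tok in line.lower().split()}


def overlap(vocab_a, vocab_b):
    """Return the shared tokens and the Jaccard overlap of two vocabularies."""
    shared = vocab_a & vocab_b
    union = vocab_a | vocab_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical toy corpora: "pie" is a dessert in English and a foot in Spanish.
english = ["i ate a pie", "the pie is warm"]
spanish = ["me duele el pie", "el pie es grande"]

shared, jaccard = overlap(vocab(english), vocab(spanish))
print(sorted(shared), round(jaccard, 3))  # ['pie'] 0.083
```

The single shared token here is a "false friend": same surface form, unrelated meanings. A real measurement would apply the same set comparison to the token IDs produced by an actual multilingual tokenizer.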
