Character-level tokenization inefficiency and token limit constraints

You could 100% do that. Actually, I have a paragraph on that in the post: the problem is that a text of 100 characters would become 100 tokens (instead of ~20-30 tokens with a subword tokenizer). In other words, it would be wasteful, because the LLM's token limit would fill up much sooner and you wouldn't be able to input longer texts.
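To make the arithmetic concrete, here is a minimal sketch that counts character-level tokens for a 100-character text and compares that with a rough subword estimate. The ratio of 4 characters per token is an assumption taken from the midpoint of the ~20-30 range above, not a measured value, and `CONTEXT_LIMIT` is a hypothetical token limit chosen only for illustration.

```python
def char_tokenize(text: str) -> list[str]:
    """Character-level tokenization: every character is its own token."""
    return list(text)


def max_chars_that_fit(context_limit: int, chars_per_token: float) -> int:
    """Roughly how many characters of text fit into a given token limit."""
    return int(context_limit * chars_per_token)


# Build a text of exactly 100 characters.
text = ("The quick brown fox jumps over the lazy dog. " * 3)[:100]

# Character-level: one token per character.
char_tokens = char_tokenize(text)

# Assumption: a subword tokenizer averages about 4 characters per token
# (midpoint of the ~20-30 tokens per 100 characters mentioned above).
ASSUMED_CHARS_PER_TOKEN = 4
estimated_subword_tokens = len(text) // ASSUMED_CHARS_PER_TOKEN

# Hypothetical token limit, for illustration only.
CONTEXT_LIMIT = 1024

print("characters:", len(text))
print("character-level tokens:", len(char_tokens))
print("estimated subword tokens:", estimated_subword_tokens)
print("chars that fit (character-level):", max_chars_that_fit(CONTEXT_LIMIT, 1))
print("chars that fit (subword estimate):",
      max_chars_that_fit(CONTEXT_LIMIT, ASSUMED_CHARS_PER_TOKEN))
```

Under these assumptions, the same token limit holds about four times as much text with subword tokens as with character-level tokens, which is the waste described above.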

→ View original post on X — @rasbt