AI Dynamics

Global AI News Aggregator


Regex bottleneck in tokenizer requires expert optimization

Yep, the use of regex is both a huge dependency and huge bottleneck in the tokenizer. I think it's a beautiful project to try to do this correctly, but I'd need someone who is really familiar with regex to pitch in and also a large test suite to make sure. I'd love to merge such

→ View original post on X — @karpathy
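
The kind of test suite the post asks for could be approached with differential testing: keep the regex as the reference and check a hand-written splitter against it on many random inputs. The sketch below is illustrative only. The split pattern is a simplified, hypothetical one (it is not the pattern of any real tokenizer), and the names `regex_split`, `manual_split`, and `differential_test` are assumptions made for this example.

```python
import random
import re

# Simplified, hypothetical split pattern used as the reference.
# re.ASCII restricts \s to [ \t\n\r\f\v] so both splitters agree on whitespace.
REFERENCE = re.compile(r" ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+", re.ASCII)


def regex_split(text):
    """Reference splitter: every match of the pattern, in order."""
    return REFERENCE.findall(text)


def char_class(ch):
    """Classify a character as letter, digit, whitespace, or other."""
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "L"
    if "0" <= ch <= "9":
        return "D"
    if ch in " \t\n\r\f\v":
        return "S"
    return "O"


def manual_split(text):
    """Hand-written splitter intended to reproduce REFERENCE without regex."""
    pieces = []
    i, n = 0, len(text)
    while i < n:
        start = i
        # A single space directly before a non-whitespace run joins that run.
        if text[i] == " " and i + 1 < n and char_class(text[i + 1]) != "S":
            i += 1
        cls = char_class(text[i])
        # Consume the maximal run of characters in the same class.
        while i < n and char_class(text[i]) == cls:
            i += 1
        pieces.append(text[start:i])
    return pieces


def differential_test(trials=2000, seed=0):
    """Compare both splitters on random strings; raise on the first mismatch."""
    rng = random.Random(seed)
    alphabet = "ab1 2\t\n!?"  # small alphabet makes edge cases likely
    for _ in range(trials):
        length = rng.randint(0, 12)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        assert manual_split(text) == regex_split(text), repr(text)
    return trials


if __name__ == "__main__":
    differential_test()
    print(manual_split("hi  there 42!"))
```

A small alphabet and short strings are a deliberate choice here: they make tricky cases, such as consecutive spaces or a space before punctuation, show up often. A real suite would also need Unicode inputs and the tokenizer's actual pattern, neither of which this sketch covers.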