AI Dynamics

Global AI News Aggregator

Retokenization and Language Knowledge in Model Training

The biggest question is whether retokenization is allowed at all and, if so, whether it should use the same data as the training run itself. Right now, knowledge about the language is built into the existing tokens, and changing that is against the rules, unfavorable, or both.
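To illustrate how a token vocabulary can carry knowledge about a language, here is a minimal sketch of a toy byte-pair-encoding (BPE) learner. The post does not name a tokenizer algorithm, so BPE is an assumption here, and `learn_bpe_merges` and the two tiny word lists are hypothetical names and data made up for the example. It only shows that the merge rules depend on the text the tokenizer was fitted on.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts out as a tuple of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; sorting first makes ties deterministic.
        best = max(sorted(pairs), key=lambda p: pairs[p])
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


# Hypothetical toy corpora, chosen only to make the contrast visible.
english_words = ["the", "then", "there", "other"]
german_words = ["schule", "schon", "schnell", "tasche"]

english_merges = learn_bpe_merges(english_words, 2)
german_merges = learn_bpe_merges(german_words, 2)

print("English merges:", english_merges)
print("German merges: ", german_merges)
```

In this sketch the English list produces merges that build up "the", while the German list produces merges that build up "sch". Refitting the tokenizer on different data would therefore change what the tokens themselves encode, which is the kind of built-in language knowledge the post refers to.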

→ View original post on X: @alexjc