Seeing as I published my Tokenizer video yesterday, I thought it could be fun to take a deep dive into the Gemma tokenizer. First, the Gemma technical report [pdf]: https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf
… says: "We use a subset of the SentencePiece tokenizer (Kudo and Richardson, 2018) of
Deep dive into Gemma tokenizer and SentencePiece implementation