Probability-based sequence generation normalization in language models

19/ In this case we use probabilities again, but this time we compute the probability of generating the full answer sequence, not just the letter: we sum the log-probabilities of the answer tokens and then normalize by dividing by the number of tokens, so that longer sequences are not penalized.
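
Below is a minimal sketch of this length-normalized sequence scoring in Python, assuming a Hugging Face causal LM; the "gpt2" checkpoint, the prompt, and the normalized_logprob helper are illustrative choices, not the exact evaluation code behind the thread.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def normalized_logprob(prompt: str, answer: str) -> float:
    """Sum the log-probabilities of the answer tokens given the prompt,
    then divide by the number of answer tokens (length normalization)."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    answer_ids = tokenizer(answer, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)

    with torch.no_grad():
        logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

    log_probs = torch.log_softmax(logits, dim=-1)
    n_prompt = prompt_ids.shape[1]
    n_answer = answer_ids.shape[1]

    total = 0.0
    for i in range(n_answer):
        token_id = answer_ids[0, i]
        # The logits at position p predict the token at position p + 1,
        # so the answer token at position n_prompt + i is scored by
        # the output at position n_prompt + i - 1.
        total += log_probs[0, n_prompt + i - 1, token_id].item()

    # Dividing by the token count avoids penalizing longer answers.
    return total / n_answer


if __name__ == "__main__":
    prompt = "Question: What is the capital of France?\nAnswer:"
    candidates = [" Paris", " The city of Lyon"]
    scores = {c: normalized_logprob(prompt, c) for c in candidates}
    best = max(scores, key=scores.get)
    print(scores, "->", best)
```

The candidate with the highest per-token average log-probability is picked as the model's answer, so the comparison is fair between short and long answer strings.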

→ View original post on X — @thom_wolf
