AI Dynamics

Global AI News Aggregator

Chinchilla Scaling Laws: Compute Optimality vs Convergence Point

no. people misunderstand chinchilla.
chinchilla doesn't tell you the point of convergence.
it tells you the point of compute optimality.
if all you care about is perplexity, for every FLOPs compute budget, how big a model, on how many tokens, should you train?
for reasons not fully

→ View original post on X (@karpathy)
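To make the "compute optimality" reading of the post concrete, here is a minimal sketch of how a FLOPs budget might be split between model size and training tokens. It is not from the post. It relies on two commonly quoted approximations, both assumptions here: training cost C ≈ 6·N·D FLOPs for N parameters and D tokens, and a rule of thumb of roughly 20 tokens per parameter at the compute-optimal point. The function name `compute_optimal_split` and its parameters are hypothetical, chosen for illustration.

```python
import math

# Assumption: rule-of-thumb ratio of training tokens to parameters
# at the compute-optimal point (roughly 20; not an exact constant).
TOKENS_PER_PARAM = 20.0

# Assumption: training FLOPs are approximated as C ≈ 6 * N * D.
FLOPS_PER_PARAM_TOKEN = 6.0


def compute_optimal_split(flops_budget,
                          tokens_per_param=TOKENS_PER_PARAM,
                          flops_per_param_token=FLOPS_PER_PARAM_TOKEN):
    """Return (n_params, n_tokens) for a FLOPs budget under the assumptions above.

    Solves C = k * N * D together with D = r * N, which gives
    N = sqrt(C / (k * r)) and D = r * N.
    """
    if flops_budget <= 0:
        raise ValueError("flops_budget must be positive")
    n_params = math.sqrt(flops_budget / (flops_per_param_token * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Print the approximate split for a few illustrative budgets.
    for budget in (1e21, 1e22, 5.88e23):
        n, d = compute_optimal_split(budget)
        print(f"C={budget:.2e} FLOPs -> N≈{n:.3e} params, D≈{d:.3e} tokens")
```

Under these assumptions both N and D grow as the square root of the budget. The result answers only the question the post poses (the best split for a fixed budget if perplexity is all you care about); it says nothing about where a given model would stop improving if trained on more tokens.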
