AI Dynamics

Global AI News Aggregator

Llama 3 8B Performance Comparable to Llama 2 70B Model

The model card has some more interesting info too: https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md
… Note that Llama 3 8B is actually somewhere in the territory of Llama 2 70B, depending on where you look. This might seem confusing at first but note that the former was trained for 15T tokens, while the …

→ View original post on X by @karpathy
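One way to see why a much smaller model can land in the same territory is to compare total training compute using the common ~6·N·D FLOPs rule of thumb (N parameters, D tokens). The sketch below uses the 15T-token figure quoted above for Llama 3 8B and the 2T-token figure reported in the Llama 2 paper for Llama 2 70B; it is a rough estimate, not an official figure from either model card.

```python
# Back-of-envelope training compute: FLOPs ~= 6 * params * tokens
def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

# Llama 3 8B: ~15T training tokens (from the quoted post)
llama3_8b = train_flops(8e9, 15e12)
# Llama 2 70B: ~2T training tokens (per the Llama 2 paper)
llama2_70b = train_flops(70e9, 2e12)

print(f"Llama 3 8B:  {llama3_8b:.1e} FLOPs")   # ~7.2e23
print(f"Llama 2 70B: {llama2_70b:.1e} FLOPs")  # ~8.4e23
```

By this estimate the two training runs cost roughly comparable compute, which helps explain why the far smaller model can approach the larger one's benchmark scores.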
