AI Dynamics

Global AI News Aggregator

Llama 3.1 405B: Training the Largest Model at Scale

Training a model as large and capable as Llama 3.1 405B was no simple task. The model was trained on over 15 trillion tokens over the course of several months, requiring over 16K @NVIDIA H100 GPUs — making it the first Llama model ever trained at this scale. We also used the 405B […]
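The scale figures in the post can be sanity-checked with the standard C ≈ 6·N·D training-compute approximation. The per-GPU peak throughput and utilization numbers below are illustrative assumptions, not figures from the post:

```python
# Back-of-envelope training-compute estimate using C ≈ 6 * N * D.
# Parameter count, token count, and GPU count come from the post;
# peak throughput and MFU are illustrative assumptions.

params = 405e9   # model parameters (405B)
tokens = 15e12   # training tokens (15T)
gpus = 16_000    # H100 GPUs

total_flops = 6 * params * tokens  # total training compute

peak_flops_per_gpu = 989e12  # assumed H100 BF16 dense peak, FLOP/s
mfu = 0.40                   # assumed model FLOPs utilization

seconds = total_flops / (gpus * peak_flops_per_gpu * mfu)
days = seconds / 86_400
print(f"total compute ≈ {total_flops:.2e} FLOPs, ≈ {days:.0f} days")
```

Under these assumptions the estimate lands at roughly 3.6e25 FLOPs and on the order of two months of wall-clock time, consistent with the "several months" claim in the post.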

→ View original post on X — @aiatmeta
