AI Dynamics

Global AI News Aggregator

Memory Allocation Strategy in LLM Training Implementation

You can look at the raw training implementation here: https://github.com/karpathy/llm.c/blob/master/train_gpt2.c
… You'll see that we allocate all the required memory a single time in the beginning in one large block of 1D memory. From there on during training, no memory gets created or destroyed, so we stay at …

→ View original post on X — @karpathy
