AI Dynamics

Global AI News Aggregator

GPT-2 Weight Initialization and Fine-tuning Strategy

At the moment we initialize from the GPT-2 weights and fine-tune. This was very useful for debugging, and back when the code was slower. There is no code yet to initialize from scratch, and therefore none to warm up the learning rate, etc. It should be a very short addition, though.
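For readers unfamiliar with the term, learning-rate warmup usually means ramping the learning rate up from near zero over the first steps of training, which tends to matter more when weights start from random initialization than when fine-tuning. The post does not say which schedule would be used, so the following is only a minimal sketch under assumptions: the function name `get_lr`, the linear-warmup-then-cosine-decay shape, and all default values are hypothetical choices for illustration, not the author's code.

```python
import math


def get_lr(step, max_lr=6e-4, min_lr=6e-5, warmup_steps=2000, decay_steps=600_000):
    """Hypothetical schedule: linear warmup, then cosine decay to min_lr.

    All names and defaults here are illustrative assumptions.
    """
    # 1) Linear warmup: ramp from max_lr / warmup_steps up to max_lr.
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    # 2) Past the decay horizon: hold the learning rate at its floor.
    if step >= decay_steps:
        return min_lr
    # 3) In between: cosine decay from max_lr down to min_lr.
    progress = (step - warmup_steps) / (decay_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return min_lr + coeff * (max_lr - min_lr)


if __name__ == "__main__":
    # Print the learning rate at a few points along the schedule.
    for s in (0, 1000, 2000, 300_000, 600_000):
        print(s, get_lr(s))
```

In a training loop, a function like this would typically be called once per step and its value written into the optimizer's parameter groups. When fine-tuning from pretrained weights, as the post describes, a constant or much shorter schedule is a common alternative.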

→ View original post on X: @karpathy
