AI Dynamics

Global AI News Aggregator

GPT-2 Weight Initialization and Fine-tuning Strategy

At the moment we're initializing from GPT-2 weights and fine-tuning. This was very useful for debugging and when the code was slower. There is no code yet to init from scratch, so no code to warm up the learning rate etc. Should be a very short addition though.
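For readers unfamiliar with the warmup mentioned above, here is a minimal sketch of what such an addition might look like. It is not code from the post: the function name `get_lr`, every default value, and the cosine decay that follows the warmup are assumptions chosen for illustration.

```python
import math

def get_lr(step, max_lr=6e-4, min_lr=6e-5, warmup_steps=100, max_steps=1000):
    """Hypothetical schedule: linear warmup, then cosine decay (all values assumed)."""
    # Linear warmup: ramp from max_lr / warmup_steps up to max_lr.
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    # Past the end of the schedule, hold at the floor.
    if step >= max_steps:
        return min_lr
    # Cosine decay from max_lr to min_lr (an assumption; the post only mentions warmup).
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (max_lr - min_lr)
```

In a training loop this would typically be called once per step and the result written into each optimizer parameter group. When fine-tuning from pretrained weights, a warmup tends to matter less, which may be why it could be deferred.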

→ View original post on X — @karpathy