Great set of slides by my Gemini colleague @FeinbergVlad on scaling considerations in large language models, addressing the fact that the classic "scaling law" work does not take into account inference cost (!), distillation, learning rate schedules, etc. https://vladfeinberg.com/2025/04/24/gemini-flash-pretraining.html
…
Scaling Laws in LLMs: Beyond Classic Models and Inference Costs