AI Dynamics

Global AI News Aggregator

Training Cost Estimation Errors in Large Language Models

I guess this is where we disagree about whether an error of 3261X in estimating the cost of training a language model, or conflating a completely different one-time task to perform a neural architecture search (and overestimating that cost by 88X) to find more energy efficient

→ View original post on X (@jeffdean)