AI Dynamics

Global AI News Aggregator

Falcon 40B Outperforms LLaMA 65B With Less Than Half the Training Compute

Nobody's been talking about it, but it's rather *mind-blowing* imo that the open-source Falcon 40B model is topping LLaMA 65B on leaderboards and many evals while having required not even half the compute of LLaMA to train from scratch. Quick back-of-the-envelope calculations:
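The calculations from the original post are not reproduced here. As a rough illustration only, the comparison can be sketched with the common rule of thumb that training compute is about 6 × parameters × training tokens. The token counts below (about 1T for Falcon 40B, about 1.4T for LLaMA 65B) are assumptions taken from the models' public releases, not from the post, and the function name is hypothetical.

```python
# Rough training-compute estimate using the rule of thumb C ≈ 6 * N * D,
# where N is the parameter count and D is the number of training tokens.
# This is an approximation, not the original author's calculation.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * params * tokens

# Assumed figures (not stated in the post):
#   Falcon 40B: ~40e9 parameters, ~1.0e12 training tokens
#   LLaMA 65B:  ~65e9 parameters, ~1.4e12 training tokens
falcon_40b = training_flops(40e9, 1.0e12)
llama_65b = training_flops(65e9, 1.4e12)
ratio = falcon_40b / llama_65b

print(f"Falcon 40B: {falcon_40b:.2e} FLOPs")
print(f"LLaMA 65B:  {llama_65b:.2e} FLOPs")
print(f"Falcon / LLaMA compute ratio: {ratio:.2f}")
```

Under these assumptions the ratio comes out to roughly 0.44, which is consistent with the "not even half the compute" claim, though the exact figure depends on the token counts used.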

→ View original post on X: @thom_wolf