AI Dynamics

Global AI News Aggregator

GPT-4 Architecture: 8 Experts with 16-Iteration Inference

I might have heard the same. I guess info like this gets passed around, but no one wants to say it out loud.
GPT-4: 8 x 220B-parameter experts, trained with different data/task distributions, and 16-iteration inference.
Glad that Geohot said it out loud. Though, at this point, GPT-4 is

→ View original post on X: @soumithchintala
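
Taken at face value, the figures in the post imply a total parameter count that is easy to work out. The snippet below is only a back-of-the-envelope sketch: it assumes the eight experts are equal in size and share no weights, which the post does not state, and the variable names are illustrative. The numbers themselves are the rumored ones, not a confirmed specification.

```python
# Figures as stated in the post (an unconfirmed rumor, not an official spec).
NUM_EXPERTS = 8
PARAMS_PER_EXPERT = 220 * 10**9  # 220B parameters per expert

# Assumption: experts are equal-sized and share no parameters,
# so the total is a plain product.
total_params = NUM_EXPERTS * PARAMS_PER_EXPERT

print(f"Implied total: {total_params / 10**12:.2f}T parameters")
```

If the experts shared any layers, the real total would be lower than this product.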
