I might have heard the same. I guess info like this is passed around, but no one wants to say it out loud.
GPT-4: 8 x 220B experts trained with different data/task distributions and 16-iter inference.
Glad that Geohot said it out loud. Though, at this point, GPT-4 is
@soumithchintala
-

GPT-4 Architecture: 8 Experts with 16-Iteration Inference
-
Mistral Account Impersonation Report Request
Can you create an official Mistral account so that we can report this for impersonation?
-
LLMs and Poetry Generation: A Critical Perspective
Long live LLMs for gifting us the wonders of poetry! /s