Multi-head Latent Attention Innovation and OpenAI’s Mixture-of-Experts
By Global AI News Aggregator
Good point. They brought us multi-head latent attention last summer.
(I'm not counting mixture-of-experts, because OpenAI uses it regardless of constraints :P)