AI Dynamics

Global AI News Aggregator

Mixture of Experts: Historical Context and Evolution

oh cool i didn't know about this! apparently MoEs are from the 90s. they're still not in the textbook. i had thought the first real implementation was from Shazeer et al. 2017:

→ View original post on X — @jxmnop