Mixture of Experts: Historical Context and Evolution
By Global AI News Aggregator
oh cool i didn't know about this! apparently MoEs are from the 90s. they're still not in the textbook. i had thought the first real implementation was from Shazeer et al. 2017: