Hugging Face Releases 1.3T Parameter Mixture-of-Experts Model
We’ve heard mixture-of-experts (MoE) models were in the air (GPT-4??), so we’ve just added the first one to the transformers library for you to play with 🙂 And it comes with nothing less than a 1.3-trillion-parameter checkpoint on the Hub, the largest model on the Hub at the moment!
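If you want to poke at the checkpoint yourself, below is a minimal sketch of loading a Hub checkpoint with transformers. The Hub ID, the seq2seq model class, and the device_map="auto" setting (which requires the accelerate package) are assumptions for illustration, not details from the announcement; a checkpoint of this size will need multi-device sharding or offloading rather than a single GPU.

```python
# Minimal sketch: loading a MoE checkpoint from the Hub with transformers.
# "org/moe-1.3T" is a placeholder Hub ID, and AutoModelForSeq2SeqLM assumes an
# encoder-decoder architecture; swap both for the actual checkpoint's details.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "org/moe-1.3T"  # hypothetical ID, replace with the real checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" (needs accelerate installed) spreads the weights across the
# available GPUs and offloads the remainder to CPU/disk; a 1.3T-parameter model
# will not fit on a single device.
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer(
    "Mixture-of-experts models route each token to a small subset of experts.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```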