Looking at the config.json for both models:
v3 (left) vs v2.5 (right). Interesting things:

MoE-related:

    v3:
      "moe_intermediate_size": 2048,
      "n_routed_experts": 256,
      "n_shared_experts": 1,
      "num_experts_per_tok": 8

    v2.5:
      "moe_intermediate_size": 1536,
      "n_routed_experts": 160,
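To make the config values concrete, here is a minimal sketch of what top-k expert routing with these numbers looks like: each token's gate scores all 256 routed experts, but only the top 8 ("num_experts_per_tok") are selected, while the shared expert(s) process every token. The gate weights and toy dimensions here are illustrative assumptions, not DeepSeek's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_routed_experts = 256     # "n_routed_experts" in the v3 config
num_experts_per_tok = 8    # "num_experts_per_tok" (top-k)
hidden = 16                # toy hidden size, not the real model dim
tokens = 4                 # toy batch of 4 tokens

x = rng.standard_normal((tokens, hidden))
W_gate = rng.standard_normal((hidden, n_routed_experts))  # hypothetical gate weights

# Softmax over routing logits -> per-token probability for each routed expert.
logits = x @ W_gate
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)

# Pick the top-8 experts per token; only these experts run for that token.
topk_idx = np.argsort(probs, axis=-1)[:, -num_experts_per_tok:]
topk_w = np.take_along_axis(probs, topk_idx, axis=-1)

print(topk_idx.shape)  # (4, 8): 8 expert ids chosen per token out of 256
```

Note the sparsity this buys: only 8 of 256 routed experts fire per token in v3, so the per-token compute stays far below the total expert parameter count, while the single shared expert ("n_shared_experts": 1) is dense and always active.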
Comparing MoE Architecture: v3 vs v2.5 Model Configurations