Comparing MoE Architecture: v3 vs v2.5 Model Configurations

Looking at the config.json for both models, v3 vs v2.5, the interesting MoE-related settings are:

v3:
    "moe_intermediate_size": 2048,
    "n_routed_experts": 256,
    "n_shared_experts": 1,
    "num_experts_per_tok": 8

v2.5:
    "moe_intermediate_size": 1536,
    "n_routed_experts": 160,

→ View original post on X — @reach_vb
