Dug into the config files a bit. Key differences between v2 and v3, going by the config values alone:

vocab_size:
v2: 102400
v3: 129280

hidden_size:
v2: 4096
v3: 7168

intermediate_size:
v2: 11008
v3: 18432

num_hidden_layers:
v2: 30
v3: 61

num_attention_heads:
v2: 32
v3: 128
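
For anyone who wants to repeat the comparison, here is a rough Python sketch that diffs two config dictionaries and prints how much each field grew. The numbers are copied from the list above; the names `v2`, `v3`, and `diff_configs` are made up for illustration and are not part of any library. In practice you would probably load the two dicts from each model's `config.json` with `json.load` instead of typing them in.

```python
# Values copied from the comparison above. The dict and function names
# are illustrative only, not taken from any library.
v2 = {
    "vocab_size": 102400,
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_hidden_layers": 30,
    "num_attention_heads": 32,
}
v3 = {
    "vocab_size": 129280,
    "hidden_size": 7168,
    "intermediate_size": 18432,
    "num_hidden_layers": 61,
    "num_attention_heads": 128,
}


def diff_configs(old, new):
    """Return {key: (old_value, new_value, ratio)} for shared keys that differ."""
    changes = {}
    for key, old_value in old.items():
        if key in new and new[key] != old_value:
            changes[key] = (old_value, new[key], new[key] / old_value)
    return changes


for key, (old_value, new_value, ratio) in diff_configs(v2, v3).items():
    print(f"{key}: {old_value} -> {new_value} ({ratio:.2f}x)")
```

This only compares the raw fields. It says nothing about parameters that exist in one config but not the other, which would need a separate check on the key sets.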
Llama v3 vs v2: Key Architecture Configuration Differences