Today, we release Mistral Large 2, the new version of our largest model. Mistral Large 2 is a 123B-parameter model with a 128k context window. On many benchmarks (notably in code generation and math), it is superior to or on par with Llama 3.1 405B. Like Mistral NeMo, it was trained
Mistral Large 2: New 123B Model Outperforms Llama 3.1 405B