Very happy to release our new small model, Mistral NeMo, a 12B model trained in collaboration with @nvidia. Mistral NeMo supports a context window of 128k tokens, comes with an FP8-aligned checkpoint, and performs extremely well on benchmarks. Check it out!
Mistral NeMo 12B Model Release with NVIDIA Collaboration