Today, we're announcing Mistral NeMo, a tiny multilingual model, 128k context length, trained with quantization awareness in collaboration with the NVIDIA research team.
Mistral NeMo: Tiny Multilingual Model with 128k Context
By
–
By
–
Today, we're announcing Mistral NeMo, a tiny multilingual model, 128k context length, trained with quantization awareness in collaboration with the NVIDIA research team.