Because we focused on long context, we took the L-eval versions of the datasets and also turned them to 3-shot format to challenge with longer context handling. Mixtral's performance was measured in the same setting as Jamba.
Long Context Evaluation: Jamba vs Mixtral Performance Comparison
By
–
Leave a Reply