From testing the Mixtral 8x22B base model, it seems pretty clear that a lot of instruction/chat data went into its training. Even without fine-tuning, it already works as a pretty great assistant through prompting alone.
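A minimal sketch of what assistant-style prompting of a base model might look like, assuming a plain `User:`/`Assistant:` transcript framing. The post does not say which prompt format was used, so the preamble text, the role labels, and the helper names `build_prompt` and `trim_reply` are all hypothetical. The idea is to hand the model an unfinished transcript and let it continue the last turn.

```python
from typing import Sequence, Tuple

# Hypothetical framing text; the original post does not specify a prompt format.
PREAMBLE = "The following is a conversation between a user and a helpful AI assistant.\n"


def build_prompt(user_message: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Frame a request as an unfinished transcript for a base model to continue."""
    lines = [PREAMBLE]
    # Optional few-shot turns that show the model the expected style.
    for question, answer in examples:
        lines.append(f"User: {question}")
        lines.append(f"Assistant: {answer}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")  # left open so the model writes the reply
    return "\n".join(lines)


def trim_reply(completion: str, stop: str = "\nUser:") -> str:
    """Cut the raw continuation where the model starts inventing the next user turn."""
    return completion.split(stop, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt(
        "What is a mixture-of-experts model?",
        examples=[("What is 2 + 2?", "2 + 2 equals 4.")],
    )
    print(prompt)
```

The resulting string would be sent to whatever completion endpoint serves the base model, and `trim_reply` applied to the raw continuation, since a base model has no built-in notion of where its turn ends.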
Mixtral 8x22B Base Model Shows Strong Assistant Capabilities