Mistral 7B Performance Optimization Through Quantization
Note that in the video I was using the non-quantized Mistral 7B Instruct model. A quantized version stores each weight in fewer bits, so you can imagine how much faster it gets once the model takes up less memory.
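To get a rough feel for why quantization helps, here is a back-of-the-envelope sketch of the weight memory at different precisions. It assumes a round 7 billion parameters (taking the "7B" in the name at face value) and that the non-quantized model runs at 16 bits per weight; neither number comes from the video. It counts the weights only and ignores activations, the KV cache, and the extra scale values that quantized formats store, so treat the results as estimates rather than measurements.

```python
# Rough estimate of weight memory for a 7B-parameter model.
# Assumption: exactly 7 billion parameters ("7B" taken at face value).
PARAMS = 7_000_000_000

# Assumption: the non-quantized model uses 16-bit weights;
# 8-bit and 4-bit are common quantization targets.
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(params: int, bits: int) -> float:
    """Weight storage in GB (10^9 bytes), ignoring quantization metadata."""
    return params * bits / 8 / 1e9


footprints = {
    name: weight_memory_gb(PARAMS, bits)
    for name, bits in BITS_PER_WEIGHT.items()
}

for name, gb in footprints.items():
    print(f"{name}: ~{gb:.1f} GB of weights")
```

Under these assumptions the 4-bit weights take about a quarter of the space of the 16-bit ones. Text generation is often limited by how quickly the weights can be read from memory, so a smaller footprint tends to mean faster output, although the actual speedup depends on the hardware and the runtime.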