I still haven't seen a good evaluation of the differences between quantized versions of models, to be honest, or even any good anecdotes about things that work and things that don't at different quantization levels. So I'm effectively flying blind when it comes to quantization.
Quantization Evaluation Gap in LLM Versions