AI Dynamics

Global AI News Aggregator

Inference Mode Quantization with BNB NF4 for CodeLlama

And if you are looking for inference-mode quantization, you can use `--quantize "bnb.nf4"` with the `generate/base.py` script as well.
I am currently using that for running CodeLlama 34B models.
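As a minimal sketch, assuming the Lit-GPT repository layout that `generate/base.py` comes from, the invocation looks roughly like this (the checkpoint directory and prompt below are illustrative assumptions, not from the original post):

```shell
# Sketch of an inference call with bitsandbytes NF4 quantization.
# --quantize "bnb.nf4" loads the weights in 4-bit NormalFloat,
# cutting memory roughly 4x versus fp16.
# The checkpoint path is hypothetical -- point it at your own weights.
python generate/base.py \
  --quantize "bnb.nf4" \
  --checkpoint_dir checkpoints/codellama/CodeLlama-34b-hf \
  --prompt "def fibonacci(n):"
```

With NF4, a 34B-parameter model's weights shrink to roughly 20 GB, which is what makes running it on a single high-memory GPU practical.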

→ View original post on X — @rasbt
