Qualitatively, you can generate sample outputs with:
python generate/lora.py --lora_path '… .pth' --quantize "bnb.nf4" --precision "bf16-true" --checkpoint_dir "…/Llama-2-7b-hf"
Quantitatively, you can use the python eval/eval_harness scripts: https://github.com/Lightning-AI/lit-gpt/tree/main/eval
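If you want to run the qualitative check from a script, a minimal sketch could assemble the same command as an argument list. The flag names and the "bnb.nf4" and "bf16-true" values are taken from the command above; the helper name build_generate_command and both example paths are hypothetical placeholders, not part of lit-gpt, so substitute your own LoRA weights and checkpoint directory.

```python
import shlex


def build_generate_command(lora_path, checkpoint_dir,
                           quantize="bnb.nf4", precision="bf16-true"):
    """Return the argument list for the generate/lora.py call quoted above.

    Hypothetical helper for illustration only; it just assembles strings
    and does not import or call anything from lit-gpt.
    """
    return [
        "python", "generate/lora.py",
        "--lora_path", lora_path,
        "--quantize", quantize,
        "--precision", precision,
        "--checkpoint_dir", checkpoint_dir,
    ]


# Placeholder paths (assumptions): replace with your own files.
cmd = build_generate_command(
    lora_path="path/to/lora_weights.pth",
    checkpoint_dir="path/to/Llama-2-7b-hf",
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

To actually launch it, you could pass cmd to subprocess.run(cmd, check=True) from the root of the lit-gpt repository, assuming the script lives at generate/lora.py as in the command above.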
…
LoRA Quantization and Evaluation Scripts for Llama-2 Models