1) LoRA – Add two low-rank trainable matrices, A and B, alongside weight matrices.
– Instead of fine-tuning W directly, learn the update ΔW = BA in these low-rank matrices, so the effective weight becomes W + BA while W stays frozen. Even for the largest LLMs, the LoRA matrices take up only a few MBs of memory. Check this post:
LoRA: Efficient Fine-tuning with Low-Rank Matrices
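To make the idea concrete, here is a minimal sketch of a LoRA-style linear layer in PyTorch. The class name, the rank and alpha values, and the initialization choices are illustrative assumptions, not details taken from the post.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a linear layer with a LoRA update: W stays frozen, only A and B train."""
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        # Frozen pretrained weight W (stand-in initialization here).
        self.weight = nn.Parameter(torch.empty(out_features, in_features), requires_grad=False)
        nn.init.normal_(self.weight, std=0.02)
        # Trainable low-rank factors: A is (rank x in), B is (out x rank).
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))  # zero init so ΔW = BA starts at 0
        self.scaling = alpha / rank

    def forward(self, x):
        # Effective weight is W + (alpha/rank) * B @ A; gradients flow only through A and B.
        return x @ self.weight.T + self.scaling * (x @ self.A.T) @ self.B.T
```

As a rough size check under these assumptions: for a 4096 x 4096 weight matrix with rank 8, A and B together hold about 65K parameters versus roughly 16.8M in W, which is why the trainable LoRA weights fit in a few MBs even for very large models.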