5) LoRA+ – In LoRA, both matrices A and B are updated with the same learning rate.
– Authors of LoRA+ found that setting a higher learning rate for matrix B results in better convergence. Check this
LoRA+: Optimizing Learning Rates for Matrix B in Fine-tuning
By
–
