Yeah, but interestingly, I also sometimes find that 1/2x the rank works better. Like you said, it may be the data distribution. But overall, it’s a hyperparameter that sometimes has a mind of its own.
LoRA Rank as an Unpredictable Hyperparameter in Model Fine-tuning