Yes, I think that's probably so that (1) people with older cards can use these weights as well, and (2) fine-tuning with LoRA etc. is more stable (in case the numbers become large).
It's interesting for sure
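To make point (2) a bit more concrete, here is a minimal sketch of what "numbers become large" can mean. It assumes the discussion is about the floating-point format the weights are stored in, which the thread itself does not state. The helper names `fits_in_float16` and `fits_in_float32` and the value `300.0` are made up for illustration; only Python's standard `struct` module is used.

```python
import struct


def fits_in_float16(x: float) -> bool:
    """Return True if x can be stored as an IEEE 754 half-precision float."""
    try:
        struct.pack("<e", x)  # "e" is the half-precision format code
        return True
    except OverflowError:
        # struct refuses values beyond the float16 range (max finite value 65504)
        return False


def fits_in_float32(x: float) -> bool:
    """Return True if x can be stored as an IEEE 754 single-precision float."""
    try:
        struct.pack("<f", x)  # "f" is the single-precision format code
        return True
    except OverflowError:
        return False


# Hypothetical intermediate value during training, chosen only for illustration.
activation = 300.0
squared = activation * activation  # 90000.0, above the float16 maximum

print(fits_in_float16(activation))  # the value itself is representable
print(fits_in_float16(squared))     # its square is out of range for float16
print(fits_in_float32(squared))     # float32 has far more headroom
```

The point is only that a format with a narrow range can overflow on intermediate values that a wider format handles without trouble; whether that is the actual reason behind the choice discussed here is a guess.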
Weights compatibility and numerical stability in LoRA fine-tuning