The issue is that PEFT spends all that time unnecessarily running Kaiming initialization. So our code first patches the `init` module so that `kaiming_uniform_` does nothing at all. Here's a gist with the full code:
PEFT Kaiming Init Optimization Patch for Training
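
For readers who can't load the gist, here is a minimal sketch of what the patching step might look like. It is not the gist's code: the helper name `skip_init` is hypothetical, and the sketch assumes the `init` module in question is `torch.nn.init` and that callers look `kaiming_uniform_` up on that module at call time.

```python
import contextlib


@contextlib.contextmanager
def skip_init(init_module, name="kaiming_uniform_"):
    """Temporarily replace init_module.<name> with a no-op (hypothetical helper)."""
    original = getattr(init_module, name)

    def noop(tensor, *args, **kwargs):
        # Hand the tensor back untouched, so no random values are generated.
        return tensor

    setattr(init_module, name, noop)
    try:
        yield
    finally:
        # Always restore the real initializer, even if model setup raises.
        setattr(init_module, name, original)


# Assumed usage (needs torch and peft installed; not run here):
#
#   import torch.nn.init as init
#   from peft import get_peft_model
#
#   with skip_init(init):
#       model = get_peft_model(base_model, lora_config)
```

Using a context manager keeps the patch scoped to model construction, so the rest of the program still gets the real `kaiming_uniform_`. Note that weights which skip initialization hold whatever values they were allocated with, so this only makes sense when they are overwritten afterwards, for example by loading saved adapter weights.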