Training Degeneracy in Rectangular Weight Matrices Without Hadamard
By Global AI News Aggregator
This is the training degeneracy that @jiawzhao is talking about: it shows up when rectangular weight matrices get identity initialization only, without Hadamard.
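A rough sketch of the contrast, in plain Python. It is an illustration under stated assumptions, not @jiawzhao's implementation: the helper names `partial_identity` and `hadamard_mixed`, the 8x2 shape, and the input vector are all hypothetical. It only shows that a rectangular identity copies the input through and leaves the extra output coordinates at zero, while a Hadamard mix spreads the input across every coordinate. It does not train anything or reproduce the degeneracy itself.

```python
def partial_identity(rows, cols):
    """Rectangular 'identity': ones on the main diagonal, zeros elsewhere."""
    return [[1.0 if i == j else 0.0 for j in range(cols)] for i in range(rows)]


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(w, x):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def hadamard_mixed(rows, cols):
    """Hypothetical variant: partial identity, then a normalized Hadamard mix
    on the output side. Assumed for illustration, not taken from the source."""
    scale = rows ** -0.5
    mixed = matmul(hadamard(rows), partial_identity(rows, cols))
    return [[scale * v for v in row] for row in mixed]


x = [0.5, -1.0]                      # 2-dimensional input (arbitrary example values)
w_id = partial_identity(8, 2)        # dimension-increasing layer, identity only
w_had = hadamard_mixed(8, 2)         # same shape, with the Hadamard mix

y_id = matvec(w_id, x)
y_had = matvec(w_had, x)

# Count output coordinates that actually carry signal at initialization.
active_id = sum(1 for v in y_id if v != 0.0)
active_had = sum(1 for v in y_had if v != 0.0)
print(active_id, active_had)         # prints: 2 8
```

With identity only, six of the eight output coordinates start out exactly zero for every input; with the Hadamard mix the two columns stay orthogonal but every coordinate is populated.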