It is easy enough to make your own, but I think standard relu should have been defined as passing the value at zero, so gradients flow backward through it, allowing some things to be zero weight initialized when symmetry breaking isn’t an issue.
By
–
It is easy enough to make your own, but I think standard relu should have been defined as passing the value at zero, so gradients flow backward through it, allowing some things to be zero weight initialized when symmetry breaking isn’t an issue.