New paper: turns out you can train deep nets without normalization layers by replacing them with a parameterized tanh()
Training Deep Networks Without Normalization Using Parameterized Tanh
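For a rough feel of what "replacing a normalization layer with a parameterized tanh" could look like, here is a minimal forward-pass sketch. The post does not give the exact parameterization, so the form below, `gamma * tanh(alpha * x) + beta`, is an assumption, as are the names `parameterized_tanh`, `alpha`, `gamma`, and `beta`. In a real network these would be learned; here they are plain numbers.

```python
import math

def parameterized_tanh(x, alpha=0.5, gamma=None, beta=None):
    """Apply gamma[i] * tanh(alpha * x[i]) + beta[i] to a feature vector.

    Assumed form, not taken from the paper: alpha is a single scalar,
    gamma and beta are per-feature scale and shift.
    """
    n = len(x)
    gamma = [1.0] * n if gamma is None else gamma
    beta = [0.0] * n if beta is None else beta
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]

# No mean or variance is computed over the vector: each element is
# squashed independently, so extreme activations saturate near -1 or 1.
activations = [-100.0, -1.0, 0.0, 1.0, 100.0]
out = parameterized_tanh(activations)
print(out)
```

The point of the sketch is the contrast with a normalization layer: there are no batch or per-token statistics, just an element-wise squashing function whose steepness is controlled by `alpha`.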