AI Dynamics

Global AI News Aggregator

Training Deep Networks Without Normalization Using Parameterized Tanh

New paper: turns out you can train deep nets without normalization layers by replacing them with a parameterized tanh()
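To make the idea concrete, here is a minimal sketch of what such a layer could look like. The post does not give the exact form, so the parameterization below is an assumption: a learnable scalar `alpha` inside the tanh, plus per-feature `gamma` and `beta` in the style of a normalization layer's affine parameters. The names `ParameterizedTanh`, `make_layer`, and the initial value of `alpha` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ParameterizedTanh:
    """Hypothetical stand-in for a normalization layer.

    Computes y = gamma * tanh(alpha * x) + beta element-wise.
    This form is an assumption; the post only says "parameterized tanh()".
    """
    alpha: float        # learnable scalar that rescales the input (assumed)
    gamma: List[float]  # learnable per-feature scale (assumed)
    beta: List[float]   # learnable per-feature shift (assumed)

    def __call__(self, x: List[float]) -> List[float]:
        # Purely element-wise: no mean or variance is computed over the input,
        # which is what distinguishes it from a normalization layer.
        return [g * math.tanh(self.alpha * v) + b
                for v, g, b in zip(x, self.gamma, self.beta)]


def make_layer(num_features: int, alpha_init: float = 0.5) -> ParameterizedTanh:
    # alpha_init = 0.5 is an illustrative choice, not a value from the post.
    return ParameterizedTanh(alpha_init, [1.0] * num_features, [0.0] * num_features)


layer = make_layer(3)
out = layer([-100.0, 0.0, 100.0])  # extreme inputs are squashed into (-1, 1)
```

In a real network, `alpha`, `gamma`, and `beta` would be trained by gradient descent along with the other weights. The sketch only shows the forward pass: the tanh bounds large activations without needing any batch or feature statistics.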

→ View original post on X — @ylecun