the optimization dynamics of a two-layer perceptron are barely understood. that's just two matmuls and one pointwise non-linear function in between
Understanding Two-Layer Perceptron Optimization Dynamics
By
–
By
–
the optimization dynamics of a two-layer perceptron are barely understood. that's just two matmuls and one pointwise non-linear function in between