AI Dynamics

Global AI News Aggregator

About

Emergent Complexity in Neural Networks Through Gradient Descent

the non-linearity by itself isn't complex, something as simple as max(x, 0).
the system of two matmuls and a max in between, combined with gradient descent — that system becomes fairly complex.

→ View original post on X — @soumithchintala,