Superposition Strategy: How Neural Networks Embed Features in Hidden Space

Our prior work showed that these toy models use a strategy called “superposition” to learn more features than they have neurons. Here we examine how training data points, as well as features, are embedded in the hidden space.
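The core idea can be made concrete with a minimal sketch (an illustration, not the authors' code): embed five features in a two-dimensional hidden space using evenly spaced unit vectors, the pentagon-like geometry such toy models tend to learn. A sparse input round-trips through the embedding with only modest interference from the other features.

```python
import numpy as np

# Assumed setup for illustration: 5 features share a 2-neuron hidden space.
n_features, n_hidden = 5, 2

# Evenly spaced unit vectors on the circle, one direction per feature.
angles = 2 * np.pi * np.arange(n_features) / n_features
W = np.stack([np.cos(angles), np.sin(angles)])  # shape (n_hidden, n_features)

# Activate a single sparse feature and round-trip it through the embedding.
x = np.zeros(n_features)
x[2] = 1.0
h = W @ x          # hidden-space embedding of the data point
x_hat = W.T @ h    # reconstruction, with interference from the other features

# Despite 5 features sharing 2 neurons, the active feature dominates:
# x_hat[2] == 1.0 while every other entry has magnitude |cos(2*pi*k/5)| < 1.
assert np.argmax(x_hat) == 2
```

Sparsity is what makes this work: when only one feature is active at a time, the off-diagonal dot products show up as small interference terms rather than as conflations of distinct features.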