AI Dynamics

Global AI News Aggregator

Model Merging Techniques and Hidden Representation Dynamics

I'd say it barely works tbh… 🙂 It really depends on the type of merge and whether you retrain it or not. In this case, it's not really "working". The additional layers seem to push the hidden representations into new areas, which may or may not be interesting, depending on

→ View original post on X: @maximelabonne
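
To make the point about hidden representations a bit more concrete, here is a minimal toy sketch. It assumes the "additional layers" are existing layers repeated in place with no retraining, which the post does not state explicitly. Everything in it (the tiny residual layers, the `passthrough_merge` helper, the dimensions) is hypothetical and for illustration only; it is not the author's setup or any library's API.

```python
import math
import random


def make_layer(dim, rng):
    """Random dim x dim weight matrix for one toy residual layer (hypothetical)."""
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_layer(weights, h):
    """Residual update: h + 0.5 * tanh(W h)."""
    return [
        h_i + 0.5 * math.tanh(sum(w * x for w, x in zip(row, h)))
        for row, h_i in zip(weights, h)
    ]


def run_stack(layers, h):
    """Pass the hidden state through every layer in order."""
    for weights in layers:
        h = apply_layer(weights, h)
    return h


def passthrough_merge(layers, start, end):
    """Repeat layers[start:end] right after itself, with no retraining."""
    return layers[:end] + layers[start:end] + layers[end:]


def cosine(a, b):
    """Cosine similarity between two hidden-state vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
dim = 8
base_layers = [make_layer(dim, rng) for _ in range(6)]

# Assumption: the extra layers are copies of layers 2 and 3 of the same stack.
merged_layers = passthrough_merge(base_layers, 2, 4)

x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
base_final = run_stack(base_layers, x)
merged_final = run_stack(merged_layers, x)

print("layers:", len(base_layers), "->", len(merged_layers))
print("cosine(base, merged):", cosine(base_final, merged_final))
```

The later layers in the merged stack receive inputs they never saw in the original ordering, so the final hidden state drifts away from the base one. Whether that drift is useful is exactly the open question in the post, and a toy like this says nothing about real model quality.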