new paper! 🫡 why are state space models (SSMs) worse than Transformers at recall over their context? this is a question about the mechanisms underlying model behaviour, so we propose using mechanistic evaluations to answer it!
State Space Models vs Transformers: Mechanistic Evaluation of Recall
