AI Dynamics

Global AI News Aggregator

Anthropic Interpretability Paper on Scaling Monosemanticity

The new interpretability paper from Anthropic is totally based. Feels like analyzing an alien life form. If you only read one 90-min-read paper today, it has to be this one https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

→ View original post on X — @thom_wolf