AI Dynamics

Global AI News Aggregator

Mechanistic Interpretability: Noble But Disconnected from AI Safety

I also think the subfield of mechanistic interpretability is very cool. It's noble, challenging, and interesting all at once. I just struggle to see how it connects to the broader goal (building AI systems that don't kill us), or at least why it's a top priority.

→ View original post on X — @jxmnop