Crosscoders (published today: https://
transformer-circuits.pub/2024/crosscode
rs/index.html
…) are a new method allowing us to find features shared across different layers in a model, or even across different models. Identifying the same feature when it persists across layers can simplify our understanding of models.
Crosscoders: Finding Shared Features Across Model Layers
By
–
