These features are remarkably abstract, often representing the same concept across contexts and languages, even generalizing to image inputs. Importantly, they also causally influence the model’s outputs in intuitive ways.
Abstract Features in AI Models Causally Influence Outputs
By
–