COAR works by estimating the contribution of each component to the model’s final output. These component attributions act as a counterfactual estimator, enabling targeted model edits w/o additional training.
COAR: Component Attribution for Targeted Model Edits
By
–