MAIA is built from a pre-trained vision-language model equipped with a set of commonly used vision interpretability tools for synthesizing and editing inputs, computing maximally activating exemplars from existing datasets, and summarizing and describing experimental results.
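To make one of these tools concrete, here is a minimal sketch of selecting maximally activating exemplars from an existing dataset. It is not MAIA's actual code: the function `top_k_exemplars`, the toy dataset `images`, and the stand-in unit `toy_unit` (which simply scores mean brightness) are all hypothetical names chosen for illustration.

```python
import heapq
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def top_k_exemplars(
    dataset: Sequence[T],
    activation: Callable[[T], float],
    k: int = 3,
) -> List[Tuple[float, T]]:
    """Return the k (score, item) pairs with the highest activation, best first."""
    # Score every item once, then keep only the k largest scores.
    scored = ((activation(item), item) for item in dataset)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Hypothetical toy data: each "image" is just a short list of pixel intensities.
images = {
    "dark": [0.1, 0.2],
    "mid": [0.5, 0.5],
    "bright": [0.9, 1.0],
}


def toy_unit(name: str) -> float:
    """Stand-in for a real unit's activation: mean pixel intensity (an assumption)."""
    pixels = images[name]
    return sum(pixels) / len(pixels)


result = top_k_exemplars(list(images), toy_unit, k=2)
for score, name in result:
    print(f"{name}: {score:.2f}")
```

In a real setting, `activation` would run the image through the network under study and read out the chosen unit, and the dataset would be far larger, but the selection step itself has the same shape.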
MAIA: Vision-Language Model with Interpretability Tools