What is a modality or a multimodal model? A multimodal model is a tongue twister. GPT4 is OpenAI's first multimodal model, which just means that it can take multiple modalities, or multiple inputs like text, and image in this case, and in the future, other modalities (eg audio).
Leave a Reply