AI Dynamics

Global AI News Aggregator

GPT-4 Vision Capabilities: Multimodal Reasoning and Image Generation

I haven't even touched on the vision capabilities possible in GPT-4. Some obvious ones:
- multimodal reasoning and action (e.g. the MM-REACT paper)
- better image generation (imagine GPT-4 iterating on your Midjourney prompts)
- few-shot prompting and learning through images/videos

→ View original post on X (@alexalbert__)
