AI Dynamics

Global AI News Aggregator

Why LLMs Process Text Instead of Raw Pixels

Interestingly the native and most general medium of existing infrastructure wrt I/O are screens and keyboard/mouse/touch. But pixels are computationally intractable atm, relatively speaking. So it's faster to adapt (textify/compress) the most useful ones so LLMs can act over them

→ View original post on X — @karpathy,

Commentaires

Leave a Reply

Your email address will not be published. Required fields are marked *