AI Dynamics

Global AI News Aggregator

LLMs as CPUs: Understanding Agent Harness Infrastructure

A raw LLM is like a CPU without an OS: it can compute, but it can't do anything useful on its own. This analogy is the clearest way I've found to understand what an agent harness actually does.

Here's the mapping:

• 𝗖𝗣𝗨 → 𝗟𝗟𝗠 (model weights). The raw compute engine. Powerful, but useless without infrastructure around it.
• 𝗥𝗔𝗠 → 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝘄𝗶𝗻𝗱𝗼𝘄. Fast and always available, but limited. When it fills up, you start losing things.
• 𝗛𝗮𝗿𝗱 𝗱𝗶𝘀𝗸 → 𝗩𝗲𝗰𝘁𝗼𝗿 𝗗𝗕 / 𝗹𝗼𝗻𝗴-𝘁𝗲𝗿𝗺 𝘀𝘁𝗼𝗿𝗮𝗴𝗲. Large capacity, but slow to access. You retrieve from it, you don't compute in it.
• 𝗗𝗲𝘃𝗶𝗰𝗲 𝗱𝗿𝗶𝘃𝗲𝗿𝘀 → 𝗧𝗼𝗼𝗹 𝗶𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻𝘀. The interfaces that let the model interact with the outside world: code execution, web search, file I/O.
• 𝗢𝗽𝗲𝗿𝗮𝘁𝗶𝗻𝗴 𝘀𝘆𝘀𝘁𝗲𝗺 → 𝗔𝗴𝗲𝗻𝘁 𝗵𝗮𝗿𝗻𝗲𝘀𝘀. This is the key layer. It manages everything: which tools to call, what fits in memory, when to retrieve, how to recover from errors, and when to stop (see the code sketch after this post).

And then there's the 𝗮𝗽𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻 layer. That's the "agent" itself: not a piece of software you install, but emergent behavior that arises when the OS does its job well.

This is why two products using the exact same model can perform completely differently. LangChain changed only their harness infrastructure (same model, same weights) and jumped from outside the top 30 to rank 5 on TerminalBench 2.0. The model didn't improve. The operating system around it did.

The article below is a deep dive on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent.

Akshay 🚀 (@akshay_pachaar) x.com/i/article/204073208484… — https://nitter.net/akshay_pachaar/status/2041146899319971922#m

→ View original post on X — @akshay_pachaar, 2026-04-07 08:30 UTC
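To make the "operating system" layer concrete, here is a minimal sketch of what a harness's orchestration loop can look like. Everything in it (call_model, TOOLS, MAX_CONTEXT_TOKENS, the retrieval and eviction stubs) is an illustrative placeholder, not any particular framework's API; real harnesses call an actual model endpoint and usually summarize old turns rather than dropping them.

```python
from typing import Callable

MAX_CONTEXT_TOKENS = 8000   # "RAM": the context window budget
MAX_STEPS = 10              # the harness, not the model, decides when to stop


def web_search(query: str) -> str:
    """Placeholder 'device driver': a tool the harness exposes to the model."""
    return f"(stub) top search results for: {query}"


def read_file(path: str) -> str:
    """Placeholder file I/O tool."""
    return f"(stub) contents of {path}"


TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "read_file": read_file,
}


def retrieve_long_term(query: str) -> str:
    """Stub for the 'hard disk': a vector DB / long-term store lookup."""
    return f"(stub) notes retrieved for: {query}"


def call_model(messages: list[dict]) -> dict:
    """Stub for the raw LLM (the 'CPU'). A real harness would call a model API here."""
    if not any(m["role"] == "tool" for m in messages):
        # Pretend the model asks for a tool on its first turn.
        return {"type": "tool_call", "name": "web_search",
                "arguments": messages[-1]["content"]}
    return {"type": "final", "content": "(stub) answer grounded in the tool results"}


def rough_token_count(messages: list[dict]) -> int:
    # Crude approximation: about 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4


def run_agent(task: str) -> str:
    """The 'operating system': orchestration loop, memory management, tools, stop logic."""
    messages = [
        {"role": "system", "content": retrieve_long_term(task)},  # pull from long-term storage
        {"role": "user", "content": task},
    ]
    for _ in range(MAX_STEPS):
        # Context management: if "RAM" overflows, evict old turns (real harnesses summarize).
        while rough_token_count(messages) > MAX_CONTEXT_TOKENS and len(messages) > 2:
            messages.pop(2)

        reply = call_model(messages)

        if reply["type"] == "tool_call":
            tool = TOOLS.get(reply["name"])
            if tool is None:
                # Error recovery: feed the failure back to the model instead of crashing.
                result = f"error: unknown tool {reply['name']}"
            else:
                result = tool(reply["arguments"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply["content"]  # the harness decides the task is finished
    return "stopped: step budget exhausted"


if __name__ == "__main__":
    print(run_agent("Summarize today's AI news."))
```

In a production harness each stub becomes real infrastructure (a model endpoint, sandboxed tools, a retrieval layer, smarter context compaction), but the division of labor stays the same: the model only predicts tokens, and everything else in the loop is the harness.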
