For a deep dive into how CUA is trained, including its pixel-level vision, chain-of-thought reasoning, the actions it can perform (typing, clicking, scrolling), and how it stays safe, see our blog post:
Deep Dive into CUA Training: Vision, Reasoning, Actions, and Safety