
The Computer Use Agent

🖥️ Computer Use (CUA) · 15 min · 300 BASE XP

AI That Operates Your Computer

The Computer Use Agent (CUA) lets an AI model operate software through its graphical interface using a screenshot-action loop: the model views a screenshot, decides what to click, type, or scroll, and the action is executed in a virtual environment.

How CUA Works

  1. Screenshot: Capture the current screen state
  2. Reasoning: The model analyzes the screenshot and decides the next action
  3. Action: Execute the action (click, type, scroll, drag)
  4. Repeat: Capture a new screenshot and continue until the task is complete
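
The four steps above can be sketched as a small control loop. This is a minimal illustration, not the SDK's API: `Action`, `SandboxEnv`, `Policy`, `runLoop`, and `FakeEnv` are hypothetical names, the model is replaced by a scripted function, and everything is kept synchronous for brevity (real screenshot capture and model calls would be asynchronous).

```typescript
// Hypothetical types for illustration; not part of any SDK.
type Action =
  | { type: "click"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "done" };

interface SandboxEnv {
  screenshot(): string;          // e.g. a base64-encoded PNG
  execute(action: Action): void; // performs the action inside the sandbox
}

// Stand-in for the model: given a screenshot, return the next action.
type Policy = (screenshot: string) => Action;

function runLoop(env: SandboxEnv, decide: Policy, maxSteps = 20): Action[] {
  const performed: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const shot = env.screenshot();      // 1. Screenshot
    const action = decide(shot);        // 2. Reasoning
    if (action.type === "done") break;  // task complete
    env.execute(action);                // 3. Action
    performed.push(action);             // 4. Repeat with a fresh screenshot
  }
  return performed;
}

// Fake environment whose "screen" only encodes how many actions have run.
class FakeEnv implements SandboxEnv {
  executed = 0;
  screenshot(): string {
    return `frame-${this.executed}`;
  }
  execute(_action: Action): void {
    this.executed += 1;
  }
}

// Scripted policy: click, type, then stop.
const script: Action[] = [
  { type: "click", x: 100, y: 200 },
  { type: "type", text: "hello" },
  { type: "done" },
];
const scriptedPolicy: Policy = (shot) =>
  script[Number(shot.split("-")[1])] ?? { type: "done" };

const fakeEnv = new FakeEnv();
const performedActions = runLoop(fakeEnv, scriptedPolicy);
console.log(performedActions.map((a) => a.type).join(", "));
```

The `maxSteps` cap is a design choice in this sketch: it keeps a confused model from looping forever.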

Supported Actions

| Action     | Description                 | Example                         |
|------------|-----------------------------|---------------------------------|
| click      | Click at coordinates (x, y) | Click "Submit" button           |
| type       | Type text into focused field| Enter email address             |
| scroll     | Scroll in a direction       | Scroll down to see more results |
| keypress   | Press keyboard shortcuts    | Ctrl+S to save                  |
| screenshot | Capture current state       | Observe changes after action    |

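
One way to model the actions in the table is a discriminated union with an exhaustive `switch`. The field names below (`x`, `y`, `text`, `dx`, `dy`, `keys`) and the helpers `describeAction` and `isWithinDisplay` are assumptions made for this sketch; the payloads the API actually returns may be shaped differently.

```typescript
// Hypothetical action shapes derived from the table above.
type CuaAction =
  | { type: "click"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "scroll"; dx: number; dy: number }
  | { type: "keypress"; keys: string[] }
  | { type: "screenshot" };

// Produces a human-readable log line for each action type.
function describeAction(action: CuaAction): string {
  switch (action.type) {
    case "click":
      return `click at (${action.x}, ${action.y})`;
    case "type":
      return `type ${JSON.stringify(action.text)}`;
    case "scroll":
      return `scroll by (${action.dx}, ${action.dy})`;
    case "keypress":
      return `press ${action.keys.join("+")}`;
    case "screenshot":
      return "capture screenshot";
    default: {
      // Compile-time check that every action type is handled.
      const unreachable: never = action;
      throw new Error(`unhandled action: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Rejects clicks that fall outside the declared display size.
function isWithinDisplay(action: CuaAction, width: number, height: number): boolean {
  if (action.type !== "click") return true;
  return action.x >= 0 && action.x < width && action.y >= 0 && action.y < height;
}

console.log(describeAction({ type: "keypress", keys: ["Ctrl", "S"] })); // press Ctrl+S
console.log(isWithinDisplay({ type: "click", x: 2000, y: 10 }, 1024, 768)); // false
```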
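
A request that enables the computer-use tool:

import OpenAI from "openai";

// Reads the API key from the OPENAI_API_KEY environment variable
const openai = new OpenAI();

const response = await openai.responses.create({
  model: "computer-use-preview",
  tools: [{
    type: "computer_use_preview",
    display_width: 1024,   // size of the virtual display, in pixels
    display_height: 768,
    environment: "browser"
  }],
  input: "Go to Hacker News and find today's top story",
  truncation: "auto"       // required when using the computer-use tool
});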
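
The request above only starts the loop: the response asks for an action, and your code has to execute it and send back a new screenshot. The sketch below shows one way to pick the requested action out of the response and build the follow-up input item. The item shapes (`computer_call`, `call_id`, `computer_call_output`, `computer_screenshot`) reflect my understanding of the Responses API and should be checked against the current API reference; `findComputerCall` and `buildScreenshotReply` are hypothetical helper names, and the sample data is hand-written.

```typescript
// Assumed shape of a computer_call item in response.output (verify against the API reference).
interface ComputerCall {
  type: "computer_call";
  call_id: string;
  action: { type: string; [key: string]: unknown };
}

// Type guard: narrows a generic output item to a computer_call by its type tag.
function isComputerCall(item: { type: string }): item is ComputerCall {
  return item.type === "computer_call";
}

// Returns the first requested action, or undefined when the model requested none.
function findComputerCall(output: Array<{ type: string }>): ComputerCall | undefined {
  return output.find(isComputerCall);
}

// Builds the input item that reports the post-action screenshot back to the model.
function buildScreenshotReply(call: ComputerCall, base64Png: string) {
  return {
    type: "computer_call_output" as const,
    call_id: call.call_id,
    output: {
      type: "computer_screenshot" as const, // assumed field value
      image_url: `data:image/png;base64,${base64Png}`,
    },
  };
}

// Hand-written sample standing in for response.output.
const sampleOutput = [
  { type: "reasoning", id: "rs_1" },
  { type: "computer_call", call_id: "call_1", action: { type: "click", x: 120, y: 80 } },
];

const pendingCall = findComputerCall(sampleOutput);
if (pendingCall) {
  // In a real loop: execute pendingCall.action in the sandbox, then capture a screenshot.
  const replyItem = buildScreenshotReply(pendingCall, "iVBORw0KGgo=");
  console.log(pendingCall.action.type, replyItem.call_id);
}
```

In a full loop, the reply item would go into the `input` array of the next `responses.create` call, and the loop would stop once no `computer_call` item comes back.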

⚠️ Safety Warning: Always run CUA in a sandboxed environment (a Docker container, a VM, or a cloud sandbox). Never give CUA access to your actual desktop: it can click on anything, including system settings or sensitive applications.

SYNAPSE VERIFICATION
QUERY 1 // 3
How does the Computer Use Agent interact with software?
Through APIs
Via a screenshot-action loop: view screen → decide action → execute → repeat
By reading source code
Through keyboard macros only
The Computer Use Agent | Computer Use (CUA) — OpenAI Academy