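The Computer-Using Agent (CUA) lets AI models operate software through its graphical interface using a screenshot-action loop: the model views a screenshot, decides where to click, what to type, or how to scroll, and the action is executed in a virtual environment. A fresh screenshot is then captured and the cycle repeats until the task is complete.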
| Action | Description | Example |
|---|---|---|
| click | Click at coordinates (x, y) | Click "Submit" button |
| type | Type text into focused field | Enter email address |
| scroll | Scroll in a direction | Scroll down to see more results |
| keypress | Press keyboard shortcuts | Ctrl+S to save |
| screenshot | Capture current state | Observe changes after action |
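
The actions in the table map naturally onto a discriminated union. The sketch below is illustrative only: the field names (`x`, `y`, `text`, `direction`, `amount`, `keys`) and the `describeAction` helper are assumptions made for this example, not a confirmed API schema.

```typescript
// Hypothetical action shapes modeled on the table above.
// Field names are illustrative assumptions, not a confirmed API schema.
type ComputerAction =
  | { type: "click"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "scroll"; direction: "up" | "down" | "left" | "right"; amount: number }
  | { type: "keypress"; keys: string[] }
  | { type: "screenshot" };

// Turn an action into a human-readable log line.
// The switch is exhaustive: adding a variant to ComputerAction without
// handling it here makes the `never` assignment fail to compile.
function describeAction(action: ComputerAction): string {
  switch (action.type) {
    case "click":
      return `click at (${action.x}, ${action.y})`;
    case "type":
      return `type "${action.text}"`;
    case "scroll":
      return `scroll ${action.direction} by ${action.amount}px`;
    case "keypress":
      return `press ${action.keys.join("+")}`;
    case "screenshot":
      return "capture screenshot";
    default: {
      const unreachable: never = action;
      throw new Error(`unknown action: ${JSON.stringify(unreachable)}`);
    }
  }
}

console.log(describeAction({ type: "keypress", keys: ["Ctrl", "S"] }));
// prints "press Ctrl+S"
```

Logging each action this way before executing it gives a readable audit trail of what the model asked for.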
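
A minimal request that enables the tool:

```typescript
import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();

const response = await openai.responses.create({
  model: "computer-use-preview",
  tools: [{
    type: "computer_use_preview",
    display_width: 1024,
    display_height: 768,
    environment: "browser"
  }],
  input: "Go to Hacker News and find today's top story",
  // Requests that use the computer use tool must set truncation to "auto".
  truncation: "auto"
});
```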
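
The screenshot-action loop itself can be sketched without any network calls. Everything below is a stand-in invented for illustration: `AgentEnvironment`, `Policy`, `runLoop`, and `FakeBrowser` are hypothetical names, the model is replaced by a scripted function, and the loop is synchronous for brevity, whereas a real integration would await both the model and the environment.

```typescript
// Hypothetical stand-ins: a real setup would call the model API and drive a
// browser or VM. Both are faked here so the control flow runs offline.
type AgentAction = { type: string; [key: string]: unknown };

interface AgentEnvironment {
  screenshot(): string;               // opaque image handle, e.g. a base64 string
  execute(action: AgentAction): void; // performs the click/type/scroll
}

// Given the latest screenshot, return the next action, or null when done.
type Policy = (screenshot: string) => AgentAction | null;

function runLoop(env: AgentEnvironment, policy: Policy, maxSteps = 10): AgentAction[] {
  const performed: AgentAction[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const shot = env.screenshot();   // 1. observe the current state
    const action = policy(shot);     // 2. let the "model" decide
    if (action === null) break;      //    stop when it reports completion
    env.execute(action);             // 3. act, then loop back to observe
    performed.push(action);
  }
  return performed;
}

// Fake environment whose "screenshot" is just a frame counter.
class FakeBrowser implements AgentEnvironment {
  private state = 0;
  screenshot(): string {
    return `frame-${this.state}`;
  }
  execute(_action: AgentAction): void {
    this.state += 1;
  }
}

// Scripted policy: replays a fixed list of actions, one per frame.
const scripted: AgentAction[] = [
  { type: "click", x: 120, y: 48 },
  { type: "type", text: "hello" },
];
const scriptedPolicy: Policy = (shot) => {
  const index = Number(shot.split("-")[1]);
  return scripted[index] ?? null;
};

const performedActions = runLoop(new FakeBrowser(), scriptedPolicy);
console.log(performedActions.map((a) => a.type).join(", "));
// prints "click, type"
```

The `maxSteps` cap is a deliberate safeguard in this sketch: it prevents a policy that never signals completion from looping forever.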