All‑in‑one CUA Models
Models that support full computer-use agent capabilities with ComputerAgent.run()
These models support complete computer-use agent functionality through ComputerAgent.run(). They can understand natural language instructions and autonomously perform sequences of actions to complete tasks.
All agent loops are compatible with any LLM provider supported by LiteLLM.
See Running Models Locally for how to use Hugging Face and MLX models on your own machine.
Anthropic CUAs
Claude models with computer-use capabilities:
- Claude 4.5: claude-sonnet-4-5-20250929
- Claude 4.1: claude-opus-4-1-20250805
- Claude 4: claude-opus-4-20250514, claude-sonnet-4-20250514
- Claude 3.7: claude-3-7-sonnet-20250219
- Claude 3.5: claude-3-5-sonnet-20241022
agent = ComputerAgent("claude-3-5-sonnet-20241022", tools=[computer])
async for _ in agent.run("Open Firefox and navigate to github.com"):
    pass
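The snippets on this page use `async for` directly, which only works inside a coroutine. The following is a minimal sketch of one way to wrap that loop, not the library's prescribed usage. `StubAgent` and `run_task` are hypothetical names introduced here so the example runs without the library installed, and the dictionaries the stub yields are invented; the real output of `ComputerAgent.run()` may look different.

```python
import asyncio
from typing import Any, AsyncIterator


class StubAgent:
    """Hypothetical stand-in for ComputerAgent; not part of the library."""

    def __init__(self, model: str) -> None:
        self.model = model

    async def run(self, task: str) -> AsyncIterator[dict[str, Any]]:
        # Yield two made-up results; the real output format is an assumption.
        for step in ("screenshot", "click"):
            await asyncio.sleep(0)  # hand control back to the event loop
            yield {"model": self.model, "task": task, "step": step}


async def run_task(agent: Any, task: str) -> list[Any]:
    """Drive agent.run(task) to completion and keep every yielded item."""
    results = []
    async for result in agent.run(task):
        results.append(result)
    return results


def main() -> list[Any]:
    agent = StubAgent("claude-3-5-sonnet-20241022")
    return asyncio.run(run_task(agent, "Open Firefox and navigate to github.com"))


if __name__ == "__main__":
    for item in main():
        print(item["step"])
```

Collecting the yielded items instead of discarding them with `pass` makes it easier to inspect what the agent did after the task finishes.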
OpenAI CUA Preview
OpenAI's computer-use preview model:
- Computer-use-preview: computer-use-preview
agent = ComputerAgent("openai/computer-use-preview", tools=[computer])
async for _ in agent.run("Take a screenshot and describe what you see"):
    pass
GLM-4.5V
Zhipu AI's GLM-4.5V vision-language model with computer-use capabilities:
openrouter/z-ai/glm-4.5v
huggingface-local/zai-org/GLM-4.5V
agent = ComputerAgent("openrouter/z-ai/glm-4.5v", tools=[computer])
async for _ in agent.run("Click on the search bar and type 'hello world'"):
    pass
InternVL 3.5
InternVL 3.5 family:
huggingface-local/OpenGVLab/InternVL3_5-{1B,2B,4B,8B,...}
agent = ComputerAgent("huggingface-local/OpenGVLab/InternVL3_5-1B", tools=[computer])
async for _ in agent.run("Open Firefox and navigate to github.com"):
    pass
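The brace pattern in the InternVL 3.5 model name is shorthand for one model string per size. As a small sketch, the four sizes named on this page can be expanded as follows; the trailing `...` is left unexpanded because the page does not name the remaining sizes. `internvl_model_strings` is a hypothetical helper written for this illustration.

```python
# Sizes explicitly listed on this page; others exist but are not named here.
LISTED_SIZES = ["1B", "2B", "4B", "8B"]
PREFIX = "huggingface-local/OpenGVLab/InternVL3_5-"


def internvl_model_strings(sizes: list[str] = LISTED_SIZES) -> list[str]:
    """Build one model string per size by appending it to the shared prefix."""
    return [PREFIX + size for size in sizes]


if __name__ == "__main__":
    for model in internvl_model_strings():
        print(model)
```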
UI-TARS 1.5
Unified vision-language model for computer-use:
huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B
huggingface/ByteDance-Seed/UI-TARS-1.5-7B (requires TGI endpoint)
agent = ComputerAgent("huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B", tools=[computer])
async for _ in agent.run("Open the settings menu and change the theme to dark mode"):
    pass
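Most model strings on this page follow a `provider/model-name` layout, where the first path segment (`openai`, `openrouter`, `huggingface-local`, `huggingface`) selects the provider and the rest names the model. The sketch below illustrates that layout only; `ModelRef` and `split_model_string` are hypothetical names, and this is not how the library or LiteLLM actually parses the string.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    provider: Optional[str]  # e.g. "openrouter"; None when no prefix is given
    name: str                # everything after the first "/"


def split_model_string(model: str) -> ModelRef:
    """Split "provider/name" at the first slash; illustrative only."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No slash at all, as in "claude-3-5-sonnet-20241022".
        return ModelRef(None, model)
    return ModelRef(provider, name)


if __name__ == "__main__":
    for s in ("openrouter/z-ai/glm-4.5v", "claude-3-5-sonnet-20241022"):
        print(split_model_string(s))
```

Splitting only at the first slash keeps organization-qualified names such as `ByteDance-Seed/UI-TARS-1.5-7B` intact.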
CUAs also support direct click prediction. See Grounding Models for details on predict_click().
For details on agent loop behavior and usage, see Agent Loops.