What is Cua?

Learn about Cua, the open-source platform for building computer-use agents

Cua is an open-source platform for building, deploying, and managing computer-use agents. It provides the sandboxes, SDK, and agent tooling you need to create AI agents that interact with desktop environments the way a human would: clicking, typing, navigating, and completing tasks autonomously.

The Cua Stack

Cua consists of three main components:

1. Desktop Sandboxes

Isolated virtual environments where your agents can safely execute tasks. Cua supports:

  • Cloud Sandboxes - Managed Linux, Windows, and macOS environments hosted by Cua
  • Local Sandboxes - Docker containers, QEMU VMs, macOS VMs via Lume, or Windows Sandbox on your own machine

2. Computer Framework

A unified SDK for controlling desktop environments programmatically:

  • Take screenshots and observe the screen
  • Simulate mouse clicks, movements, and scrolling
  • Type text and press keyboard shortcuts
  • Run the same code against every sandbox type
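
To illustrate what "a unified SDK" means in practice, here is a minimal sketch of automation code written once against a single interface and run against two interchangeable backends. Every name in it (`DesktopInterface`, `FakeSandbox`, `fill_search_box`) is hypothetical and chosen for illustration; this is not Cua's actual API, and the fake sandbox only records actions rather than driving a real desktop.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class DesktopInterface(Protocol):
    """Hypothetical interface covering the operations listed above (not Cua's real API)."""

    def screenshot(self) -> bytes: ...
    def click(self, x: int, y: int) -> None: ...
    def type_text(self, text: str) -> None: ...


@dataclass
class FakeSandbox:
    """In-memory stand-in for a sandbox; it records actions instead of performing them."""

    name: str
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> bytes:
        self.log.append("screenshot")
        return b""  # a real backend would return image bytes

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def fill_search_box(desktop: DesktopInterface, query: str) -> None:
    """Automation written against the interface, not against a specific backend."""
    desktop.screenshot()
    desktop.click(200, 40)  # coordinates are arbitrary for this sketch
    desktop.type_text(query)


# Two stand-ins for different sandbox types receive the exact same calls.
docker_like = FakeSandbox("local-docker")
cloud_like = FakeSandbox("cloud-linux")
for sandbox in (docker_like, cloud_like):
    fill_search_box(sandbox, "hello")
```

The design point is that `fill_search_box` never checks which backend it is talking to, so swapping a local sandbox for a cloud one would not require changes to the automation logic.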

3. Agent Framework

Tools for building AI agents that understand and interact with GUIs:

  • 100+ vision-language model options through Cua VLM Router or direct provider access
  • Pre-built agent loops optimized for computer-use tasks
  • Composable architecture for combining grounding and planning models
  • Built-in telemetry for monitoring agent performance
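
The idea of an agent loop that composes a planning model with a grounding model can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not Cua's implementation: `Step`, `run_agent_loop`, `scripted_planner`, and `lookup_grounder` are hypothetical names, and the "models" are simple stubs standing in for real vision-language models.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One planner decision: click a described element, or stop."""

    kind: str                     # "click" or "done"
    target: Optional[str] = None  # natural-language element description


# Hypothetical roles, assumed for illustration (not Cua types):
# the planner decides WHAT to do next; the grounder decides WHERE on screen.
Planner = Callable[[str, bytes, List[str]], Step]
Grounder = Callable[[bytes, str], Tuple[int, int]]


def run_agent_loop(
    task: str,
    observe: Callable[[], bytes],
    planner: Planner,
    grounder: Grounder,
    max_steps: int = 10,
) -> List[str]:
    """Observe, plan, ground, act; repeat until the planner says it is done."""
    history: List[str] = []
    for _ in range(max_steps):
        screen = observe()                     # e.g. a screenshot
        step = planner(task, screen, history)
        if step.kind == "done":
            break
        x, y = grounder(screen, step.target)   # description -> coordinates
        history.append(f"click {step.target} at ({x}, {y})")  # stand-in for acting
    return history


def scripted_planner(task: str, screen: bytes, history: List[str]) -> Step:
    """Stub planner that walks through a fixed plan, one element per step."""
    plan = ["File menu", "Save"]
    if len(history) < len(plan):
        return Step("click", plan[len(history)])
    return Step("done")


def lookup_grounder(screen: bytes, target: str) -> Tuple[int, int]:
    """Stub grounder that maps element descriptions to fixed coordinates."""
    return {"File menu": (10, 5), "Save": (10, 60)}[target]


trace = run_agent_loop("save the document", lambda: b"", scripted_planner, lookup_grounder)
```

Because the planner and grounder are passed in as separate callables, either one could be replaced independently, which is the sense of "composable" this sketch tries to capture.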

Why Cua?

Traditional automation tools require you to write explicit scripts for every action. Cua agents, by contrast, can:

  • Understand visual context - See what's on screen and make decisions
  • Adapt to changes - Handle UI variations without breaking
  • Complete complex tasks - Chain multiple actions to achieve goals
  • Work across applications - Navigate between different programs seamlessly

Use Cases

Cua is ideal for:

  • Workflow automation - Automate repetitive tasks across any application
  • Testing - Run end-to-end tests that interact with real UIs
  • Data extraction - Scrape information from applications that don't have APIs
  • RPA (Robotic Process Automation) - Enterprise automation in which agents interpret the UI visually
  • Benchmarks - Evaluate computer-use agents on standardized tasks (OSWorld, ScreenSpot, etc.)
  • RL/Gym Environments - Train reinforcement learning agents on desktop environments exposed as Gym-compatible environments
  • Research - Build and evaluate computer-use AI agents

Getting Started

Ready to build your first agent? Continue to Set Up a Sandbox to prepare your environment and run your first automation.
