Cua Documentation

Quickstart

Get started with Cua in three steps

This quickstart guides you through setting up your computer environment, controlling it programmatically with a Cua computer, and automating tasks with a Cua agent.

Set Up Your Computer Environment

Choose how you want to run your Cua computer. This is the environment in which your automated tasks will execute.

You can run your Cua computer in the cloud (recommended for easiest setup), locally on macOS with Lume, locally on Windows with a Windows Sandbox, or in a Docker container on any platform. Choose the option that matches your system and needs.

Cua Cloud Sandbox provides virtual machines that run Ubuntu.

  1. Go to trycua.com/signin
  2. Navigate to Dashboard > Containers > Create Instance
  3. Create a Medium, Ubuntu 22 sandbox
  4. Note your sandbox name and API key

Your Cloud Sandbox will be automatically configured and ready to use.

Lume containers are macOS virtual machines that run on a macOS host machine.

  1. Install the Lume CLI:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
  2. Start a local Cua sandbox:
lume run macos-sequoia-cua:latest

Windows Sandbox provides Windows virtual environments that run on a Windows host machine.

  1. Enable Windows Sandbox (requires Windows 10 Pro/Enterprise or Windows 11)
  2. Install the pywinsandbox dependency:
pip install -U git+https://github.com/karkason/pywinsandbox.git
  3. Windows Sandbox will be automatically configured when you run the CLI

Docker provides a way to run Ubuntu containers on any host machine.

  1. Install Docker Desktop or Docker Engine.

  2. Pull the Cua Ubuntu sandbox image:

docker pull --platform=linux/amd64 trycua/cua-ubuntu:latest
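
Before moving on, it can help to confirm that the command-line tool for your chosen local option is actually installed. The sketch below is only an illustration: it assumes the install steps above placed `lume` or `docker` on your PATH, and the helper name `missing_tools` is hypothetical, not part of any Cua package.

```python
import shutil

# CLI that each local option relies on, taken from the setup steps above.
# Cloud Sandbox and Windows Sandbox are omitted: no CLI is checked for them.
REQUIRED_CLI = {"lume": "lume", "docker": "docker"}


def missing_tools(option, which=shutil.which):
    """Return the CLI names that `option` needs but that are not on PATH."""
    tool = REQUIRED_CLI.get(option)
    if tool is None:
        return []  # nothing to check for this option
    return [] if which(tool) else [tool]


if __name__ == "__main__":
    for option in REQUIRED_CLI:
        missing = missing_tools(option)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{option}: {status}")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it uses `shutil.which` from the standard library.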

Using Computer

Connect to your Cua computer and perform basic interactions, such as taking screenshots or simulating user input.

Install the Cua computer Python SDK:

pip install cua-computer

Then, connect to your desired computer environment. Use the snippet that matches the option you chose in the previous step.

Cloud Sandbox:

from computer import Computer

computer = Computer(
    os_type="linux",
    provider_type="cloud",
    name="your-container-name",
    api_key="your-api-key"
)
await computer.run() # Connect to the sandbox

Lume:

from computer import Computer

computer = Computer(
    os_type="macos",
    provider_type="lume",
    name="macos-sequoia-cua:latest"
)
await computer.run() # Launch & connect to the container

Windows Sandbox:

from computer import Computer

computer = Computer(
    os_type="windows",
    provider_type="windows_sandbox"
)
await computer.run() # Launch & connect to the container

Docker:

from computer import Computer

computer = Computer(
    os_type="linux",
    provider_type="docker",
    name="trycua/cua-ubuntu:latest"
)
await computer.run() # Launch & connect to the container

To control your host desktop instead of a sandbox, install and run cua-computer-server:

pip install cua-computer-server
python -m computer_server

Then, use the Computer object to connect:

from computer import Computer

computer = Computer(use_host_computer_server=True)
await computer.run() # Connect to the host desktop
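
The sandbox connection snippets above differ only in their keyword arguments, so one way to switch between them is a small lookup table. This is a minimal sketch, not part of the SDK: the helper `computer_kwargs` is hypothetical, and the environment variable names `CUA_SANDBOX_NAME` and `CUA_API_KEY` are assumptions chosen for illustration so that the sandbox name and API key are not hardcoded.

```python
import os

# Keyword arguments copied from the connection snippets above.
ENVIRONMENTS = {
    "cloud": {"os_type": "linux", "provider_type": "cloud"},
    "lume": {
        "os_type": "macos",
        "provider_type": "lume",
        "name": "macos-sequoia-cua:latest",
    },
    "windows_sandbox": {"os_type": "windows", "provider_type": "windows_sandbox"},
    "docker": {
        "os_type": "linux",
        "provider_type": "docker",
        "name": "trycua/cua-ubuntu:latest",
    },
}


def computer_kwargs(environment, env=None):
    """Return the Computer(...) keyword arguments for one environment."""
    env = os.environ if env is None else env
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    kwargs = dict(ENVIRONMENTS[environment])  # copy so the table stays unchanged
    if environment == "cloud":
        # Hypothetical variable names; pick whatever fits your setup.
        for key, var in (("name", "CUA_SANDBOX_NAME"), ("api_key", "CUA_API_KEY")):
            if var not in env:
                raise KeyError(f"set {var} before connecting to a Cloud Sandbox")
            kwargs[key] = env[var]
    return kwargs
```

With the SDK installed, `Computer(**computer_kwargs("docker"))` would pass the same arguments as the Docker snippet.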

Once connected, you can perform interactions:

try:
    # Take a screenshot of the computer's current display
    screenshot = await computer.interface.screenshot()
    # Simulate a left-click at coordinates (100, 100)
    await computer.interface.left_click(100, 100)
    # Type "Hello!" into the active application
    await computer.interface.type("Hello!")
finally:
    await computer.close()
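
The snippets above use `await` at the top level, so in a regular script they need to run inside a coroutine. The sketch below wraps the connect, interact, close sequence in one coroutine and writes the screenshot to disk. It assumes that `screenshot()` returns raw image bytes, which this page does not state; `capture` and `save_screenshot` are hypothetical helper names.

```python
from pathlib import Path


def save_screenshot(data, path):
    """Write screenshot bytes to `path` and return it as a Path."""
    # Assumption: the SDK hands back raw image bytes.
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError(f"expected bytes, got {type(data).__name__}")
    target = Path(path)
    target.write_bytes(data)
    return target


async def capture(computer, path):
    """Connect, take one screenshot, save it, and always close."""
    await computer.run()
    try:
        data = await computer.interface.screenshot()
        return save_screenshot(data, path)
    finally:
        # Runs even if the screenshot or the write fails.
        await computer.close()
```

From a script, this would be started with `asyncio.run(capture(computer, "screen.png"))`, where `computer` is one of the `Computer(...)` objects constructed earlier.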

To use TypeScript instead, install the Cua computer TypeScript SDK:

npm install @trycua/computer

Then, connect to your desired computer environment. Use the snippet that matches the option you chose in the first step.

Cloud Sandbox:

import { Computer, OSType } from '@trycua/computer';

const computer = new Computer({
  osType: OSType.LINUX,
  name: "your-container-name",
  apiKey: "your-api-key"
});
await computer.run(); // Connect to the sandbox

Lume:

import { Computer, OSType, ProviderType } from '@trycua/computer';

const computer = new Computer({
  osType: OSType.MACOS,
  providerType: ProviderType.LUME,
  name: "macos-sequoia-cua:latest"
});
await computer.run(); // Launch & connect to the container

Windows Sandbox:

import { Computer, OSType, ProviderType } from '@trycua/computer';

const computer = new Computer({
  osType: OSType.WINDOWS,
  providerType: ProviderType.WINDOWS_SANDBOX
});
await computer.run(); // Launch & connect to the container

Docker:

import { Computer, OSType, ProviderType } from '@trycua/computer';

const computer = new Computer({
  osType: OSType.LINUX,
  providerType: ProviderType.DOCKER,
  name: "trycua/cua-ubuntu:latest"
});
await computer.run(); // Launch & connect to the container

To control your host desktop instead of a sandbox, first install and run cua-computer-server:

pip install cua-computer-server
python -m computer_server

Then, use the Computer object to connect:

import { Computer } from '@trycua/computer';

const computer = new Computer({ useHostComputerServer: true });
await computer.run(); // Connect to the host desktop

Once connected, you can perform interactions:

try {
  // Take a screenshot of the computer's current display
  const screenshot = await computer.interface.screenshot();
  // Simulate a left-click at coordinates (100, 100)
  await computer.interface.leftClick(100, 100);
  // Type "Hello!" into the active application
  await computer.interface.typeText("Hello!");
} finally {
  await computer.close();
}

Learn more about computers in the Cua computers documentation. The next step shows how to automate computers with agents.

Using Agent

Use an agent to automate complex tasks: give it a goal and let it interact with the computer environment.

Install the Cua agent Python SDK:

pip install "cua-agent[all]"

Then, create a ComputerAgent and pass it the connected computer from the previous step as a tool:

from agent import ComputerAgent

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[computer],
    max_trajectory_budget=5.0
)

messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]

async for result in agent.run(messages):
    for item in result["output"]:
        if item["type"] == "message":
            print(item["content"][0]["text"])
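
If you want the agent's replies as data rather than printed output, the loop above can be folded into a helper. This is a minimal sketch that relies only on the result shape shown in the snippet (`result["output"]`, items with `"type"` and `"content"`); the helper names `message_texts` and `run_task` are hypothetical. Unlike the loop above, it collects every content part that carries a `"text"` field, not just the first.

```python
def message_texts(result):
    """Return the text of every message item in one agent result."""
    texts = []
    for item in result.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items
        for part in item.get("content", []):
            if isinstance(part, dict) and "text" in part:
                texts.append(part["text"])
    return texts


async def run_task(agent, task):
    """Send one user task to the agent and collect all message texts."""
    messages = [{"role": "user", "content": task}]
    texts = []
    async for result in agent.run(messages):
        texts.extend(message_texts(result))
    return texts
```

Inside a coroutine, `texts = await run_task(agent, "Take a screenshot and tell me what you see")` would return the same strings the loop above prints.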

Learn more about agents in Agent Loops and available models in Supported Models.

Next Steps