Cua Docs

Drive your first app

Install Cua Driver, connect your agent, and ask it in plain English to drive a desktop app in the background on macOS, Windows, or Linux.

Beyond a few setup commands, you do not type Cua Driver CLI commands by hand. Cua Driver is a backend that an agent harness (Claude Code, Codex, Hermes, and others) drives for you. In this tutorial you install Cua Driver, connect your agent, and ask it in plain English to open a calculator and read a result back. The same prompt works on macOS, Windows, and Linux.

1. Install Cua Driver

Use the same one-line installer on every platform; it picks the right path for the host and needs no administrator access.

The macOS steps are shown here. Requires macOS 14 (Sonoma) or later.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)"

Start the daemon through the app bundle so macOS attributes permission prompts to CuaDriver.app:

open -n -g -a CuaDriver --args serve

Then grant the Accessibility and Screen Recording permissions:

cua-driver permissions grant

Full install steps, PATH setup, and permission details are in the install guide: see Install Cua Driver.
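
If you want to check the macOS version requirement before running the installer, a minimal sketch follows. The helper name macos_version_ok is hypothetical and not part of Cua Driver; sw_vers is the stock macOS tool that prints the OS version.

```shell
# Sketch: check the "macOS 14 (Sonoma) or later" requirement from this step.
# macos_version_ok is a hypothetical helper name, not a Cua Driver command.
macos_version_ok() {
  # $1 is a version string such as "14.5" or "13.6.7".
  major=${1%%.*}                      # keep only the part before the first dot
  [ "$major" -ge 14 ] 2>/dev/null     # non-numeric or empty input counts as "not OK"
}

# Usage on a Mac:
#   macos_version_ok "$(sw_vers -productVersion)" && echo "OK to install"
```

The function only compares the major version, which is all the stated requirement needs.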

2. Verify it is working

Before you hand control to an agent, confirm the driver can see your desktop. Run the cross-platform probe:

cua-driver doctor

Then list the running GUI apps the driver can reach:

cua-driver call list_apps

If you see your running apps listed, the plumbing works. These are the last driver calls you run by hand; after the one-time registration in the next step, the agent does the driving.
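
The two checks above can be wrapped in one small shell function so that a failure stops at the first broken step. This is a sketch under one assumption that this tutorial does not confirm: that cua-driver exits with a non-zero status when a check fails. The function name verify_driver is hypothetical.

```shell
# Sketch: run the two verification commands from this step in order.
# Assumption: cua-driver returns a non-zero exit status on failure.
# verify_driver is a hypothetical helper name, not a Cua Driver command.
verify_driver() {
  # Fail early with a clear message if the binary is not on PATH.
  if ! command -v cua-driver >/dev/null 2>&1; then
    echo "cua-driver not found on PATH; see the install guide" >&2
    return 127
  fi

  # Cross-platform probe, then the list of reachable GUI apps.
  cua-driver doctor || { echo "doctor check failed" >&2; return 1; }
  cua-driver call list_apps || { echo "list_apps failed" >&2; return 1; }

  echo "driver verified"
}

# Usage:
#   verify_driver && echo "ready to connect an agent"
```

You still need to read the list_apps output yourself to confirm that your running apps appear in it.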

3. Connect your agent

Register Cua Driver with your agent harness once.

Install the Claude Code skill. This is turnkey: the skill teaches Claude Code how to drive apps through Cua Driver, and you then ask in plain English.

cua-driver skills install

Prefer plain MCP instead of the skill? Generate the registration command:

cua-driver mcp-config --client claude

Using Cursor, Gemini/Antigravity, OpenCode, OpenClaw, or Pi instead? See Connect your agent.

4. Ask your agent to drive an app

Now ask your agent in plain English. Type this prompt into Claude Code, Codex, or Hermes:

Open the calculator and compute 17 x 23, then tell me the answer.

The agent uses Cua Driver to launch the calculator (Calculator on macOS and Windows, the system calculator on Linux), read the window's accessibility tree, click the number and operator keys, and read the result back from the display. All of it runs in the background. The agent picks the right calculator app for your OS, so the same prompt works on all three platforms.

5. Confirm what happened

The agent reports 391. Cua Driver kept the calculator in the background the whole time. It never stole your keyboard focus and never moved your cursor. That is the no-foreground contract: the agent drives the app while you keep working. See The no-foreground contract.
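
To confirm the reported value independently of the calculator app, shell arithmetic gives the same product. This is only a sanity check on the expected answer and does not involve Cua Driver.

```shell
# Independent check of the product the agent should report.
expected=$((17 * 23))
echo "$expected"   # prints 391
```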

What you did

You installed Cua Driver, verified it could see your desktop, connected your agent harness, and asked in plain English for a result computed inside a real desktop app. The agent did the driving through Cua Driver, in the background, on your platform.

Next steps