Registry

The Cua-Bench registry contains a collection of pre-built tasks and datasets for evaluating computer-use agents.

Datasets are now part of the main cua repository at libs/cua-bench/datasets.

The registry is meant to be a community effort to grow the variety and number of useful, verifiable computer-use tasks.

Using the Registry

List Available Datasets

cb dataset list

Run a Dataset from the Registry

# Run with agent
cb run dataset datasets/cua-bench-basic --agent cua-agent --model anthropic/claude-sonnet-4-20250514

# Run interactively
cb interact click-button --dataset cua-bench-basic

# Run oracle solutions
cb run dataset datasets/cua-bench-basic --oracle

Clone Registry Locally

The registry is automatically cloned when you first use the --dataset flag:

# Registry location
~/.cua/cua/libs/cua-bench/datasets/

# Update datasets
cd ~/.cua/cua && git pull

Or set a custom location:

export CUA_REGISTRY_HOME=/path/to/your/datasets
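
The lookup order described above (custom location first, default otherwise) can be sketched in Python. This is only an illustration under stated assumptions: `resolve_registry_home` is a hypothetical helper, not part of the `cb` CLI, and treating an empty `CUA_REGISTRY_HOME` as unset is an assumption.

```python
import os
from pathlib import Path

# Default location quoted in this guide.
DEFAULT_REGISTRY = Path("~/.cua/cua/libs/cua-bench/datasets")


def resolve_registry_home(env=None):
    """Return the datasets directory, preferring CUA_REGISTRY_HOME when set.

    Hypothetical helper for illustration; cb may resolve this differently.
    """
    env = os.environ if env is None else env
    custom = env.get("CUA_REGISTRY_HOME")
    # Assumption: an empty value is treated the same as an unset variable.
    base = Path(custom) if custom else DEFAULT_REGISTRY
    return base.expanduser()


if __name__ == "__main__":
    print(resolve_registry_home())
```

Passing a plain dict as `env` makes the function easy to try without touching the real environment, for example `resolve_registry_home({"CUA_REGISTRY_HOME": "/path/to/your/datasets"})`.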

Available Datasets

| Dataset | Tasks | Description | Golden Environment |
|---|---|---|---|
| cua-bench-basic | 13 | Basic UI interaction tasks (clicking, typing, dragging) | simulated |

Dataset Structure

Each dataset in the registry follows this structure:

libs/cua-bench/datasets/
└── cua-bench-basic/
    ├── meta.json           # Dataset metadata
    └── tasks/
        ├── click-button/
        │   ├── main.py     # Task implementation
        │   └── gui/
        │       └── index.html
        └── ...
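
As a rough sketch of how this layout can be walked, the snippet below lists the task folders of a local dataset. It assumes that a task is any folder under `tasks/` containing a `main.py`, as in the tree above. `list_tasks` and `build_sample_dataset` are hypothetical helpers written for this example, not part of Cua-Bench.

```python
import tempfile
from pathlib import Path


def list_tasks(dataset_dir):
    """Return sorted names of folders under tasks/ that contain a main.py."""
    tasks_dir = Path(dataset_dir) / "tasks"
    if not tasks_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in tasks_dir.iterdir()
        if (entry / "main.py").is_file()
    )


def build_sample_dataset(root):
    """Create a throwaway dataset tree mirroring the layout shown above."""
    dataset = Path(root) / "cua-bench-basic"
    gui = dataset / "tasks" / "click-button" / "gui"
    gui.mkdir(parents=True)
    (gui / "index.html").write_text("<html></html>")
    (gui.parent / "main.py").write_text("# task implementation\n")
    (dataset / "meta.json").write_text("{}")
    # A folder without main.py, which list_tasks skips.
    (dataset / "tasks" / "notes").mkdir()
    return dataset


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(list_tasks(build_sample_dataset(tmp)))
```

To inspect a real checkout, point `list_tasks` at a dataset folder inside your registry location instead of the temporary tree.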

meta.json Format

{
  "name": "cua-bench-basic",
  "version": "1.0.0",
  "description": "Basic UI interaction tasks",
  "platform": "simulated",
  "tasks": [
    "click-button",
    "typing-input",
    "drag-slider"
  ]
}
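
Before submitting a dataset, it can help to cross-check `meta.json` against the task folders on disk. The sketch below assumes that the five keys in the example above are all required and that every entry in `tasks` should match a folder name. Neither rule is confirmed by this guide, and `check_meta` is a hypothetical helper rather than a Cua-Bench API.

```python
import json

# Assumption: the keys shown in the example above are all required.
REQUIRED_KEYS = ("name", "version", "description", "platform", "tasks")


def check_meta(meta, task_dirs):
    """Return a list of problems found in a parsed meta.json dict."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
    listed = meta.get("tasks", [])
    if not isinstance(listed, list):
        problems.append("tasks must be a list")
        return problems
    # Tasks named in meta.json that have no folder on disk.
    for name in listed:
        if name not in task_dirs:
            problems.append(f"listed but no folder: {name}")
    # Folders on disk that meta.json does not mention.
    for name in sorted(set(task_dirs) - set(listed)):
        problems.append(f"folder not listed: {name}")
    return problems


if __name__ == "__main__":
    sample = json.loads("""
    {
      "name": "cua-bench-basic",
      "version": "1.0.0",
      "description": "Basic UI interaction tasks",
      "platform": "simulated",
      "tasks": ["click-button", "typing-input", "drag-slider"]
    }
    """)
    for problem in check_meta(sample, ["click-button", "typing-input"]):
        print(problem)
```

An empty result means the metadata and the folders agree under these assumed rules.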

Contributing Tasks

To contribute tasks to the registry:

  1. Fork the cua repository
  2. Create your task in libs/cua-bench/datasets/<dataset-name>/tasks/
  3. Follow the task structure
  4. Test locally with cb interact tasks/your_task
  5. Submit a pull request to the main repository

See Creating Tasks for task development guidance.

Registry Location

Datasets are stored in one of two locations:

  • Default: ~/.cua/cua/libs/cua-bench/datasets/
  • Custom: Set CUA_REGISTRY_HOME environment variable

The registry is a shallow clone of the main cua repository, pulling only the datasets directory.