Custom Agents

Register and run custom agents in cua-bench

Adding an agent to the source

  1. Create cua_bench/agents/your_agent.py — extend BaseAgent, decorate with @register_agent("your-agent")
  2. Add the import to cua_bench/agents/__init__.py
  3. Run with cb run task <path> --agent your-agent

See the Build a Custom Agent example for full code.
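
As a rough illustration of the registration pattern in step 1, the sketch below defines its own stand-ins for BaseAgent and register_agent so that it runs without cua-bench installed. The real classes live in the cua_bench package and their signatures may differ; the AGENT_REGISTRY dict, the **kwargs constructor, and the act method are assumptions made for this example only.

```python
from typing import Callable, Dict, Type

# Stand-in registry for illustration only; cua-bench keeps its own registry,
# and its internal layout is not documented on this page.
AGENT_REGISTRY: Dict[str, Type["BaseAgent"]] = {}


class BaseAgent:
    """Hypothetical base class that simply stores keyword options."""

    def __init__(self, **kwargs):
        self.options = kwargs


def register_agent(name: str) -> Callable[[Type[BaseAgent]], Type[BaseAgent]]:
    """Class decorator that records the decorated class under `name`."""

    def decorator(cls: Type[BaseAgent]) -> Type[BaseAgent]:
        AGENT_REGISTRY[name] = cls
        return cls

    return decorator


@register_agent("your-agent")
class YourAgent(BaseAgent):
    """Hypothetical agent; the method name `act` is an assumption."""

    def act(self, observation: str) -> str:
        # A real agent would call a model here; this one does nothing useful.
        return f"noop (saw {len(observation)} chars)"
```

In the real package the decorator only takes effect once the module is imported, which is presumably why step 2 adds the import to cua_bench/agents/__init__.py.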

Local registry (without modifying source)

Register an agent via .cua/agents.yaml in your project root:

agents:
  - name: my-agent
    import_path: my_package.agents:MyAgent
    defaults:
      model: claude-sonnet-4-20250514
      max_steps: 50

The import_path format is module.path:ClassName. The module must be importable from the Python environment.
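
To check that an import_path resolves before putting it in the YAML file, a small helper like the one below can be used. It is a sketch of how a module.path:ClassName string can be resolved with the standard library, not necessarily how cua-bench does it; the function name resolve_import_path is hypothetical.

```python
import importlib
from typing import Any


def resolve_import_path(import_path: str) -> Any:
    """Resolve a 'module.path:ClassName' string to the named attribute."""
    module_name, sep, attr = import_path.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(
            f"expected 'module.path:ClassName', got {import_path!r}"
        )
    # Raises ImportError if the module is not importable from this environment.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the module has no such class.
    return getattr(module, attr)
```

For example, resolve_import_path("my_package.agents:MyAgent") raises ImportError if my_package is not installed in the active environment.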

Docker image agents

For production or CI, package your agent as a Docker image:

agents:
  - name: my-agent
    image: myregistry/my-agent:latest
    command: ['python', '-m', 'my_agent.main']
    defaults:
      model: gpt-4o

The agent container receives these environment variables:

Variable            Description
CUA_ENV_API_URL     API endpoint for the environment container
CUA_ENV_VNC_URL     VNC endpoint for debugging
CUA_ENV_TYPE        OS type (linux, windows, android)
CUA_TASK_PATH       Mounted task config path (/app/env)
CUA_TASK_INDEX      Task index to run
CUA_MODEL           Model override (if specified)
ANTHROPIC_API_KEY   Passed through from host
OPENAI_API_KEY      Passed through from host
GOOGLE_API_KEY      Passed through from host
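
Inside the container, an agent entrypoint might collect these variables roughly as follows. This is a minimal sketch: the AgentEnv dataclass and load_agent_env function are hypothetical names, and it assumes that CUA_TASK_INDEX is an integer and that only CUA_ENV_VNC_URL and CUA_MODEL may be absent.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class AgentEnv:
    """Settings read from the agent container's environment variables."""

    api_url: str
    vnc_url: Optional[str]
    env_type: str
    task_path: str
    task_index: int
    model: Optional[str]


def load_agent_env(environ: Mapping[str, str] = os.environ) -> AgentEnv:
    """Build an AgentEnv from a mapping such as os.environ."""
    return AgentEnv(
        api_url=environ["CUA_ENV_API_URL"],
        vnc_url=environ.get("CUA_ENV_VNC_URL"),
        env_type=environ["CUA_ENV_TYPE"],
        task_path=environ["CUA_TASK_PATH"],
        # Assumption: the task index is passed as a decimal integer string.
        task_index=int(environ["CUA_TASK_INDEX"]),
        # CUA_MODEL is only set when a model override is specified.
        model=environ.get("CUA_MODEL") or None,
    )
```

Passing a plain dict instead of os.environ makes the function easy to exercise outside a container.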

Development with --with

Mount local packages into the agent container and pip install them there, without rebuilding the image:

cb run task tasks/my-task --agent my-agent --with ./path/to/local/package

Repeat the flag to mount multiple packages:

cb run task tasks/my-task --agent my-agent --with ../cua-bench --with ../my-utils

Agent lookup order

When you pass --agent <name>, the name is resolved in this order:

  1. .cua/agents.yaml (local registry) — checked first
  2. Built-in registry (@register_agent decorators) — fallback
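
The practical consequence is that a local entry shadows a built-in agent of the same name. The sketch below models that precedence with two plain dicts; resolve_agent and both registry arguments are hypothetical and only illustrate the order, not cua-bench's internal code.

```python
from typing import Any, Mapping, Tuple


def resolve_agent(
    name: str,
    local_registry: Mapping[str, Any],
    builtin_registry: Mapping[str, Any],
) -> Tuple[str, Any]:
    """Return (source, entry) for `name`, preferring the local registry."""
    # 1. Entries from .cua/agents.yaml are checked first.
    if name in local_registry:
        return "local", local_registry[name]
    # 2. Agents registered with @register_agent are the fallback.
    if name in builtin_registry:
        return "builtin", builtin_registry[name]
    raise KeyError(f"unknown agent: {name!r}")
```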

CLI options

cb run task <path> --agent <name>              # Use a registered agent
cb run task <path> --agent-kwarg key=value     # Pass kwargs to agent __init__
cb run task <path> --model <model>             # Override model
cb run task <path> --with <path>               # Mount local package (repeatable)
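
To make the --agent-kwarg option more concrete, the sketch below shows one way key=value strings could be turned into keyword arguments for an agent's __init__. This is an assumption about the general mechanism, not cb's actual parser: the function name parse_agent_kwargs is hypothetical, values are kept as strings, and this page does not say whether cb converts types or accepts the flag more than once.

```python
from typing import Dict, Iterable


def parse_agent_kwargs(pairs: Iterable[str]) -> Dict[str, str]:
    """Turn ['key=value', ...] into a dict of keyword arguments."""
    kwargs: Dict[str, str] = {}
    for pair in pairs:
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        kwargs[key] = value
    return kwargs
```

An agent written against this assumption would convert values itself, for example int(max_steps) inside __init__.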
