Simulated Desktop (Preview)

Preview feature for testing tasks with themed desktop environments (no Docker required)

This is a preview feature for rapid development and testing.

When you set "provider": "simulated" in a task's computer config, you can preview the task in a lightweight HTML/CSS desktop environment without Docker. The environment renders OS-themed window chrome, taskbars, and desktop styling around your GUI.

Available Themes

Theme    Style
win11    Windows 11 with rounded corners and centered taskbar
macos    macOS with top menu bar and dock

Usage

Set "provider" to "simulated" and choose a theme with "os_type":

import cua_bench as cb

@cb.tasks_config(split="train")
def load():
    return [
        cb.Task(
            description='Click the Submit button',
            computer={
                "provider": "simulated",  # Enables simulated desktop
                "setup_config": {
                    "os_type": "win11",  # win11 or macos
                    "width": 1920,
                    "height": 1080,
                }
            }
        )
    ]

Then preview:

cb interact tasks/my_task --variant-id 0
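
The computer entry in the task config above is a plain dictionary, so a typo in os_type may only surface when the task is previewed. The following is a minimal sketch of a helper that builds and validates that dictionary up front. The names simulated_computer and SIMULATED_THEMES are hypothetical and not part of cua_bench; the theme list is taken from the table on this page, and the default size mirrors the example above.

```python
# Hypothetical helper, not part of cua_bench: builds the `computer` dict
# shown in the Usage example and rejects unknown themes early.
SIMULATED_THEMES = ("win11", "macos")  # themes listed on this page


def simulated_computer(os_type: str, width: int = 1920, height: int = 1080) -> dict:
    """Return a `computer` config for the simulated provider."""
    if os_type not in SIMULATED_THEMES:
        raise ValueError(
            f"unknown theme {os_type!r}; expected one of {SIMULATED_THEMES}"
        )
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {
        "provider": "simulated",
        "setup_config": {"os_type": os_type, "width": width, "height": height},
    }
```

The returned dictionary has the same shape as the literal in the example, so it could be passed as computer=simulated_computer("macos") when constructing a cb.Task.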

Multiple Theme Variants

Generate tasks across different desktop themes:

@cb.tasks_config(split="train")
def load():
    themes = ["win11", "macos"]

    return [
        cb.Task(
            description='Complete the form',
            computer={
                "provider": "simulated",
                "setup_config": {
                    "os_type": theme,
                    "width": 1280,
                    "height": 800,
                }
            }
        )
        for theme in themes
    ]
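
The same pattern extends to more than one axis of variation. The sketch below builds one computer dictionary per theme and screen size using only the standard library. The resolution list and the names THEMES, RESOLUTIONS, and computer_variants are assumptions made for illustration; only the two theme names come from this page.

```python
from itertools import product

# Assumed values for illustration; only the theme names come from this page.
THEMES = ("win11", "macos")
RESOLUTIONS = ((1280, 800), (1920, 1080))


def computer_variants(themes=THEMES, resolutions=RESOLUTIONS) -> list[dict]:
    """Build one simulated `computer` config per (theme, resolution) pair."""
    return [
        {
            "provider": "simulated",
            "setup_config": {"os_type": theme, "width": width, "height": height},
        }
        for theme, (width, height) in product(themes, resolutions)
    ]


variants = computer_variants()
# List each variant with its position in the generated list.
for index, config in enumerate(variants):
    setup = config["setup_config"]
    print(index, setup["os_type"], f'{setup["width"]}x{setup["height"]}')
```

Each dictionary can be passed as the computer argument of a cb.Task inside load(). If --variant-id refers to a task's position in the list returned by load(), which this page suggests but does not state, the printed index would be the value to pass to cb interact.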

Limitations

The simulated desktop runs in a browser, so it has some restrictions:

  • No shell actions: Can't execute bash commands or scripts
  • No native apps: Can't install or run real applications
  • HTML/CSS only: Only works with custom GUIs created via launch_window() or install_app()

For tasks requiring shell access or native applications, use Docker or QEMU environments instead.
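
The limitations above can be turned into a quick pre-check when deciding which environment a task needs. This is a sketch only: TaskNeeds and simulated_blockers are hypothetical names, not part of cua_bench, and the check covers just the two requirements named in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskNeeds:
    """Hypothetical description of what a task requires from its environment."""

    shell: bool = False        # task runs bash commands or scripts
    native_apps: bool = False  # task installs or runs real applications


def simulated_blockers(needs: TaskNeeds) -> list[str]:
    """Return the limitations that rule out the simulated desktop, if any."""
    blockers = []
    if needs.shell:
        blockers.append("no shell actions")
    if needs.native_apps:
        blockers.append("no native apps")
    return blockers
```

An empty result means the task only depends on custom HTML/CSS GUIs and is a candidate for the simulated desktop; any entry in the list points to Docker or QEMU instead.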

Why Use Simulated Desktop?

  • No Docker: Runs directly in a browser with Playwright
  • Fast: Instant startup, no VM boot time
  • Visual Diversity: Train agents on different OS appearances
  • Deterministic: Same theme always renders identically
  • Lightweight: Run thousands in parallel

Use the simulated desktop for quick iteration and agent development. Switch to Docker or QEMU when you need a real OS environment.
