Tracing API

Record computer interactions for debugging, training, and analysis

The Tracing API records everything that happens on a Computer, including screenshots, API calls, accessibility trees, and custom metadata. Unlike Trajectories, which are specific to ComputerAgent, tracing works with any Computer usage: custom agents, scripts, human-in-the-loop workflows, or RPA.

Basic Usage

Start and stop tracing around any computer operations:

from computer import Computer

computer = Computer(os_type="linux", provider_type="docker", image="trycua/cua-xfce:latest")
await computer.run()

# Start tracing
await computer.tracing.start()

# Perform operations (these get recorded)
await computer.interface.left_click(100, 200)
await computer.interface.type_text("Hello, World!")
screenshot = await computer.interface.screenshot()

# Stop and save trace
trace_path = await computer.tracing.stop()
print(f"Trace saved to: {trace_path}")

This creates a ZIP archive containing all recorded events, screenshots, and metadata.
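
To peek inside the archive without extracting it, you can list its contents with Python's standard zipfile module (the trace filename below is a placeholder):

import zipfile

# List every file recorded in a saved trace archive
with zipfile.ZipFile("trace_20250103_143052_abc123.zip") as zf:
    for name in zf.namelist():
        print(name)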

What Gets Recorded

By default, tracing captures:

  • Screenshots - Taken automatically after key actions (clicks, typing, key presses)
  • API calls - Every method called on computer.interface with arguments and results
  • Timestamps - Absolute and relative timing for each event
  • Errors - Any exceptions that occur during operations

You can also enable:

  • Accessibility trees - Semantic structure of the UI (useful for debugging element detection)
  • Custom metadata - Your own key-value data added during the session

Configuration

Control what gets recorded with start options:

await computer.tracing.start({
    "name": "my-workflow",           # Custom trace name
    "screenshots": True,             # Capture screenshots (default: True)
    "api_calls": True,               # Log interface calls (default: True)
    "accessibility_tree": False,     # Record a11y trees (default: False)
    "metadata": True,                # Enable custom metadata (default: True)
})

Control output format when stopping:

# Save as ZIP archive (default)
trace_path = await computer.tracing.stop({"format": "zip"})

# Save as directory (easier to browse)
trace_path = await computer.tracing.stop({"format": "dir"})

# Custom output path
trace_path = await computer.tracing.stop({"path": "/tmp/my-trace.zip"})

Output Format

Traces are saved with this structure:

trace_20250103_143052_abc123/
├── trace_metadata.json           # Overall trace info
├── event_000001_trace_start.json
├── event_000002_api_call.json
├── event_000003_api_call.json
├── 000001_initial_screenshot.png
├── 000002_after_left_click.png
├── 000003_after_type_text.png
└── event_000004_trace_end.json

trace_metadata.json:

{
  "trace_id": "trace_20250103_143052_abc123",
  "start_time": "2025-01-03T14:30:52Z",
  "end_time": "2025-01-03T14:31:15Z",
  "duration": 23.456,
  "total_events": 12,
  "screenshot_count": 5,
  "config": {
    "screenshots": true,
    "api_calls": true
  }
}

API call event:

{
  "type": "api_call",
  "timestamp": 1704295852.789,
  "relative_time": 3.666,
  "data": {
    "method": "left_click",
    "args": {"x": 100, "y": 200},
    "result": null,
    "error": null,
    "screenshot": "000002_after_left_click.png",
    "success": true
  }
}
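
Since every event carries both an absolute timestamp and a relative_time offset, you can reconstruct a timeline of interface calls from a saved trace. A minimal sketch using only the fields shown above, assuming the trace was saved as a directory:

import json
from pathlib import Path

trace_dir = Path("trace_20250103_143052_abc123")

# Print each interface call in order, with its offset from trace start
for event_file in sorted(trace_dir.glob("event_*.json")):
    event = json.loads(event_file.read_text())
    if event["type"] == "api_call":
        data = event["data"]
        status = "ok" if data["success"] else "failed"
        print(f"{event['relative_time']:8.3f}s  {data['method']}  [{status}]")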

Adding Custom Metadata

Add context during tracing for later analysis:

await computer.tracing.start()

# Add metadata at any point
await computer.tracing.add_metadata("workflow", "login-flow")
await computer.tracing.add_metadata("user_id", "test-user-123")
await computer.tracing.add_metadata("step", "entering-credentials")

await computer.interface.type_text("username")

await computer.tracing.add_metadata("step", "clicking-submit")
await computer.interface.left_click(500, 300)

trace_path = await computer.tracing.stop()

Metadata appears in the trace alongside events, making it easy to correlate actions with your application context.
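
For example, you can pull the metadata entries back out of a saved trace and line them up with the surrounding events. This sketch assumes metadata is written as events with type "metadata"; the exact type name may differ, so check the event files in your own traces:

import json
from pathlib import Path

trace_dir = Path("trace_20250103_143052_abc123")

# Print metadata entries in the order they were added.
# Assumption: metadata events use type "metadata" with the
# key/value pair stored under "data"; verify against your trace.
for event_file in sorted(trace_dir.glob("event_*.json")):
    event = json.loads(event_file.read_text())
    if event.get("type") == "metadata":
        print(f"{event['relative_time']:8.3f}s  {event['data']}")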

Tracing with ComputerAgent

Tracing works alongside ComputerAgent—you get both agent-level information and detailed computer interaction logs:

from computer import Computer
from agent import ComputerAgent

computer = Computer(os_type="linux", provider_type="docker", image="trycua/cua-xfce:latest")
await computer.run()

# Start tracing before agent runs
await computer.tracing.start({"name": "agent-task"})

agent = ComputerAgent(
    model="anthropic/claude-sonnet-4-5-20250929",
    tools=[computer],
    trajectory_dir="trajectories"  # Can use both!
)

async for result in agent.run("Open Firefox and search for Cua"):
    pass

# Stop tracing after agent completes
trace_path = await computer.tracing.stop()

Now you have:

  • Trajectory - Agent's reasoning, API calls to the model, action decisions
  • Trace - Low-level computer interactions, screenshots, timing

Tracing vs Trajectories

Feature              Tracing API                Trajectories
Works with           Any Computer usage         ComputerAgent only
Records              Computer interface calls   Agent reasoning + actions
Screenshots          After each action          Per agent turn
Accessibility data   Optional                   No
Custom metadata      Yes                        No
Output format        ZIP or directory           Directory with turns
Real-time control    Start/stop anytime         Per agent run

Use Tracing when:

  • Building custom agents or scripts
  • Recording human demonstrations
  • Debugging computer interface issues
  • Need fine-grained timing data
  • Want accessibility tree snapshots

Use Trajectories when:

  • Using ComputerAgent
  • Need agent reasoning and model responses
  • Want turn-by-turn organization
  • Using the trajectory viewer

Recording Human Demonstrations

Tracing is ideal for capturing human-driven workflows as training data:

computer = Computer(os_type="linux", provider_type="docker", image="trycua/cua-xfce:latest")
await computer.run()

await computer.tracing.start({
    "name": "human-demo-login",
    "accessibility_tree": True  # Capture UI structure
})

# Human performs actions via VNC or other interface
# Meanwhile, your script can add context:
await computer.tracing.add_metadata("task", "login-to-dashboard")
await computer.tracing.add_metadata("demonstrator", "expert-user-1")

# Wait for human to finish...
input("Press Enter when done with demonstration")

trace_path = await computer.tracing.stop()
print(f"Demo saved to: {trace_path}")

Privacy Considerations

Tracing is designed with privacy in mind:

  • Clipboard - Only the length is recorded, not the actual content
  • Screenshots - Can be disabled entirely with screenshots: False
  • Passwords - Typed text is recorded in API call events, so consider disabling tracing for sensitive workflows

# Privacy-conscious tracing
await computer.tracing.start({
    "screenshots": False,  # No visual data
    "api_calls": True      # Still log actions
})

Debugging with Traces

When something goes wrong, traces show exactly what happened:

  1. Open trace_metadata.json to see the overview
  2. Check event files in order to find where things went wrong
  3. Look at screenshots to see what the screen looked like
  4. Check error fields in API call events for exceptions

For example, to scan a saved trace for failures:
import json
from pathlib import Path

# Load a trace
trace_dir = Path("trace_20250103_143052_abc123")

with open(trace_dir / "trace_metadata.json") as f:
    metadata = json.load(f)

print(f"Duration: {metadata['duration']:.2f}s")
print(f"Events: {metadata['total_events']}")
print(f"Screenshots: {metadata['screenshot_count']}")

# Find failed API calls
for event_file in sorted(trace_dir.glob("event_*.json")):
    with open(event_file) as f:
        event = json.load(f)
    if event.get("data", {}).get("error"):
        print(f"Error in {event_file.name}: {event['data']['error']}")
