Computer SDK API Reference

v0.5.12

pip install cua-computer

Cua Computer Interface for cross-platform computer control.

Classes

Class	Description
`Computer`	Computer is the main class for interacting with the computer.
`VMProviderType`	Enum of supported VM provider types.

Computer

Attributes

Name	Type	Description
`logger`	`Any`
`image`	`Any`
`host`	`Any`
`provider_port`	`Any`
`noVNC_port`	`Any`
`api_port`	`Any`
`api_host`	`Any`
`os_type`	`Any`
`provider_type`	`Any`
`ephemeral`	`Any`
`api_key`	`Any`
`timeout`	`Any`
`experiments`	`Any`
`custom_run_opts`	`Any`
`storage`	`Any`
`shared_path`	`Any`
`verbosity`	`Any`
`vm_logger`	`Any`
`interface_logger`	`Any`
`config`	`Any`
`shared_directories`	`Any`
`use_host_computer_server`	`Any`
`interface`	`Any`	Get the computer interface for interacting with the VM.
`tracing`	`ComputerTracing`	Get the computer tracing instance for recording sessions.
`telemetry_enabled`	`bool`	Check if telemetry is enabled for this computer instance.

Name	Type	Description
`apps`	`list[str]`	List of application names to include in the desktop.

If using a VM provider that supports restart, this will issue a restart without tearing down the provider context, then reconnect the interface. Falls back to stop()+run() when a provider restart is not available.

Computer.get_ip

async def get_ip(self, max_retries: int = 15, retry_delay: int = 3) -> str

Get the IP address of the VM or localhost if using host computer server.

This method delegates to the provider's get_ip method, which waits indefinitely until the VM has a valid IP address.

Parameters:

Name	Type	Description
`max_retries`	`Any`	Unused parameter, kept for backward compatibility
`retry_delay`	`Any`	Delay between retries in seconds (default: 3)

Returns: IP address of the VM or localhost if using host computer server

Computer.wait_vm_ready

async def wait_vm_ready(self) -> Optional[Dict[str, Any]]

Wait for VM to be ready with an IP address.

Returns: VM status information or None if using host computer server.

Computer.update

async def update(self, cpu: Optional[int] = None, memory: Optional[str] = None)

Update VM settings.

Computer.get_screenshot_size

def get_screenshot_size(self, screenshot: bytes) -> Dict[str, int]

Get the dimensions of a screenshot.

Parameters:

Name	Type	Description
`screenshot`	`Any`	The screenshot bytes

Returns: Dict[str, int]: Dictionary containing 'width' and 'height' of the image
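To illustrate what reading dimensions out of raw screenshot bytes can look like, the sketch below parses the width and height from a PNG header using only the standard library. It assumes the screenshot is PNG-encoded; `png_size` is a hypothetical helper for this example, not part of the SDK, and the SDK's own implementation may work differently.

```python
import struct
from typing import Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(screenshot: bytes) -> Dict[str, int]:
    """Return {'width', 'height'} read from a PNG's IHDR chunk (hypothetical helper)."""
    if len(screenshot) < 24 or not screenshot.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG image")
    if screenshot[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # Bytes 16..24 hold width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", screenshot[16:24])
    return {"width": width, "height": height}


# Build just enough of a PNG header to exercise the parser.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1024, 768)
print(png_size(header))  # {'width': 1024, 'height': 768}
```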

Computer.to_screen_coordinates

async def to_screen_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert normalized coordinates to screen coordinates.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate between 0 and 1
`y`	`Any`	Y coordinate between 0 and 1

Returns: tuple[float, float]: Screen coordinates (x, y)
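A minimal sketch of the conversion, assuming normalized coordinates are simply scaled by the screen width and height. `normalized_to_screen` is an illustrative helper that takes the screen size explicitly; the SDK method is async and takes only `x` and `y`.

```python
from typing import Tuple


def normalized_to_screen(x: float, y: float, width: int, height: int) -> Tuple[float, float]:
    """Scale normalized (0..1) coordinates to pixel coordinates (illustrative helper)."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x and y must be between 0 and 1")
    return (x * width, y * height)


print(normalized_to_screen(0.5, 0.25, 1920, 1080))  # (960.0, 270.0)
```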

Computer.to_screenshot_coordinates

async def to_screenshot_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert screen coordinates to screenshot coordinates.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate in screen space
`y`	`Any`	Y coordinate in screen space

Returns: tuple[float, float]: (x, y) coordinates in screenshot space
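Screen and screenshot resolutions can differ (for example on high-DPI displays), so one plausible reading of this conversion is a proportional rescale. The sketch below shows that idea under this assumption; `screen_to_screenshot` and its explicit size arguments are hypothetical and not the SDK's signature.

```python
from typing import Dict, Tuple


def screen_to_screenshot(
    x: float,
    y: float,
    screen_size: Dict[str, int],
    screenshot_size: Dict[str, int],
) -> Tuple[float, float]:
    """Rescale a screen-space point into screenshot space (illustrative helper)."""
    scale_x = screenshot_size["width"] / screen_size["width"]
    scale_y = screenshot_size["height"] / screen_size["height"]
    return (x * scale_x, y * scale_y)


screen = {"width": 1280, "height": 800}
shot = {"width": 2560, "height": 1600}
print(screen_to_screenshot(100, 50, screen, shot))  # (200.0, 100.0)
```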

Computer.playwright_exec

async def playwright_exec(self, command: str, params: Optional[Dict] = None) -> Dict[str, Any]

Execute a Playwright browser command.

Parameters:

Name	Type	Description
`command`	`Any`	The browser command to execute (visit_url, click, type, scroll, web_search)
`params`	`Any`	Command parameters

Returns: Dict containing the command result

Example:

# Navigate to a URL
await computer.playwright_exec("visit_url", {"url": "https://example.com"})

# Click at coordinates
await computer.playwright_exec("click", {"x": 100, "y": 200})

# Type text
await computer.playwright_exec("type", {"text": "Hello, world!"})

# Scroll
await computer.playwright_exec("scroll", {"delta_x": 0, "delta_y": -100})

# Web search
await computer.playwright_exec("web_search", {"query": "computer use agent"})

Computer.venv_install

async def venv_install(self, venv_name: str, requirements: list[str])

Install packages in a UV project.

Parameters:

Name	Type	Description
`venv_name`	`Any`	Name of the UV project
`requirements`	`Any`	List of package requirements to install

Returns: Tuple of (stdout, stderr) from the installation command

Computer.pip_install

async def pip_install(self, requirements: list[str])

Install packages using the system Python with UV (no venv).

Parameters:

Name	Type	Description
`requirements`	`Any`	List of package requirements to install globally/user site.

Returns: Tuple of (stdout, stderr) from the installation command

Computer.venv_cmd

async def venv_cmd(self, venv_name: str, command: str)

Execute a shell command in a UV project.

Parameters:

Name	Type	Description
`venv_name`	`Any`	Name of the UV project
`command`	`Any`	Shell command to execute in the UV project

Returns: Tuple of (stdout, stderr) from the command execution

Computer.venv_exec

async def venv_exec(self, venv_name: str, python_func, args = (), kwargs = {})

Execute a Python function in a virtual environment using source code extraction.

Parameters:

Name	Type	Description
`venv_name`	`Any`	Name of the virtual environment
`python_func`	`Any`	A callable function to execute
`args`	`Any`	Positional arguments to pass to the function
`kwargs`	`Any`	Keyword arguments to pass to the function

Returns: The result of the function execution, or raises any exception that occurred

Computer.venv_exec_background

async def venv_exec_background(self, venv_name: str, python_func, args = (), requirements: Optional[List[str]] = None, kwargs = {}) -> int

Run the Python function in the venv in the background and return the PID.

Uses a short Python launcher that spawns a detached child and exits immediately.

Computer.python_exec

async def python_exec(self, python_func, args = (), kwargs = {})

Execute a Python function using the system Python (no venv).

Uses source extraction and base64 transport, mirroring venv_exec but without virtual environment activation.

Returns the function result or raises a reconstructed exception with remote traceback context appended.
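The idea of sending source code over a base64 transport can be sketched locally as follows. This is a simplified illustration, not the SDK's wire format: the JSON payload layout and the helper names `encode_call` and `run_encoded_call` are assumptions, and the function source is written out by hand instead of being extracted automatically.

```python
import base64
import json

# Source of the function to "send"; written by hand to keep the sketch self-contained.
FUNC_SOURCE = """
def add(a, b):
    return a + b
"""


def encode_call(source: str, name: str, args: tuple, kwargs: dict) -> str:
    """Pack source, function name, and arguments into a base64 string."""
    payload = json.dumps(
        {"source": source, "name": name, "args": list(args), "kwargs": kwargs}
    )
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def run_encoded_call(encoded: str):
    """Decode the payload, define the function in a fresh namespace, and call it."""
    payload = json.loads(base64.b64decode(encoded).decode("utf-8"))
    namespace: dict = {}
    exec(payload["source"], namespace)
    return namespace[payload["name"]](*payload["args"], **payload["kwargs"])


encoded = encode_call(FUNC_SOURCE, "add", (2, 3), {})
print(run_encoded_call(encoded))  # 5
```

Base64 keeps the payload free of quoting and newline problems when it is passed through a shell command.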

Computer.python_exec_background

async def python_exec_background(self, python_func, args = (), requirements: Optional[List[str]] = None, kwargs = {}) -> int

Run a Python function with the system interpreter in the background and return the PID.

Uses a short Python launcher that spawns a detached child and exits immediately.

Computer.python_command

def python_command(self, requirements: Optional[List[str]] = None, venv_name: str = 'default', use_system_python: bool = False, background: bool = False) -> Callable[[Callable[P, R]], Callable[P, Awaitable[R]]]

Decorator to execute a Python function remotely in this Computer's venv.

This mirrors computer.helpers.sandboxed() but binds to this instance and optionally ensures required packages are installed before execution.

Parameters:

Name	Type	Description
`requirements`	`Any`	Packages to install in the virtual environment.
`venv_name`	`Any`	Name of the virtual environment to use.
`use_system_python`	`Any`	If True, use the system Python/pip instead of a venv.
`background`	`Any`	If True, run the function detached and return the child PID immediately.

Returns: A decorator that turns a local function into an async callable which runs remotely and returns the function's result.
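The decorator pattern described here, a parameterized decorator bound to an instance that turns a plain function into an async callable, can be sketched with a local stand-in. `LocalRunner` and its `install`, `execute`, and `command` methods are hypothetical names for this illustration; the stand-in runs the function in-process, whereas the SDK runs it remotely.

```python
import asyncio
import functools
from typing import Any, Callable, List, Optional


class LocalRunner:
    """Stand-in for a Computer; runs functions locally instead of remotely."""

    def __init__(self) -> None:
        self.installed: List[str] = []

    async def install(self, requirements: List[str]) -> None:
        # A real implementation would install packages; here we only record them.
        self.installed.extend(requirements)

    async def execute(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        return func(*args, **kwargs)

    def command(self, requirements: Optional[List[str]] = None):
        """Return a decorator that wraps a sync function in an async callable."""

        def decorator(func: Callable[..., Any]):
            @functools.wraps(func)
            async def wrapper(*args: Any, **kwargs: Any) -> Any:
                if requirements:
                    await self.install(requirements)
                return await self.execute(func, *args, **kwargs)

            return wrapper

        return decorator


runner = LocalRunner()


@runner.command(requirements=["requests"])
def greet(name: str) -> str:
    return f"hello, {name}"


result = asyncio.run(greet("cua"))
print(result)  # hello, cua
```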

VMProviderType

Inherits from: StrEnum

Enum of supported VM provider types.

Attributes

Name	Type	Description
`LUME`	`Any`
`LUMIER`	`Any`
`CLOUD`	`Any`
`CLOUDV2`	`Any`
`WINSANDBOX`	`Any`
`DOCKER`	`Any`
`UNKNOWN`	`Any`

tracing

Computer tracing functionality for recording sessions.

This module provides a Computer.tracing API inspired by Playwright's tracing functionality, allowing users to record computer interactions for debugging, training, and analysis.

ComputerTracing

Computer tracing class that records computer interactions and saves them to disk.

This class provides a flexible API for recording computer sessions with configurable options for what to record (screenshots, API calls, video, etc.).

Constructor

ComputerTracing(self, computer_instance)

Attributes

Name	Type	Description
`is_tracing`	`bool`	Check if tracing is currently active.

Methods

ComputerTracing.start

async def start(self, config: Optional[Dict[str, Any]] = None) -> None

Start tracing with the specified configuration.

Parameters:

Name	Type	Description
`config`	`Any`	Tracing configuration dict. Options: `video` (bool): record video frames (default: False); `screenshots` (bool): record screenshots (default: True); `api_calls` (bool): record API calls and results (default: True); `accessibility_tree` (bool): record accessibility tree snapshots (default: False); `metadata` (bool): record custom metadata (default: True); `name` (str): custom trace name (default: auto-generated); `path` (str): custom trace directory path (default: auto-generated).

ComputerTracing.stop

async def stop(self, options: Optional[Dict[str, Any]] = None) -> str

Stop tracing and save the trace data.

Parameters:

Name	Type	Description
`options`	`Any`	Stop options dict. Options: `path` (str): custom output path for the trace archive; `format` (str): output format, 'zip' or 'dir' (default: 'zip').

Returns: str: Path to the saved trace file or directory

ComputerTracing.record_api_call

async def record_api_call(self, method: str, args: Dict[str, Any], result: Any = None, error: Optional[Exception] = None) -> None

Record an API call event.

Parameters:

Name	Type	Description
`method`	`Any`	The method name that was called
`args`	`Any`	Arguments passed to the method
`result`	`Any`	Result returned by the method
`error`	`Any`	Exception raised by the method, if any

ComputerTracing.record_accessibility_tree

async def record_accessibility_tree(self) -> None

Record the current accessibility tree if enabled.

ComputerTracing.add_metadata

async def add_metadata(self, key: str, value: Any) -> None

Add custom metadata to the trace.

Parameters:

Name	Type	Description
`key`	`Any`	Metadata key
`value`	`Any`	Metadata value
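A typical recording flow is start, add metadata, then stop. The sketch below shows that flow against a stub so it runs without a VM. `StubTracing` mirrors the documented method names only, and the path it returns is a placeholder; with the SDK you would pass `computer.tracing` to `record_session` instead.

```python
import asyncio
from typing import Any, Dict, Optional


class StubTracing:
    """Minimal stand-in with the same method names as ComputerTracing."""

    def __init__(self) -> None:
        self.is_tracing = False
        self.metadata: Dict[str, Any] = {}

    async def start(self, config: Optional[Dict[str, Any]] = None) -> None:
        self.is_tracing = True

    async def add_metadata(self, key: str, value: Any) -> None:
        self.metadata[key] = value

    async def stop(self, options: Optional[Dict[str, Any]] = None) -> str:
        self.is_tracing = False
        fmt = (options or {}).get("format", "zip")
        return "trace.zip" if fmt == "zip" else "trace"  # placeholder paths


async def record_session(tracing: Any, task_name: str) -> str:
    """Start a trace, tag it, and always stop it, even if the work fails."""
    await tracing.start({"screenshots": True, "api_calls": True, "name": task_name})
    try:
        await tracing.add_metadata("task", task_name)
        # ... drive the computer here ...
    finally:
        path = await tracing.stop({"format": "zip"})
    return path


stub = StubTracing()
trace_path = asyncio.run(record_session(stub, "demo"))
print(trace_path)
```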

models

Models for computer configuration.

BaseVMProvider

Inherits from: AsyncContextManager

Base interface for VM providers.

All VM provider implementations must implement this interface.

Attributes

Name	Type	Description
`provider_type`	`VMProviderType`	Get the provider type.

Methods

BaseVMProvider.get_vm

async def get_vm(self, name: str, storage: Optional[str] = None) -> Dict[str, Any]

Get VM information by name.

Parameters:

Name	Type	Description
`name`	`Any`	Name of the VM to get information for
`storage`	`Any`	Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM information including status, IP address, etc.

BaseVMProvider.list_vms

async def list_vms(self) -> ListVMsResponse

List all available VMs.

Returns: ListVMsResponse: A list of minimal VM objects as defined in computer.providers.types.MinimalVM.

BaseVMProvider.run_vm

async def run_vm(self, image: str, name: str, run_opts: Dict[str, Any], storage: Optional[str] = None) -> Dict[str, Any]

Run a VM by name with the given options.

Parameters:

Name	Type	Description
`image`	`Any`	Name/tag of the image to use
`name`	`Any`	Name of the VM to run
`run_opts`	`Any`	Dictionary of run options (memory, cpu, etc.)
`storage`	`Any`	Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM run status and information

BaseVMProvider.stop_vm

async def stop_vm(self, name: str, storage: Optional[str] = None) -> Dict[str, Any]

Stop a VM by name.

Parameters:

Name	Type	Description
`name`	`Any`	Name of the VM to stop
`storage`	`Any`	Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM stop status and information

BaseVMProvider.restart_vm

async def restart_vm(self, name: str, storage: Optional[str] = None) -> Dict[str, Any]

Restart a VM by name.

Parameters:

Name	Type	Description
`name`	`Any`	Name of the VM to restart
`storage`	`Any`	Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM restart status and information

BaseVMProvider.update_vm

async def update_vm(self, name: str, update_opts: Dict[str, Any], storage: Optional[str] = None) -> Dict[str, Any]

Update VM configuration.

Parameters:

Name	Type	Description
`name`	`Any`	Name of the VM to update
`update_opts`	`Any`	Dictionary of update options (memory, cpu, etc.)
`storage`	`Any`	Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM update status and information

BaseVMProvider.get_ip

async def get_ip(self, name: str, storage: Optional[str] = None, retry_delay: int = 2) -> str

Get the IP address of a VM, waiting indefinitely until it's available.

Parameters:

Name	Type	Description
`name`	`Any`	Name of the VM to get the IP for
`storage`	`Any`	Optional storage path override. If provided, this will be used instead of the provider's default storage path.
`retry_delay`	`Any`	Delay between retries in seconds (default: 2)

Returns: IP address of the VM when it becomes available
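The "wait indefinitely" behavior amounts to a polling loop with a fixed delay and no retry limit. A minimal sketch, assuming the VM information dict exposes the address under an `ip_address` key (that key name and the `wait_for_ip` helper are assumptions for this example):

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


async def wait_for_ip(
    get_vm: Callable[[], Awaitable[Dict[str, Any]]],
    retry_delay: float = 2,
) -> str:
    """Poll get_vm() until it reports an IP address (no retry limit)."""
    while True:
        info = await get_vm()
        ip = info.get("ip_address")  # key name is an assumption for this sketch
        if ip:
            return ip
        await asyncio.sleep(retry_delay)


# Fake provider: the VM reports an IP on the third poll.
responses = iter(
    [
        {"status": "starting"},
        {"status": "starting"},
        {"status": "running", "ip_address": "192.168.64.2"},
    ]
)


async def fake_get_vm() -> Dict[str, Any]:
    return next(responses)


ip = asyncio.run(wait_for_ip(fake_get_vm, retry_delay=0))
print(ip)  # 192.168.64.2
```

Because the loop never gives up, callers that need an upper bound can wrap the call in their own timeout.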

Display

Display configuration.

Constructor

Display(self, width: int, height: int) -> None

Attributes

Name	Type	Description
`width`	`int`
`height`	`int`

Image

VM image configuration.

Constructor

Image(self, image: str, tag: str, name: str) -> None

Attributes

Name	Type	Description
`image`	`str`
`tag`	`str`
`name`	`str`

Computer

Computer configuration.

Constructor

Computer(self, image: str, tag: str, name: str, display: Display, memory: str, cpu: str, vm_provider: Optional[BaseVMProvider] = None) -> None

Attributes

Name	Type	Description
`image`	`str`
`tag`	`str`
`name`	`str`
`display`	`Display`
`memory`	`str`
`cpu`	`str`
`vm_provider`	`Optional[BaseVMProvider]`

Methods

Computer.get_ip

async def get_ip(self) -> Optional[str]

Get the IP address of the VM.

diorama_computer

Key

Inherits from: Enum

Keyboard keys that can be used with press_key.

These key names follow a consistent cross-platform keyboard key naming convention.

Attributes

Name	Type	Description
`PAGE_DOWN`	`Any`
`PAGE_UP`	`Any`
`HOME`	`Any`
`END`	`Any`
`LEFT`	`Any`
`RIGHT`	`Any`
`UP`	`Any`
`DOWN`	`Any`
`RETURN`	`Any`
`ENTER`	`Any`
`ESCAPE`	`Any`
`ESC`	`Any`
`TAB`	`Any`
`SPACE`	`Any`
`BACKSPACE`	`Any`
`DELETE`	`Any`
`ALT`	`Any`
`CTRL`	`Any`
`SHIFT`	`Any`
`WIN`	`Any`
`COMMAND`	`Any`
`OPTION`	`Any`
`F1`	`Any`
`F2`	`Any`
`F3`	`Any`
`F4`	`Any`
`F5`	`Any`
`F6`	`Any`
`F7`	`Any`
`F8`	`Any`
`F9`	`Any`
`F10`	`Any`
`F11`	`Any`
`F12`	`Any`

Methods

Key.from_string

def from_string(cls, key: str) -> Key | str

Convert a string key name to a Key enum value.

Parameters:

Name	Type	Description
`key`	`Any`	String key name to convert

Returns: The Key enum value if the string matches a known key; otherwise the original string (for example, a single-character key).
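The lookup-with-fallback behavior can be sketched with a small enum. `MiniKey` is a stand-in defined for this example; its string values and the normalization rules (lowercasing, dropping underscores and spaces) are assumptions, not the SDK's actual mapping.

```python
from enum import Enum
from typing import Union


class MiniKey(Enum):
    """Tiny stand-in for Key; the string values are assumptions."""

    ENTER = "enter"
    ESCAPE = "escape"
    PAGE_DOWN = "pagedown"

    @classmethod
    def from_string(cls, key: str) -> Union["MiniKey", str]:
        normalized = key.lower().replace("_", "").replace(" ", "")
        for member in cls:
            if member.value == normalized:
                return member
        # Unknown names (e.g. single characters) pass through unchanged.
        return key


print(MiniKey.from_string("Page_Down"))  # MiniKey.PAGE_DOWN
print(MiniKey.from_string("a"))          # a
```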

DioramaComputer

A Computer-compatible proxy for Diorama that sends commands over the ComputerInterface.

Constructor

DioramaComputer(self, computer, apps)

Attributes

Name	Type	Description
`computer`	`Any`
`apps`	`Any`
`interface`	`Any`

Methods

DioramaComputer.run

async def run(self)

Initialize and run the DioramaComputer if not already initialized.

Returns: self: The DioramaComputer instance

DioramaComputerInterface

Diorama Interface proxy that sends diorama_cmds via the Computer's interface.

Constructor

DioramaComputerInterface(self, computer, apps)

Attributes

Name	Type	Description
`computer`	`Any`
`apps`	`Any`

Methods

DioramaComputerInterface.screenshot

async def screenshot(self, as_bytes = True)

Take a screenshot of the diorama scene.

Parameters:

Name	Type	Description
`as_bytes`	`bool`	If True, return image as bytes; if False, return PIL Image object

Returns: bytes or PIL.Image: Screenshot data in the requested format

DioramaComputerInterface.get_screen_size

async def get_screen_size(self)

Get the dimensions of the diorama scene.

Returns: dict: Dictionary containing 'width' and 'height' keys with pixel dimensions

DioramaComputerInterface.move_cursor

async def move_cursor(self, x, y)

Move the cursor to the specified coordinates.

Parameters:

Name	Type	Description
`x`	`int`	X coordinate to move cursor to
`y`	`int`	Y coordinate to move cursor to

DioramaComputerInterface.left_click

async def left_click(self, x = None, y = None)

Perform a left mouse click at the specified coordinates or current cursor position.

Parameters:

Name	Type	Description
`x`	`int, optional`	X coordinate to click at. If None, clicks at current cursor position
`y`	`int, optional`	Y coordinate to click at. If None, clicks at current cursor position

DioramaComputerInterface.right_click

async def right_click(self, x = None, y = None)

Perform a right mouse click at the specified coordinates or current cursor position.

Parameters:

Name	Type	Description
`x`	`int, optional`	X coordinate to click at. If None, clicks at current cursor position
`y`	`int, optional`	Y coordinate to click at. If None, clicks at current cursor position

DioramaComputerInterface.double_click

async def double_click(self, x = None, y = None)

Perform a double mouse click at the specified coordinates or current cursor position.

Parameters:

Name	Type	Description
`x`	`int, optional`	X coordinate to double-click at. If None, clicks at current cursor position
`y`	`int, optional`	Y coordinate to double-click at. If None, clicks at current cursor position

DioramaComputerInterface.scroll_up

async def scroll_up(self, clicks = 1)

Scroll up by the specified number of clicks.

Parameters:

Name	Type	Description
`clicks`	`int`	Number of scroll clicks to perform upward. Defaults to 1

DioramaComputerInterface.scroll_down

async def scroll_down(self, clicks = 1)

Scroll down by the specified number of clicks.

Parameters:

Name	Type	Description
`clicks`	`int`	Number of scroll clicks to perform downward. Defaults to 1

DioramaComputerInterface.drag_to

async def drag_to(self, x, y, duration = 0.5)

Drag from the current cursor position to the specified coordinates.

Parameters:

Name	Type	Description
`x`	`int`	X coordinate to drag to
`y`	`int`	Y coordinate to drag to
`duration`	`float`	Duration of the drag operation in seconds. Defaults to 0.5

DioramaComputerInterface.get_cursor_position

async def get_cursor_position(self)

Get the current cursor position.

Returns: dict: Dictionary containing the current cursor coordinates

DioramaComputerInterface.type_text

async def type_text(self, text)

Type the specified text at the current cursor position.

Parameters:

Name	Type	Description
`text`	`str`	The text to type

DioramaComputerInterface.press_key

async def press_key(self, key)

Press a single key.

Parameters:

Name	Type	Description
`key`	`Any`	The key to press

DioramaComputerInterface.hotkey

async def hotkey(self, keys = ())

Press multiple keys simultaneously as a hotkey combination.

Raises:

ValueError - If any key is not a Key enum or string type

DioramaComputerInterface.to_screen_coordinates

async def to_screen_coordinates(self, x, y)

Convert coordinates to screen coordinates.

Parameters:

Name	Type	Description
`x`	`int`	X coordinate to convert
`y`	`int`	Y coordinate to convert

Returns: dict: Dictionary containing the converted screen coordinates

helpers

Helper functions and decorators for the Computer module.

DependencyInfo

Inherits from: TypedDict

Attributes

Name	Type	Description
`import_statements`	`List[str]`
`definitions`	`List[tuple[str, Any]]`

set_default_computer

def set_default_computer(computer: Any) -> None

Set the default computer instance to be used by the remote decorator.

Parameters:

Name	Type	Description
`computer`	`Any`	The computer instance to use as default

sandboxed

def sandboxed(venv_name: str = 'default', computer: str = 'default', max_retries: int = 3) -> Callable[[Callable[P, R]], Callable[P, Awaitable[R]]]

Decorator that wraps a function to be executed remotely via computer.venv_exec.

The function is automatically analyzed for dependencies (imports, helper functions, constants, etc.) and reconstructed with all necessary code in the remote sandbox.

Parameters:

Name	Type	Description
`venv_name`	`Any`	Name of the virtual environment to execute in
`computer`	`Any`	The computer instance to use, or "default" to use the globally set default
`max_retries`	`Any`	Maximum number of retries for the remote execution

generate_source_code

def generate_source_code(func: FunctionType) -> str

Generate complete source code for a function with all dependencies.

Parameters:

Name	Type	Description
`func`	`Any`	The function to generate source code for

Returns: Complete Python source code as a string
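Dependency analysis of this kind can be approximated with the standard `ast` module. The sketch below is a simplified stand-in, not the SDK's algorithm: given module source, it reports which top-level `import x` or `import x as y` names a function actually uses. `imports_used_by` is a hypothetical helper; it ignores `from ... import` statements, helper functions, and constants.

```python
import ast
from typing import List

SOURCE = """
import json
import math
import os

def circle_area(radius):
    return math.pi * radius ** 2
"""


def imports_used_by(source: str, func_name: str) -> List[str]:
    """Return names bound by top-level imports that func_name references."""
    tree = ast.parse(source)
    imported = set()
    for node in tree.body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    func = next(
        node
        for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name == func_name
    )
    # Every bare identifier inside the function body, e.g. `math` in `math.pi`.
    used = {node.id for node in ast.walk(func) if isinstance(node, ast.Name)}
    return sorted(imported & used)


print(imports_used_by(SOURCE, "circle_area"))  # ['math']
```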

interface

Interface package for Computer SDK.

BaseComputerInterface

Inherits from: ABC

Base class for computer control interfaces.

Constructor

BaseComputerInterface(self, ip_address: str, username: str = 'lume', password: str = 'lume', api_key: Optional[str] = None, vm_name: Optional[str] = None)

Attributes

Name	Type	Description
`ip_address`	`Any`
`username`	`Any`
`password`	`Any`
`api_key`	`Any`
`vm_name`	`Any`
`logger`	`Any`
`delay`	`float`

Methods

BaseComputerInterface.wait_for_ready

async def wait_for_ready(self, timeout: int = 60) -> None

Wait for interface to be ready.

Parameters:

Name	Type	Description
`timeout`	`Any`	Maximum time to wait in seconds

Raises:

TimeoutError - If interface is not ready within timeout
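The timeout contract (return when ready, raise `TimeoutError` otherwise) can be sketched as a deadline-bounded polling loop. `wait_until_ready`, its `check` callback, and the `interval` parameter are hypothetical names for this illustration, not SDK API.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until_ready(
    check: Callable[[], Awaitable[bool]],
    timeout: float = 60,
    interval: float = 0.5,
) -> None:
    """Poll check() until it returns True, or raise TimeoutError at the deadline."""
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"interface not ready within {timeout} seconds")
        await asyncio.sleep(interval)


async def never_ready() -> bool:
    return False


try:
    asyncio.run(wait_until_ready(never_ready, timeout=0.05, interval=0.01))
    timed_out = False
except TimeoutError:
    timed_out = True

print(timed_out)  # True
```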

BaseComputerInterface.close

def close(self) -> None

Close the interface connection.

BaseComputerInterface.force_close

def force_close(self) -> None

Force close the interface connection.

By default, this just calls close(), but subclasses can override to provide more forceful cleanup.

BaseComputerInterface.mouse_down

async def mouse_down(self, x: Optional[int] = None, y: Optional[int] = None, button: MouseButton = 'left', delay: Optional[float] = None) -> None

Press and hold a mouse button.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate to press at. If None, uses current cursor position.
`y`	`Any`	Y coordinate to press at. If None, uses current cursor position.
`button`	`Any`	Mouse button to press ('left', 'middle', 'right').
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.mouse_up

async def mouse_up(self, x: Optional[int] = None, y: Optional[int] = None, button: MouseButton = 'left', delay: Optional[float] = None) -> None

Release a mouse button.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate to release at. If None, uses current cursor position.
`y`	`Any`	Y coordinate to release at. If None, uses current cursor position.
`button`	`Any`	Mouse button to release ('left', 'middle', 'right').
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.left_click

async def left_click(self, x: Optional[int] = None, y: Optional[int] = None, delay: Optional[float] = None) -> None

Perform a left mouse button click.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate to click at. If None, uses current cursor position.
`y`	`Any`	Y coordinate to click at. If None, uses current cursor position.
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.right_click

async def right_click(self, x: Optional[int] = None, y: Optional[int] = None, delay: Optional[float] = None) -> None

Perform a right mouse button click.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate to click at. If None, uses current cursor position.
`y`	`Any`	Y coordinate to click at. If None, uses current cursor position.
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.double_click

async def double_click(self, x: Optional[int] = None, y: Optional[int] = None, delay: Optional[float] = None) -> None

Perform a double left mouse button click.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate to double-click at. If None, uses current cursor position.
`y`	`Any`	Y coordinate to double-click at. If None, uses current cursor position.
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.move_cursor

async def move_cursor(self, x: int, y: int, delay: Optional[float] = None) -> None

Move the cursor to the specified screen coordinates.

Parameters:

Name	Type	Description
`x`	`Any`	X coordinate to move cursor to.
`y`	`Any`	Y coordinate to move cursor to.
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.drag_to

async def drag_to(self, x: int, y: int, button: str = 'left', duration: float = 0.5, delay: Optional[float] = None) -> None

Drag from current position to specified coordinates.

Parameters:

Name	Type	Description
`x`	`Any`	The x coordinate to drag to
`y`	`Any`	The y coordinate to drag to
`button`	`Any`	The mouse button to use ('left', 'middle', 'right')
`duration`	`Any`	How long the drag should take in seconds
`delay`	`Any`	Optional delay in seconds after the action

BaseComputerInterface.drag

async def drag(self, path: List[Tuple[int, int]], button: str = 'left', duration: float = 0.5, delay: Optional[float] = None) -> None

Drag the cursor along a path of coordinates.

Parameters:

Name	Type	Description
`path`	`Any`	List of (x, y) coordinate tuples defining the drag path
`button`	`Any`	The mouse button to use ('left', 'middle', 'right')
`duration`	`Any`	Total time in seconds that the drag operation should take
`delay`	`Any`	Optional delay in seconds after the action
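Since `path` is a list of (x, y) tuples, a simple way to build one is linear interpolation between two points. `line_path` below is a hypothetical helper for producing such a list; the result could then be passed as the `path` argument.

```python
from typing import List, Tuple


def line_path(
    start: Tuple[int, int], end: Tuple[int, int], steps: int = 4
) -> List[Tuple[int, int]]:
    """Return steps + 1 evenly spaced integer points from start to end, inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(steps + 1)
    ]


path = line_path((100, 100), (300, 200), steps=4)
print(path)  # [(100, 100), (150, 125), (200, 150), (250, 175), (300, 200)]
```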

BaseComputerInterface.key_down

async def key_down(self, key: str, delay: Optional[float] = None) -> None

Press and hold a key.

Parameters:

Name	Type	Description
`key`	`Any`	The key to press and hold (e.g., 'a', 'shift', 'ctrl').
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.key_up

async def key_up(self, key: str, delay: Optional[float] = None) -> None

Release a previously pressed key.

Parameters:

Name	Type	Description
`key`	`Any`	The key to release (e.g., 'a', 'shift', 'ctrl').
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.type_text

async def type_text(self, text: str, delay: Optional[float] = None) -> None

Type the specified text string.

Parameters:

Name	Type	Description
`text`	`Any`	The text string to type.
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.press_key

async def press_key(self, key: str, delay: Optional[float] = None) -> None

Press and release a single key.

Parameters:

Name	Type	Description
`key`	`Any`	The key to press (e.g., 'a', 'enter', 'escape').
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.hotkey

async def hotkey(self, keys: str = (), delay: Optional[float] = None) -> None

Press multiple keys simultaneously (keyboard shortcut).

Parameters:

Name	Type	Description
`keys`	`Any`	The keys to press together, given as key name strings.
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.scroll

async def scroll(self, x: int, y: int, delay: Optional[float] = None) -> None

Scroll the mouse wheel by specified amounts.

Parameters:

Name	Type	Description
`x`	`Any`	Horizontal scroll amount (positive = right, negative = left).
`y`	`Any`	Vertical scroll amount (positive = up, negative = down).
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.scroll_down

async def scroll_down(self, clicks: int = 1, delay: Optional[float] = None) -> None

Scroll down by the specified number of clicks.

Parameters:

Name	Type	Description
`clicks`	`Any`	Number of scroll clicks to perform downward.
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.scroll_up

async def scroll_up(self, clicks: int = 1, delay: Optional[float] = None) -> None

Scroll up by the specified number of clicks.

Parameters:

Name	Type	Description
`clicks`	`Any`	Number of scroll clicks to perform upward.
`delay`	`Any`	Optional delay in seconds after the action.

BaseComputerInterface.screenshot

async def screenshot(self) -> bytes

Take a screenshot.

Returns: Raw bytes of the screenshot image

BaseComputerInterface.get_screen_size

async def get_screen_size(self) -> Dict[str, int]

Get the screen dimensions.

Returns: Dict with 'width' and 'height' keys

BaseComputerInterface.get_cursor_position

async def get_cursor_position(self) -> Dict[str, int]

Get the current cursor position on screen.

Returns: Dict with 'x' and 'y' keys containing cursor coordinates.

BaseComputerInterface.copy_to_clipboard

async def copy_to_clipboard(self) -> str

Get the current clipboard content.

Returns: The text content currently stored in the clipboard.

BaseComputerInterface.set_clipboard

async def set_clipboard(self, text: str) -> None

Set the clipboard content to the specified text.

Parameters:

Name	Type	Description
`text`	`Any`	The text to store in the clipboard.

BaseComputerInterface.file_exists

async def file_exists(self, path: str) -> bool

Check if a file exists at the specified path.

Parameters:

Name	Type	Description
`path`	`Any`	The file path to check.

Returns: True if the file exists, False otherwise.

BaseComputerInterface.directory_exists

async def directory_exists(self, path: str) -> bool

Check if a directory exists at the specified path.

Parameters:

Name	Type	Description
`path`	`Any`	The directory path to check.

Returns: True if the directory exists, False otherwise.

BaseComputerInterface.list_dir

async def list_dir(self, path: str) -> List[str]

List the contents of a directory.

Parameters:

Name	Type	Description
`path`	`Any`	The directory path to list.

Returns: List of file and directory names in the specified directory.

BaseComputerInterface.read_text

async def read_text(self, path: str) -> str

Read the text contents of a file.

Parameters:

Name	Type	Description
`path`	`Any`	The file path to read from.

Returns: The text content of the file.

BaseComputerInterface.write_text

async def write_text(self, path: str, content: str) -> None

Write text content to a file.

Parameters:

Name	Type	Description
`path`	`Any`	The file path to write to.
`content`	`Any`	The text content to write.

BaseComputerInterface.read_bytes

async def read_bytes(self, path: str, offset: int = 0, length: Optional[int] = None) -> bytes

Read file binary contents with optional seeking support.

Parameters:

Name	Type	Description
`path`	`Any`	Path to the file
`offset`	`Any`	Byte offset to start reading from (default: 0)
`length`	`Any`	Number of bytes to read (default: None for entire file)
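The `offset` and `length` parameters make it possible to read a large file in fixed-size chunks. The sketch below shows that loop against an in-memory stand-in. `InMemoryFiles` and `read_in_chunks` are hypothetical names for this example; the stand-in only mirrors the documented `read_bytes` and `get_file_size` signatures.

```python
import asyncio
from typing import Any, Dict, Optional


class InMemoryFiles:
    """Stand-in exposing read_bytes and get_file_size with the documented signatures."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files

    async def get_file_size(self, path: str) -> int:
        return len(self._files[path])

    async def read_bytes(
        self, path: str, offset: int = 0, length: Optional[int] = None
    ) -> bytes:
        data = self._files[path]
        end = len(data) if length is None else offset + length
        return data[offset:end]


async def read_in_chunks(interface: Any, path: str, chunk_size: int = 4) -> bytes:
    """Read a whole file by requesting chunk_size bytes at increasing offsets."""
    size = await interface.get_file_size(path)
    chunks = []
    for offset in range(0, size, chunk_size):
        chunks.append(await interface.read_bytes(path, offset=offset, length=chunk_size))
    return b"".join(chunks)


files = InMemoryFiles({"/tmp/demo.bin": b"0123456789"})
content = asyncio.run(read_in_chunks(files, "/tmp/demo.bin"))
print(content)  # b'0123456789'
```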

BaseComputerInterface.write_bytes

async def write_bytes(self, path: str, content: bytes) -> None

Write binary content to a file.

Parameters:

Name	Type	Description
`path`	`Any`	The file path to write to.
`content`	`Any`	The binary content to write.

BaseComputerInterface.delete_file

async def delete_file(self, path: str) -> None

Delete a file at the specified path.

Parameters:

Name	Type	Description
`path`	`Any`	The file path to delete.

BaseComputerInterface.create_dir

async def create_dir(self, path: str) -> None

Create a directory at the specified path.

Parameters:

Name	Type	Description
`path`	`Any`	The directory path to create.

BaseComputerInterface.delete_dir

async def delete_dir(self, path: str) -> None

Delete a directory at the specified path.

Parameters:

Name	Type	Description
`path`	`Any`	The directory path to delete.

BaseComputerInterface.get_file_size

async def get_file_size(self, path: str) -> int

Get the size of a file in bytes.

Parameters:

Name	Type	Description
`path`	`Any`	The file path to get the size of.

Returns: The size of the file in bytes.

BaseComputerInterface.get_desktop_environment

async def get_desktop_environment(self) -> str

Get the current desktop environment.

Returns: The name of the current desktop environment.

BaseComputerInterface.set_wallpaper

async def set_wallpaper(self, path: str) -> None

Set the desktop wallpaper to the image at the specified path.

Parameters:

Name	Type	Description
`path`	`str`	Path to the image file to use as the wallpaper.

BaseComputerInterface.open

async def open(self, target: str) -> None

Open a target using the system's default handler.

Typically opens files, folders, or URLs with the associated application.

Parameters:

Name	Type	Description
`target`	`str`	The file path, folder path, or URL to open.

BaseComputerInterface.launch

async def launch(self, app: str, args: List[str] | None = None) -> Optional[int]

Launch an application with optional arguments.

Parameters:

Name	Type	Description
`app`	`str`	The application executable or bundle identifier.
`args`	`List[str] | None`	Optional list of arguments to pass to the application.

Returns: The process ID (PID) of the launched application if available, otherwise None.
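Because `launch` may return `None` instead of a PID, callers need to handle both outcomes. A minimal sketch follows, assuming a hypothetical `FakeLauncher` stand-in that mimics the documented signature and reports a PID for only one made-up application name; `launch_and_report` is an illustrative helper.

```python
import asyncio
from typing import List, Optional


class FakeLauncher:
    """Hypothetical stand-in; reports a PID only for one made-up app name."""

    async def launch(self, app: str, args: Optional[List[str]] = None) -> Optional[int]:
        return 4242 if app == "xterm" else None


async def launch_and_report(interface, app: str, args: Optional[List[str]] = None) -> str:
    pid = await interface.launch(app, args)
    # Compare against None explicitly instead of relying on truthiness.
    if pid is None:
        return f"{app} launched (no PID reported)"
    return f"{app} launched with PID {pid}"


launcher = FakeLauncher()
print(asyncio.run(launch_and_report(launcher, "xterm", ["-e", "top"])))
print(asyncio.run(launch_and_report(launcher, "com.example.App")))
```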

BaseComputerInterface.get_current_window_id

async def get_current_window_id(self) -> int | str

Get the identifier of the currently active/focused window.

Returns: A window identifier that can be used with other window management methods.

BaseComputerInterface.get_application_windows

async def get_application_windows(self, app: str) -> List[int | str]

Get all window identifiers for a specific application.

Parameters:

Name	Type	Description
`app`	`str`	The application name, executable, or identifier to query.

Returns: A list of window identifiers belonging to the specified application.

BaseComputerInterface.get_window_name

async def get_window_name(self, window_id: int | str) -> str

Get the title/name of a window.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

Returns: The window's title or name string.

BaseComputerInterface.get_window_size

async def get_window_size(self, window_id: int | str) -> tuple[int, int]

Get the size of a window in pixels.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

Returns: A tuple of (width, height) representing the window size in pixels.

BaseComputerInterface.get_window_position

async def get_window_position(self, window_id: int | str) -> tuple[int, int]

Get the screen position of a window.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

Returns: A tuple of (x, y) representing the window's top-left corner in screen coordinates.
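The window query methods are designed to be chained: list an application's windows, then look up the name, size, and position of each. The sketch below shows one way to do that. `FakeWindowInterface`, `WindowInfo`, and `describe_windows` are hypothetical names introduced for illustration, and the two hard-coded windows are made-up data.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

WindowId = Union[int, str]


@dataclass
class WindowInfo:
    window_id: WindowId
    name: str
    size: Tuple[int, int]
    position: Tuple[int, int]


class FakeWindowInterface:
    """Hypothetical stand-in with two hard-coded windows for one application."""

    def __init__(self) -> None:
        self._windows: Dict[WindowId, WindowInfo] = {
            101: WindowInfo(101, "Editor - notes.txt", (800, 600), (0, 0)),
            "w-2": WindowInfo("w-2", "Editor - todo.txt", (640, 480), (120, 80)),
        }

    async def get_application_windows(self, app: str) -> List[WindowId]:
        return list(self._windows)

    async def get_window_name(self, window_id: WindowId) -> str:
        return self._windows[window_id].name

    async def get_window_size(self, window_id: WindowId) -> Tuple[int, int]:
        return self._windows[window_id].size

    async def get_window_position(self, window_id: WindowId) -> Tuple[int, int]:
        return self._windows[window_id].position


async def describe_windows(interface, app: str) -> List[WindowInfo]:
    """Collect name, size and position for every window of one application."""
    infos = []
    for window_id in await interface.get_application_windows(app):
        infos.append(
            WindowInfo(
                window_id=window_id,
                name=await interface.get_window_name(window_id),
                size=await interface.get_window_size(window_id),
                position=await interface.get_window_position(window_id),
            )
        )
    return infos


for info in asyncio.run(describe_windows(FakeWindowInterface(), "Editor")):
    print(info)
```

Window identifiers are typed `int | str` in the signatures, so the sketch treats them as opaque keys and never does arithmetic on them.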

BaseComputerInterface.set_window_size

async def set_window_size(self, window_id: int | str, width: int, height: int) -> None

Set the size of a window in pixels.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.
`width`	`int`	Desired width in pixels.
`height`	`int`	Desired height in pixels.

BaseComputerInterface.set_window_position

async def set_window_position(self, window_id: int | str, x: int, y: int) -> None

Move a window to a specific position on the screen.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.
`x`	`int`	X coordinate for the window's top-left corner.
`y`	`int`	Y coordinate for the window's top-left corner.
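`set_window_size` and `set_window_position` together are enough to build simple tiling helpers. The sketch below snaps a window to the left half of the screen. It assumes the caller already knows the screen dimensions and passes them in. `FakeGeometryInterface` is a hypothetical stand-in that only records what was requested, and `snap_left` is an illustrative helper name.

```python
import asyncio
from typing import Dict, Tuple, Union

WindowId = Union[int, str]


class FakeGeometryInterface:
    """Hypothetical stand-in that just records the last requested geometry."""

    def __init__(self) -> None:
        self.sizes: Dict[WindowId, Tuple[int, int]] = {}
        self.positions: Dict[WindowId, Tuple[int, int]] = {}

    async def set_window_size(self, window_id: WindowId, width: int, height: int) -> None:
        self.sizes[window_id] = (width, height)

    async def set_window_position(self, window_id: WindowId, x: int, y: int) -> None:
        self.positions[window_id] = (x, y)


async def snap_left(interface, window_id: WindowId, screen_width: int, screen_height: int) -> None:
    """Place a window in the left half of a screen of the given size."""
    await interface.set_window_position(window_id, 0, 0)
    await interface.set_window_size(window_id, screen_width // 2, screen_height)


geometry = FakeGeometryInterface()
asyncio.run(snap_left(geometry, "w-1", screen_width=1920, screen_height=1080))
print(geometry.positions["w-1"], geometry.sizes["w-1"])
```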

BaseComputerInterface.maximize_window

async def maximize_window(self, window_id: int | str) -> None

Maximize a window.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

BaseComputerInterface.minimize_window

async def minimize_window(self, window_id: int | str) -> None

Minimize a window.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

BaseComputerInterface.activate_window

async def activate_window(self, window_id: int | str) -> None

Bring a window to the foreground and focus it.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

BaseComputerInterface.close_window

async def close_window(self, window_id: int | str) -> None

Close a window.

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

BaseComputerInterface.get_window_title

async def get_window_title(self, window_id: int | str) -> str

Convenience alias for get_window_name().

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

Returns: The window's title or name string.

BaseComputerInterface.window_size

async def window_size(self, window_id: int | str) -> tuple[int, int]

Convenience alias for get_window_size().

Parameters:

Name	Type	Description
`window_id`	`int | str`	The window identifier.

Returns: A tuple of (width, height) representing the window size in pixels.

BaseComputerInterface.run_command

async def run_command(self, command: str) -> CommandResult

Run a shell command and return a structured result.

The command is executed with subprocess.run using shell=True and check=False. It runs in the target environment, and both stdout and stderr are captured.

Parameters:

Name	Type	Description
`command`	`str`	The shell command to execute

Returns: CommandResult, a structured result containing:

- stdout (str): Standard output from the command
- stderr (str): Standard error from the command
- returncode (int): Exit code from the command (0 indicates success)

Raises:

RuntimeError - If the command execution fails at the system level

Example:

result = await interface.run_command("ls -la")
if result.returncode == 0:
    print(f"Output: {result.stdout}")
else:
    print(f"Error: {result.stderr}, Exit code: {result.returncode}")
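To make the `shell=True` and `check=False` behavior concrete, here is a local sketch of what the description above implies. It uses only the standard library and runs on the local machine, not in a VM. `LocalCommandResult` and `run_locally` are hypothetical names that mirror the three documented result fields; they are not the SDK's `CommandResult` or its implementation.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class LocalCommandResult:
    """Hypothetical local mirror of the three documented result fields."""

    stdout: str
    stderr: str
    returncode: int


def run_locally(command: str) -> LocalCommandResult:
    # check=False: a non-zero exit code is reported in the result, not raised.
    completed = subprocess.run(
        command, shell=True, check=False, capture_output=True, text=True
    )
    return LocalCommandResult(completed.stdout, completed.stderr, completed.returncode)


ok = run_locally("echo hello")
failed = run_locally("exit 3")
print(ok.returncode, ok.stdout.strip())
print(failed.returncode)
```

The practical consequence is that a failing command does not raise by itself; callers are expected to inspect `returncode`, as the example above this sketch does.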

BaseComputerInterface.get_accessibility_tree

async def get_accessibility_tree(self) -> Dict

Get the accessibility tree of the current screen.

Returns: Dict containing the hierarchical accessibility information of screen elements.
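The reference does not specify the keys of the returned dictionary, so any consumer has to discover the structure at runtime. The sketch below walks an arbitrary nested structure of dicts and lists without assuming particular key names. The `sample_tree` data, including its `role`, `title`, and `children` keys, is entirely hypothetical and only serves to exercise the traversal; `iter_nodes` is an illustrative helper.

```python
from typing import Any, Iterator, Tuple


def iter_nodes(node: Any, depth: int = 0) -> Iterator[Tuple[int, dict]]:
    """Yield (depth, dict) for every dict nested anywhere inside node."""
    if isinstance(node, dict):
        yield depth, node
        for value in node.values():
            yield from iter_nodes(value, depth + 1)
    elif isinstance(node, list):
        # A list does not add a level; its items share the same depth.
        for item in node:
            yield from iter_nodes(item, depth)


# Hypothetical data: the real tree's keys are not documented here.
sample_tree = {
    "role": "window",
    "title": "Demo",
    "children": [
        {"role": "button", "title": "OK"},
        {"role": "group", "children": [{"role": "text", "title": "Hello"}]},
    ],
}

for depth, element in iter_nodes(sample_tree):
    print("  " * depth + str(element.get("role")))
```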

BaseComputerInterface.to_screen_coordinates

async def to_screen_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert screenshot coordinates to screen coordinates.

Parameters:

Name	Type	Description
`x`	`float`	X coordinate in screenshot space
`y`	`float`	Y coordinate in screenshot space

Returns: tuple[float, float]: (x, y) coordinates in screen space

BaseComputerInterface.to_screenshot_coordinates

async def to_screenshot_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert screen coordinates to screenshot coordinates.

Parameters:

Name	Type	Description
`x`	`float`	X coordinate in screen space
`y`	`float`	Y coordinate in screen space

Returns: tuple[float, float]: (x, y) coordinates in screenshot space
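The two conversion methods are inverses of each other, which matters when a point is located in a screenshot (for example by a vision model) and then has to be clicked on the real screen. The reference does not say how the mapping is computed. The sketch below assumes the simplest plausible model, a pure per-axis scale between screenshot size and screen size, purely for illustration. `ScaledInterface` is a hypothetical stand-in and the sizes are made-up values.

```python
import asyncio
from typing import Tuple


class ScaledInterface:
    """Hypothetical stand-in modelling the mapping as a per-axis scale."""

    def __init__(self, screen_size: Tuple[int, int], screenshot_size: Tuple[int, int]) -> None:
        self._scale_x = screen_size[0] / screenshot_size[0]
        self._scale_y = screen_size[1] / screenshot_size[1]

    async def to_screen_coordinates(self, x: float, y: float) -> Tuple[float, float]:
        return (x * self._scale_x, y * self._scale_y)

    async def to_screenshot_coordinates(self, x: float, y: float) -> Tuple[float, float]:
        return (x / self._scale_x, y / self._scale_y)


async def demo() -> Tuple[Tuple[float, float], Tuple[float, float]]:
    interface = ScaledInterface(screen_size=(2880, 1800), screenshot_size=(1440, 900))
    # A point located in the screenshot.
    on_screen = await interface.to_screen_coordinates(100, 50)
    # Converting back should return the original screenshot point.
    back = await interface.to_screenshot_coordinates(*on_screen)
    return on_screen, back


on_screen, back = asyncio.run(demo())
print(on_screen, back)
```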

create_interface_for_os

def create_interface_for_os(os: OSType, ip_address: str, api_port: Optional[int] = None, api_key: Optional[str] = None, vm_name: Optional[str] = None) -> BaseComputerInterface

Create an interface for the specified OS.

Parameters:

Name	Type	Description
`os`	`OSType`	Operating system type ('macos', 'linux', or 'windows')
`ip_address`	`str`	IP address of the computer to control
`api_port`	`Optional[int]`	Optional API port of the computer to control
`api_key`	`Optional[str]`	Optional API key for cloud authentication
`vm_name`	`Optional[str]`	Optional VM name for cloud authentication

Returns: BaseComputerInterface: The appropriate interface for the OS

Raises:

ValueError - If the OS type is not supported

MacOSComputerInterface

Inherits from: GenericComputerInterface

Interface for macOS.

Constructor

MacOSComputerInterface(self, ip_address: str, username: str = 'lume', password: str = 'lume', api_key: Optional[str] = None, vm_name: Optional[str] = None, api_port: Optional[int] = None)

Methods

MacOSComputerInterface.diorama_cmd

async def diorama_cmd(self, action: str, arguments: Optional[dict] = None) -> dict

Send a diorama command to the server (macOS only).
