Agent Loops
Supported computer-using agent loops and models
An agent can be thought of as a loop that generates actions, executes them, and repeats until done:
- Generate: Your model generates output_text, computer_call, function_call
- Execute: The computer safely executes those items
- Complete: If the model has no more calls, it's done!
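The three steps above can be sketched without any real model or sandbox. This is a minimal illustration only: `run_loop`, `fake_generate`, and `fake_execute` are hypothetical names invented here, not part of the agent library, and the item shapes are assumed to follow the response format shown later on this page.

```python
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]
CALL_TYPES = ("computer_call", "function_call")


def run_loop(
    generate: Callable[[List[Item]], List[Item]],
    execute: Callable[[Item], Item],
    messages: List[Item],
    max_steps: int = 10,
) -> List[Item]:
    """Generate items, execute any calls, and repeat until no calls remain."""
    history = list(messages)
    for _ in range(max_steps):
        items = generate(history)  # Generate: the model produces output items
        history.extend(items)
        calls = [item for item in items if item["type"] in CALL_TYPES]
        if not calls:  # Complete: no more calls means the loop is done
            break
        for call in calls:  # Execute: run each call and record its output
            history.append(execute(call))
    return history


# Hypothetical stand-ins for a real model and a real computer.
def fake_generate(history: List[Item]) -> List[Item]:
    if any(item.get("type") == "computer_call_output" for item in history):
        return [{
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "I can see a desktop."}],
        }]
    return [{"type": "computer_call", "action": {"type": "screenshot"}, "call_id": "call_1"}]


def fake_execute(call: Item) -> Item:
    return {
        "type": "computer_call_output",
        "call_id": call["call_id"],
        "output": {"image_url": "data:image/png;base64,..."},
    }


history = run_loop(fake_generate, fake_execute, [{"role": "user", "content": "Take a screenshot"}])
print(history[-1]["content"][0]["text"])
```

The `max_steps` cap is an assumption added so the sketch always terminates, even if the stand-in model never stops issuing calls.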
To run an agent loop simply do:
from agent import ComputerAgent
import asyncio
from computer import Computer

async def take_screenshot():
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-sandbox-name",
        api_key="your-api-key"
    ) as computer:
        agent = ComputerAgent(
            model="anthropic/claude-sonnet-4-5-20250929",
            tools=[computer],
            max_trajectory_budget=5.0
        )
        messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]
        async for result in agent.run(messages):
            for item in result["output"]:
                if item["type"] == "message":
                    print(item["content"][0]["text"])

if __name__ == "__main__":
    asyncio.run(take_screenshot())

For a list of supported models and configurations, see the Supported Agents page.
Response Format
{
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "output_text", "text": "I can see..."}]
    },
    {
      "type": "computer_call",
      "action": {"type": "screenshot"},
      "call_id": "call_123"
    },
    {
      "type": "computer_call_output",
      "call_id": "call_123",
      "output": {"image_url": "data:image/png;base64,..."}
    }
  ],
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 75,
    "total_tokens": 225,
    "response_cost": 0.01
  }
}

Environment Variables
Use the following environment variables to configure the agent and its access to cloud computers and LLM providers:
# Computer instance (cloud)
export CUA_SANDBOX_NAME="your-sandbox-name"
export CUA_API_KEY="your-cua-api-key"
# LLM API keys
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENAI_API_KEY="your-openai-key"

Input and output
The input prompt passed to ComputerAgent.run can be either a string or a list of message dictionaries:
messages = [
    {
        "role": "user",
        "content": "Take a screenshot and describe what you see"
    },
    {
        "role": "assistant",
        "content": "I'll take a screenshot for you."
    }
]

The output is an AsyncGenerator that yields response chunks.
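Consuming those chunks might look like the sketch below. `fake_run` is a hypothetical stand-in for `agent.run(messages)`, so the example runs without a sandbox or API key. It also assumes each chunk carries its own `usage` block that can be summed across chunks, which the page does not state explicitly.

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List, Tuple


async def fake_run() -> AsyncGenerator[Dict[str, Any], None]:
    """Hypothetical stand-in for agent.run(messages); yields two chunks."""
    yield {
        "output": [{"type": "computer_call", "action": {"type": "screenshot"}, "call_id": "call_123"}],
        "usage": {"total_tokens": 225, "response_cost": 0.01},
    }
    yield {
        "output": [{
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "I can see..."}],
        }],
        "usage": {"total_tokens": 100, "response_cost": 0.02},
    }


async def collect(chunks: AsyncGenerator[Dict[str, Any], None]) -> Tuple[List[str], int, float]:
    """Gather assistant text and accumulate usage across all chunks."""
    texts: List[str] = []
    tokens, cost = 0, 0.0
    async for chunk in chunks:
        for item in chunk.get("output", []):
            if item.get("type") == "message":
                texts.extend(
                    part["text"] for part in item["content"] if part.get("type") == "output_text"
                )
        usage = chunk.get("usage", {})
        tokens += usage.get("total_tokens", 0)
        cost += usage.get("response_cost", 0.0)
    return texts, tokens, cost


texts, tokens, cost = asyncio.run(collect(fake_run()))
print(texts, tokens, round(cost, 2))
```

Using `.get` with defaults keeps the consumer tolerant of chunks that omit `usage` or contain item types it does not handle.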
Parameters
The ComputerAgent constructor provides a wide range of options for customizing agent behavior, tool integration, callbacks, resource management, and more.
- model (str): Required. The LLM or agent model to use. Determines which agent loop is selected unless custom_loop is provided (e.g., "claude-sonnet-4-5-20250929", "computer-use-preview", "omni+vertex_ai/gemini-pro").
- tools (List[Any]): List of tools the agent can use (e.g., Computer, sandboxed Python functions, etc.).
- custom_loop (Callable): Optional custom agent loop function. If provided, overrides automatic loop selection.
- only_n_most_recent_images (int): If set, only the N most recent images are kept in the message history. Useful for limiting memory usage. Automatically adds ImageRetentionCallback.
- callbacks (List[Any]): List of callback instances for advanced preprocessing, postprocessing, logging, or custom hooks. See Callbacks & Extensibility.
- verbosity (int): Logging level (e.g., logging.INFO). If set, adds a logging callback.
- trajectory_dir (str): Directory path to save full trajectory data, including screenshots and responses. Adds TrajectorySaverCallback.
- max_retries (int): Default: 3. Maximum number of retries for failed API calls.
- screenshot_delay (float|int): Default: 0.5. Delay (in seconds) before taking screenshots.
- use_prompt_caching (bool): Default: False. Enables prompt caching for repeated prompts (mainly for Anthropic models).
- max_trajectory_budget (float|dict): If set, adds a budget manager callback that tracks usage costs and stops execution if the budget is exceeded. A dict allows advanced options (e.g., { "max_budget": 5.0, "raise_error": True }).
- instructions (str|list[str]): System instructions for the agent. Can be a single string or multiple strings in a tuple/list for readability; they are concatenated into one system prompt.
- api_key (str): Optional API key override for the model provider.
- api_base (str): Optional API base URL override for the model provider.
- **additional_generation_kwargs (any): Any additional keyword arguments are passed through to the agent loop or model provider.
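To make the effect of only_n_most_recent_images concrete, here is a rough sketch of the idea of pruning older screenshots from a message history. It is not the library's ImageRetentionCallback: `keep_recent_images` is a hypothetical helper, and dropping each stale output together with its matching call (by `call_id`) is an assumption made to keep the history consistent.

```python
from typing import Any, Dict, List


def keep_recent_images(messages: List[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Return a copy of messages that keeps only the n most recent screenshot outputs."""

    def has_image(item: Dict[str, Any]) -> bool:
        return (
            item.get("type") == "computer_call_output"
            and "image_url" in item.get("output", {})
        )

    # call_ids of every screenshot output, oldest first
    image_ids = [item["call_id"] for item in messages if has_image(item)]
    # everything except the last n is stale (n <= 0 means drop all images)
    stale = set(image_ids[:-n]) if n > 0 else set(image_ids)
    # items without a call_id (e.g. user messages) are always kept
    return [item for item in messages if item.get("call_id") not in stale]


history: List[Dict[str, Any]] = [{"role": "user", "content": "Open the browser"}]
for i in range(1, 4):
    history.append({"type": "computer_call", "action": {"type": "screenshot"}, "call_id": f"call_{i}"})
    history.append({
        "type": "computer_call_output",
        "call_id": f"call_{i}",
        "output": {"image_url": "data:image/png;base64,..."},
    })

trimmed = keep_recent_images(history, 2)
print(len(history), len(trimmed))
```

The function returns a new list instead of mutating its input, so the full history stays available for logging or trajectory saving.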
Example with advanced options:
import logging

from agent import ComputerAgent
from computer import Computer
from agent.callbacks import ImageRetentionCallback

agent = ComputerAgent(
    model="anthropic/claude-sonnet-4-5-20250929",
    tools=[Computer(...)],
    # only_n_most_recent_images already adds an ImageRetentionCallback;
    # the explicit callback below only illustrates the callbacks parameter.
    only_n_most_recent_images=3,
    callbacks=[ImageRetentionCallback(only_n_most_recent_images=3)],
    verbosity=logging.INFO,
    trajectory_dir="trajectories",
    max_retries=5,
    screenshot_delay=1.0,
    use_prompt_caching=True,
    max_trajectory_budget={"max_budget": 5.0, "raise_error": True},
    # Multiple strings are concatenated into one system prompt.
    instructions=(
        "You are a helpful computer-using agent",
        "Output computer calls until you complete the given task",
    ),
    api_key="your-api-key",
    api_base="https://your-api-base.com/v1",
)

Streaming Responses
async for result in agent.run(messages, stream=True):
    # Process streaming chunks
    for item in result["output"]:
        if item["type"] == "message":
            print(item["content"][0]["text"], end="", flush=True)
        elif item["type"] == "computer_call":
            action = item["action"]
            print(f"\n[Action: {action['type']}]")

Error Handling
# BudgetExceededException must be imported before use;
# its import path is not shown on this page.
try:
    async for result in agent.run(messages):
        # Process results
        pass
except BudgetExceededException:
    print("Budget limit exceeded")
except Exception as e:
    print(f"Agent error: {e}")
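The budget-related error path can be explored with a self-contained sketch that needs no sandbox. Everything here is hypothetical: `BudgetExceeded`, `normalize_budget`, and `run_with_budget` are illustrative names, not the library's API. The sketch also assumes that a plain float budget stops the run quietly, while a dict with "raise_error": True raises.

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List, Tuple, Union


class BudgetExceeded(Exception):
    """Hypothetical stand-in for the library's budget exception."""


def normalize_budget(budget: Union[float, Dict[str, Any]]) -> Dict[str, Any]:
    """Accept either a float or a dict, mirroring max_trajectory_budget."""
    if isinstance(budget, dict):
        return {
            "max_budget": float(budget["max_budget"]),
            "raise_error": bool(budget.get("raise_error", False)),
        }
    return {"max_budget": float(budget), "raise_error": False}  # assumed default


async def run_with_budget(
    costs: List[float], budget: Union[float, Dict[str, Any]]
) -> AsyncGenerator[Dict[str, Any], None]:
    """Yield one chunk per step until the accumulated cost exceeds the budget."""
    cfg = normalize_budget(budget)
    spent = 0.0
    for cost in costs:
        spent += cost
        if spent > cfg["max_budget"]:
            if cfg["raise_error"]:
                raise BudgetExceeded(f"spent {spent:.2f} of {cfg['max_budget']:.2f}")
            return  # stop quietly when raise_error is off
        yield {"output": [], "usage": {"response_cost": cost}}


async def main(budget: Union[float, Dict[str, Any]]) -> Tuple[int, str]:
    steps = 0
    try:
        async for _ in run_with_budget([2.0, 2.0, 2.0], budget):
            steps += 1
    except BudgetExceeded:
        return steps, "exceeded"
    return steps, "finished"


result_raise = asyncio.run(main({"max_budget": 5.0, "raise_error": True}))
result_quiet = asyncio.run(main(5.0))
print(result_raise, result_quiet)
```

An exception raised inside an async generator surfaces at the consumer's `async for`, which is why the `try` block wraps the whole loop and not just the call that creates the generator.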