Windows App behind VPN
Automate legacy Windows desktop applications behind VPN with Cua
Overview
This guide demonstrates how to automate Windows desktop applications (such as eGecko HR/payroll systems) that run behind a corporate VPN. This is a common enterprise scenario in which legacy desktop applications require manual data entry, report generation, or workflow execution.
Use cases:
- HR/payroll processing (employee onboarding, payroll runs, benefits administration)
- Desktop ERP systems behind corporate networks
- Legacy financial applications requiring VPN access
- Compliance reporting from on-premise systems
Architecture:
- Client-side Cua agent (Python SDK or Playground UI)
- Windows VM/Sandbox with VPN client configured
- RDP/remote desktop connection to target environment
- Desktop application automation via computer vision and UI control
Production Deployment: For production use, consider workflow mining and custom finetuning to create vertical-specific actions (e.g., "Run payroll", "Onboard employee") instead of generic UI automation. This provides better audit trails and higher success rates.
Video Demo
Demo showing Cua automating an eGecko-like desktop application on Windows behind AWS VPN
Set Up Your Environment
Install the required dependencies:
Create a requirements.txt file:
cua-agent
cua-computer
python-dotenv>=1.0.0
Install the dependencies:
pip install -r requirements.txt
Create a .env file with your API keys:
ANTHROPIC_API_KEY=your-anthropic-api-key
CUA_API_KEY=sk_cua-api01...
CUA_SANDBOX_NAME=your-windows-sandbox
Configure Windows Sandbox with VPN
For enterprise deployments, use Cua Cloud Sandbox with pre-configured VPN:
- Go to cua.ai/signin
- Navigate to Dashboard > Containers > Create Instance
- Create a Windows sandbox (Medium or Large for desktop apps)
- Configure VPN settings:
  - Upload your AWS VPN Client configuration (.ovpn file)
  - Or configure VPN credentials directly in the dashboard
- Note your sandbox name and API key
Your Windows sandbox will launch with VPN automatically connected.
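Before connecting to the sandbox, it can help to confirm that the values from the .env file are actually set, since a missing key otherwise surfaces later as a less obvious error. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical, and the variable names are the ones from the .env example in this guide.

```python
import os

# Variable names taken from the .env example in this guide.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "CUA_API_KEY", "CUA_SANDBOX_NAME")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

A script could call `missing_env_vars()` right after `load_dotenv()` and exit with a clear message if the returned list is not empty.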
For local development on Windows 10 Pro/Enterprise or Windows 11:
- Enable Windows Sandbox
- Install the pywinsandbox dependency: pip install -U git+https://github.com/karkason/pywinsandbox.git
- Create a VPN setup script that runs on sandbox startup
- Configure your desktop application installation within the sandbox
Manual VPN Setup: Windows Sandbox requires manual VPN configuration each time it starts. For production use, consider Cloud Sandbox or self-hosted VMs with persistent VPN connections.
For self-managed infrastructure:
- Deploy Windows VM on your preferred cloud (AWS, Azure, GCP)
- Install and configure VPN client (AWS VPN Client, OpenVPN, etc.)
- Install target desktop application and any dependencies
- Install cua-computer-server:
  pip install cua-computer-server
  python -m computer_server
- Configure firewall rules to allow Cua agent connections
Create Your Automation Script
Create a Python file (e.g., hr_automation.py):
import asyncio
import logging
import os

from agent import ComputerAgent
from computer import Computer, VMProviderType
from dotenv import load_dotenv

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
load_dotenv()


async def automate_hr_workflow():
    """
    Automate HR/payroll desktop application workflow.
    This example demonstrates:
    - Launching Windows desktop application
    - Navigating complex desktop UI
    - Data entry and form filling
    - Report generation and export
    """
    try:
        # Connect to Windows Cloud Sandbox with VPN
        async with Computer(
            os_type="windows",
            provider_type=VMProviderType.CLOUD,
            name=os.environ["CUA_SANDBOX_NAME"],
            api_key=os.environ["CUA_API_KEY"],
            verbosity=logging.INFO,
        ) as computer:
            # Configure agent with specialized instructions
            agent = ComputerAgent(
                model="cua/anthropic/claude-sonnet-4.5",
                tools=[computer],
                only_n_most_recent_images=3,
                verbosity=logging.INFO,
                trajectory_dir="trajectories",
                use_prompt_caching=True,
                max_trajectory_budget=10.0,
                instructions="""
You are automating a Windows desktop HR/payroll application.
IMPORTANT GUIDELINES:
- Always wait for windows and dialogs to fully load before interacting
- Look for loading indicators and wait for them to disappear
- Verify each action by checking on-screen confirmation messages
- If a button or field is not visible, try scrolling or navigating tabs
- Desktop apps often have nested menus - explore systematically
- Save work frequently using File > Save or Ctrl+S
- Before closing, always verify changes were saved
COMMON UI PATTERNS:
- Menu bar navigation (File, Edit, View, etc.)
- Ribbon interfaces with tabs
- Modal dialogs that block interaction
- Data grids/tables for viewing records
- Form fields with validation
- Status bars showing operation progress
                """.strip()
            )
            # Define workflow tasks
            tasks = [
                "Launch the HR application from the desktop or start menu",
                "Log in with the credentials shown in credentials.txt on the desktop",
                "Navigate to Employee Management section",
                "Create a new employee record with information from new_hire.xlsx on desktop",
                "Verify the employee was created successfully by searching for their name",
                "Generate an onboarding report for the new employee",
                "Export the report as PDF to the desktop",
                "Log out of the application"
            ]
            history = []
            for task in tasks:
                logger.info(f"\n{'='*60}")
                logger.info(f"Task: {task}")
                logger.info(f"{'='*60}\n")
                history.append({"role": "user", "content": task})
                async for result in agent.run(history):
                    for item in result.get("output", []):
                        if item.get("type") == "message":
                            content = item.get("content", [])
                            for block in content:
                                if block.get("type") == "text":
                                    response = block.get("text", "")
                                    logger.info(f"Agent: {response}")
                                    history.append({"role": "assistant", "content": response})
                logger.info("\nTask completed. Moving to next task...\n")
            logger.info("\n" + "="*60)
            logger.info("All tasks completed successfully!")
            logger.info("="*60)
    except Exception as e:
        logger.error(f"Error during automation: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    asyncio.run(automate_hr_workflow())

For Windows Sandbox (local development):
import asyncio
import logging
import os

from agent import ComputerAgent
from computer import Computer, VMProviderType
from dotenv import load_dotenv

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
load_dotenv()


async def automate_hr_workflow():
    try:
        # Connect to Windows Sandbox
        async with Computer(
            os_type="windows",
            provider_type=VMProviderType.WINDOWS_SANDBOX,
            verbosity=logging.INFO,
        ) as computer:
            agent = ComputerAgent(
                model="cua/anthropic/claude-sonnet-4.5",
                tools=[computer],
                only_n_most_recent_images=3,
                verbosity=logging.INFO,
                trajectory_dir="trajectories",
                use_prompt_caching=True,
                max_trajectory_budget=10.0,
                instructions="""
You are automating a Windows desktop HR/payroll application.
IMPORTANT GUIDELINES:
- Always wait for windows and dialogs to fully load before interacting
- Verify each action by checking on-screen confirmation messages
- Desktop apps often have nested menus - explore systematically
- Save work frequently using File > Save or Ctrl+S
                """.strip()
            )
            tasks = [
                "Launch the HR application from the desktop",
                "Log in with credentials from credentials.txt on desktop",
                "Navigate to Employee Management and create new employee from new_hire.xlsx",
                "Generate and export onboarding report as PDF",
                "Log out of the application"
            ]
            history = []
            for task in tasks:
                logger.info(f"\nTask: {task}")
                history.append({"role": "user", "content": task})
                async for result in agent.run(history):
                    for item in result.get("output", []):
                        if item.get("type") == "message":
                            content = item.get("content", [])
                            for block in content:
                                if block.get("type") == "text":
                                    response = block.get("text", "")
                                    logger.info(f"Agent: {response}")
                                    history.append({"role": "assistant", "content": response})
            logger.info("\nAll tasks completed!")
    except Exception as e:
        logger.error(f"Error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    asyncio.run(automate_hr_workflow())

For a self-hosted Windows VM:
import asyncio
import logging
import os

from agent import ComputerAgent
from computer import Computer
from dotenv import load_dotenv

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
load_dotenv()


async def automate_hr_workflow():
    try:
        # Connect to self-hosted Windows VM running computer-server
        async with Computer(
            use_host_computer_server=True,
            base_url="http://your-windows-vm-ip:5757",  # Update with your VM IP
            verbosity=logging.INFO,
        ) as computer:
            agent = ComputerAgent(
                model="cua/anthropic/claude-sonnet-4.5",
                tools=[computer],
                only_n_most_recent_images=3,
                verbosity=logging.INFO,
                trajectory_dir="trajectories",
                use_prompt_caching=True,
                max_trajectory_budget=10.0,
                instructions="""
You are automating a Windows desktop HR/payroll application.
IMPORTANT GUIDELINES:
- Always wait for windows and dialogs to fully load before interacting
- Verify each action by checking on-screen confirmation messages
- Save work frequently using File > Save or Ctrl+S
                """.strip()
            )
            tasks = [
                "Launch the HR application",
                "Log in with provided credentials",
                "Complete the required HR workflow",
                "Generate and export report",
                "Log out"
            ]
            history = []
            for task in tasks:
                logger.info(f"\nTask: {task}")
                history.append({"role": "user", "content": task})
                async for result in agent.run(history):
                    for item in result.get("output", []):
                        if item.get("type") == "message":
                            content = item.get("content", [])
                            for block in content:
                                if block.get("type") == "text":
                                    response = block.get("text", "")
                                    logger.info(f"Agent: {response}")
                                    history.append({"role": "assistant", "content": response})
            logger.info("\nAll tasks completed!")
    except Exception as e:
        logger.error(f"Error: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    asyncio.run(automate_hr_workflow())

Run Your Automation
Execute the script:
python hr_automation.py
The agent will:
- Connect to your Windows environment (with VPN if configured)
- Launch and navigate the desktop application
- Execute each workflow step sequentially
- Verify actions and handle errors
- Save trajectory logs for audit and debugging
Monitor the console output to see the agent's progress through each task.
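The scripts above pull the agent's text out of each result with several nested loops. That logic can be factored into a small helper, which keeps the task loop short and makes the parsing easy to test on its own. This is a minimal sketch: the helper name `extract_text` is hypothetical, and it assumes only the result shape the scripts in this guide already rely on (an "output" list of items, where "message" items carry "content" blocks of type "text").

```python
def extract_text(result):
    """Collect the text blocks from the message items in one agent result.

    Items that are not messages, and content blocks that are not text,
    are skipped. Missing keys are treated as empty.
    """
    texts = []
    for item in result.get("output", []):
        if item.get("type") != "message":
            continue
        for block in item.get("content", []):
            if block.get("type") == "text":
                texts.append(block.get("text", ""))
    return texts
```

Inside the task loop, each string returned by `extract_text(result)` can then be logged and appended to `history` as an assistant message, exactly as the scripts do inline.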
Key Configuration Options
Agent Instructions
The instructions parameter is critical for reliable desktop automation:
instructions="""
You are automating a Windows desktop HR/payroll application.
IMPORTANT GUIDELINES:
- Always wait for windows and dialogs to fully load before interacting
- Look for loading indicators and wait for them to disappear
- Verify each action by checking on-screen confirmation messages
- If a button or field is not visible, try scrolling or navigating tabs
- Desktop apps often have nested menus - explore systematically
- Save work frequently using File > Save or Ctrl+S
- Before closing, always verify changes were saved
COMMON UI PATTERNS:
- Menu bar navigation (File, Edit, View, etc.)
- Ribbon interfaces with tabs
- Modal dialogs that block interaction
- Data grids/tables for viewing records
- Form fields with validation
- Status bars showing operation progress
APPLICATION-SPECIFIC:
- Login is at top-left corner
- Employee records are under "HR Management" > "Employees"
- Reports are generated via "Tools" > "Reports" > "Generate"
- Always click "Save" before navigating away from a form
""".strip()
Budget Management
For long-running workflows, adjust budget limits:
agent = ComputerAgent(
    model="cua/anthropic/claude-sonnet-4.5",
    tools=[computer],
    max_trajectory_budget=20.0,  # Increase for complex workflows
    # ... other params
)
Image Retention
Balance context and cost by retaining only recent screenshots:
agent = ComputerAgent(
    # ...
    only_n_most_recent_images=3,  # Keep last 3 screenshots
    # ...
)
Production Considerations
Production Deployment
For enterprise production deployments, consider these additional steps:
1. Workflow Mining
Before deploying, analyze your actual workflows:
- Record user interactions with the application
- Identify common patterns and edge cases
- Map out decision trees and validation requirements
- Document application-specific quirks and timing issues
2. Custom Finetuning
Create vertical-specific actions instead of generic UI automation:
# Instead of generic steps:
tasks = ["Click login", "Type username", "Type password", "Click submit"]
# Create semantic actions:
tasks = ["onboard_employee", "run_payroll", "generate_compliance_report"]
This provides:
- Better audit trails
- Approval gates at business logic level
- Higher success rates
- Easier maintenance and updates
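Short of actual finetuning, one lightweight way to approximate semantic actions is a plain lookup table that expands each action name into the natural-language steps the agent receives, with an approval flag checked at the action level. The sketch below shows that substitute technique only; it is not a finetuned model and not part of the Cua SDK. The `Workflow` class, the `WORKFLOWS` registry, the `expand` function, and the payroll step wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """One semantic action: the steps to send and whether it needs sign-off."""
    steps: tuple
    needs_approval: bool = False


# Hypothetical registry. The onboarding steps reuse task strings from this
# guide; the payroll steps are illustrative placeholders.
WORKFLOWS = {
    "onboard_employee": Workflow(
        steps=(
            "Navigate to Employee Management section",
            "Create a new employee record with information from new_hire.xlsx on desktop",
            "Generate an onboarding report for the new employee",
        ),
    ),
    "run_payroll": Workflow(
        steps=(
            "Open the payroll module",
            "Start the payroll run for the current period",
        ),
        needs_approval=True,
    ),
}


def expand(action, approved=False):
    """Return the task list for a semantic action, enforcing its approval gate."""
    workflow = WORKFLOWS[action]
    if workflow.needs_approval and not approved:
        raise PermissionError(f"{action} requires approval before it can run")
    return list(workflow.steps)
```

The list returned by `expand("onboard_employee")` can be used as the `tasks` list in the scripts above, and the action name gives the audit log a business-level label for the whole run.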
3. Human-in-the-Loop
Add approval gates for critical operations:
agent = ComputerAgent(
    model="cua/anthropic/claude-sonnet-4.5",
    tools=[computer],
    # Add human approval callback for sensitive operations.
    # ApprovalCallback is not defined in this guide; supply your own implementation.
    callbacks=[ApprovalCallback(require_approval_for=["payroll", "termination"])]
)
4. Deployment Options
Choose your deployment model:
Managed (Recommended)
- Cua hosts Windows sandboxes, VPN/RDP stack, and agent runtime
- You get UI/API endpoints for triggering workflows
- Automatic scaling, monitoring, and maintenance
- SLA guarantees and enterprise support
Self-Hosted
- You manage Windows VMs, VPN infrastructure, and agent deployment
- Full control over data and security
- Custom network configurations
- On-premise or your preferred cloud
Troubleshooting
VPN Connection Issues
If the agent cannot reach the application:
- Verify VPN is connected: Check VPN client status in the Windows sandbox
- Test network connectivity: Try pinging internal resources
- Check firewall rules: Ensure RDP and application ports are open
- Review VPN logs: Look for authentication or routing errors
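Ping relies on ICMP, which many corporate networks block, so a TCP connection attempt to the application's own port is often a more telling test. The following is a minimal sketch using only the standard library; the function name `can_connect` is hypothetical, and it needs to run inside the Windows environment where the VPN is connected, not on the client machine.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any socket-level failure (refused, unreachable, timed out, or a name
    that does not resolve) is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("hr-db.internal.example", 1433)` would check a database port on an internal host; both the hostname and the port here are placeholders for your own environment.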
Application Not Launching
If the desktop application fails to start:
- Verify installation: Check the application is installed in the sandbox
- Check dependencies: Ensure all required DLLs and frameworks are present
- Review permissions: Application may require admin rights
- Check logs: Look for error messages in Windows Event Viewer
UI Element Not Found
If the agent cannot find buttons or fields:
- Increase wait times: Some applications load slowly
- Check screen resolution: UI elements may be off-screen
- Verify DPI scaling: High DPI settings can affect element positions
- Update instructions: Provide more specific navigation guidance
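One way to give a slow application more time is to retry a failed step after a growing delay. The following is a minimal sketch of retry with exponential backoff; `retry_async` and `run_step` are hypothetical names, with `run_step` standing in for whatever coroutine runs a single task. It is only appropriate for steps that are safe to repeat, since retrying a step that partially succeeded could enter data twice.

```python
import asyncio


async def retry_async(run_step, attempts=3, base_delay=2.0):
    """Await run_step(), retrying on failure with exponentially growing delays.

    The delay before retry k is base_delay * 2**(k - 1) seconds. The last
    failure is re-raised so the caller still sees the original error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await run_step()
        except Exception:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With the defaults, a step that keeps failing is tried three times, waiting 2 seconds and then 4 seconds between attempts.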
Cost Management
If costs are higher than expected:
- Reduce max_trajectory_budget
- Decrease only_n_most_recent_images
- Use prompt caching: Set use_prompt_caching=True
- Optimize task descriptions: Be more specific to reduce retry attempts
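To make the effect of lowering only_n_most_recent_images concrete, the sketch below shows the general idea of dropping all but the newest screenshots from a conversation while keeping everything else. It is an illustration only, not Cua's internal implementation: the function name `keep_recent_images` and the message shape (dicts with a "type" key equal to "image" for screenshots) are assumptions made for the example.

```python
def keep_recent_images(messages, n):
    """Return messages with all but the last n image entries dropped.

    Non-image entries are always kept, and the original order is preserved.
    """
    image_positions = [i for i, m in enumerate(messages) if m.get("type") == "image"]
    # Slicing with [-0:] would keep every image, so n <= 0 is handled explicitly.
    keep = set(image_positions[-n:]) if n > 0 else set()
    return [m for i, m in enumerate(messages) if m.get("type") != "image" or i in keep]
```

Because screenshots are typically the largest items in the context, a smaller n tends to reduce cost per step, at the price of the agent seeing less of what the screen looked like earlier.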
Next Steps
- Explore custom tools: Learn how to create custom tools for application-specific actions
- Implement callbacks: Add monitoring and logging for production workflows
- Join community: Get help in our Discord
Related Examples
- Form Filling - Web form automation
- Post-Event Contact Export - Data extraction workflows
- Custom Tools - Building application-specific functions