computeruse-rs 2.0.0

A Playwright-style SDK for automating desktop GUI applications

🤖 Computer Use MCP that controls your entire desktop

Give AI assistants (Claude, Cursor, VS Code, etc.) the ability to control your desktop and automate tasks across any application.

Claude Code (one-liner):

claude mcp add computeruse "npx -y computeruse-mcp-agent@latest"

Other clients (Cursor, VS Code, Windsurf, etc.):

Add to your MCP config file:

{
  "mcpServers": {
    "computeruse-mcp-agent": {
      "command": "npx",
      "args": ["-y", "computeruse-mcp-agent@latest"],
      "env": {
        "LOG_LEVEL": "info",
        "RUST_BACKTRACE": "1"
      }
    }
  }
}

See the MCP Agent README for detailed setup instructions.
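
If your client already has an MCP config file with other servers in it, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge in Python, assuming the config uses the top-level "mcpServers" key shown above. The function name add_computeruse_server is hypothetical and is not part of ComputerUse; the location of the config file depends on your client and is not handled here.

```python
import json
from copy import deepcopy

# Server entry copied from the JSON snippet in this README.
COMPUTERUSE_ENTRY = {
    "command": "npx",
    "args": ["-y", "computeruse-mcp-agent@latest"],
    "env": {"LOG_LEVEL": "info", "RUST_BACKTRACE": "1"},
}


def add_computeruse_server(config: dict) -> dict:
    """Return a copy of `config` with the computeruse-mcp-agent entry added.

    Hypothetical helper for illustration: existing servers are kept,
    and the input dictionary is not modified.
    """
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["computeruse-mcp-agent"] = deepcopy(COMPUTERUSE_ENTRY)
    return merged


if __name__ == "__main__":
    # Example: a config that already lists another (made-up) server.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_computeruse_server(existing), indent=2))
```

Load your client's config with json.load, pass it through the helper, and write the result back with json.dump.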

Why ComputerUse MCP?

  • Uses your existing browser session - no need to log in again, and your cookies and auth are preserved
  • Doesn't take over your cursor or keyboard - runs in the background without interrupting your work
  • Works across pixels, the DOM, and the accessibility tree for maximum reliability

Use Cases

  • Create a new instance on GCP, connect to it using CLI
  • Check logs on Vercel to find the most common errors
  • Test my app's new features based on recent commits

🧠 Why ComputerUse

For Developers

  • Create automations that work across any desktop app or browser
  • Runs 100x faster than ChatGPT Agents, Claude, Perplexity Comet, BrowserBase, and BrowserUse (deterministic execution at CPU speed, with AI recovery)
  • A 95% success rate, unlike most overhyped computer-use products
  • MIT-licensed: fork it, ship it, no lock-in

We achieve this by pre-training workflows as deterministic code, and calling AI only when recovery is needed.
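
The idea of deterministic steps with AI used only for recovery can be sketched in a few lines of Python. This is an illustration of the general pattern, not ComputerUse's implementation: run_workflow, the step functions, and fake_ai_recovery are all hypothetical names, and the "AI" here is a stand-in function that only decides whether a failure counts as recovered.

```python
from typing import Callable, List

Step = Callable[[], None]
Recover = Callable[[int, Exception], bool]


def run_workflow(steps: List[Step], recover: Recover) -> List[str]:
    """Run steps in order; call `recover` only when a step raises.

    Stops at the first step that fails and cannot be recovered.
    """
    log: List[str] = []
    for index, step in enumerate(steps):
        try:
            step()  # fast path: plain deterministic code, no model call
            log.append(f"step {index}: ok")
        except Exception as error:
            # slow path: hand the failure to the recovery callback
            if recover(index, error):
                log.append(f"step {index}: recovered")
            else:
                log.append(f"step {index}: failed")
                break
    return log


def demo() -> List[str]:
    """Three made-up steps, the second of which fails and is recovered."""
    def open_app() -> None:
        pass

    def click_missing_button() -> None:
        raise LookupError("button not found")

    def type_text() -> None:
        pass

    def fake_ai_recovery(index: int, error: Exception) -> bool:
        # Stand-in for a model call: treats lookup failures as recoverable.
        return isinstance(error, LookupError)

    return run_workflow([open_app, click_missing_button, type_text], fake_ai_recovery)


if __name__ == "__main__":
    print(demo())
```

In this pattern the recovery callback is the only place where a model would be invoked, so a run in which every step succeeds never leaves the deterministic path.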

For Teams

Our public beta workflow builder + managed hosting:

  • Record and map your processes, then implement the workflow without technical skills
  • Deploy AI to execute them at >95% success rate without managing hundreds of Windows VMs
  • Eliminate repetitive work without the complexity, implementation cost, and maintenance cost of legacy RPA

Feature Support

ComputerUse supports Windows, macOS, and Linux.

Feature | Status (Windows, macOS, Linux) | Notes

Core Automation
Element Locators | | Find elements by name, role, window, etc.
UI Actions (click, type) | | Core interactions with UI elements.
Application Management | | Launch, list, and manage applications.
Window Management | | Get active window, list windows.

Advanced Features
Browser Automation | | Chrome extension enables browser control.
Workflow Recording | 🟡 🟡 | Record human workflows for deterministic automation.
Monitor Management | | Multi-display support.
Screen & Element Capture | | Take screenshots of displays or elements.

Libraries
Python (computeruse.py) | 🟡 🟡 🟡 | pip install computeruse
TypeScript (@elizaos/computeruse) | | npm i @elizaos/computeruse
Workflow (@mediar-ai/workflow) | 🟡 🟡 | npm i @mediar-ai/workflow
CLI (@elizaos/cli) | | npm i @elizaos/cli
KV (@mediar-ai/kv) | | npm i @mediar-ai/kv
MCP (computeruse-mcp-agent) | | npx -y computeruse-mcp-agent --add-to-app [app]
Rust (computeruse-rs) | | cargo add computeruse-rs

Legend:

  • ✅: Supported - The feature is stable and well-tested.
  • 🟡: Partial / Experimental - The feature is in development and may have limitations.
  • ❌: Not Supported - Not available on this platform.

Platform Notes:

  • macOS: Requires Accessibility permissions (System Settings → Privacy & Security → Accessibility)
  • Linux: Requires AT-SPI2 (enabled by default on GNOME/KDE). For X11, install wmctrl and xdotool.
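
On Linux with X11, a quick check that the two command-line tools named above are on your PATH can save a debugging round trip. The sketch below uses only the Python standard library; missing_x11_tools is a hypothetical helper name, and the check does not cover AT-SPI2 itself, only wmctrl and xdotool.

```python
import shutil
import sys
from typing import Callable, List, Optional

# Tools this README lists as required for X11.
X11_TOOLS = ("wmctrl", "xdotool")


def missing_x11_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names of the X11 tools that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in X11_TOOLS if which(tool) is None]


if __name__ == "__main__":
    if sys.platform.startswith("linux"):
        missing = missing_x11_tools()
        if missing:
            print("missing tools: " + ", ".join(missing))
        else:
            print("wmctrl and xdotool found")
    else:
        print("not running on Linux; nothing to check")
```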

🕵️ How to Inspect Accessibility Elements (like name:Seven)

To create reliable selectors (e.g. name:Seven, role:Button, window:Calculator), you need to inspect the accessibility tree of the target application, using the inspection tool for your platform:

Windows

  • Tool: Accessibility Insights for Windows
  • Alt: Inspect.exe (comes with Windows SDK)
  • Usage: Open the app you want to inspect → launch Accessibility Insights → hover or use keyboard navigation to explore the UI tree (Name, Role, ControlType, AutomationId).

macOS

  • Tool: Accessibility Inspector (included with Xcode)
  • Alt: osascript -e 'tell application "System Events" to entire contents of window 1 of application process "Safari"'
  • Usage: In Xcode, choose Xcode → Open Developer Tool → Accessibility Inspector. Select the target process and inspect each element's AXRole, AXTitle, and AXIdentifier attributes.
  • Requirement: Grant Accessibility permissions to your automation script/terminal (System Settings → Privacy & Security → Accessibility).

Linux (Experimental)

  • Tool: Accerciser (GNOME Accessibility Inspector)
  • Install: sudo apt install accerciser or sudo dnf install accerciser
  • Usage: Open Accerciser → navigate the AT-SPI tree to find element names, roles, and states.

These tools show you the Name, Role, ControlType, and other metadata used in ComputerUse selectors.
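
To make the shape of the selector strings used in this section concrete, here is a small Python sketch that splits a string such as name:Seven into its kind and value. It only covers the three prefixes that appear in this README (name, role, window) and is an assumption about the format for illustration; the real ComputerUse selector grammar may accept more kinds or other syntax. Selector, KNOWN_KINDS, and parse_selector are hypothetical names.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    kind: str   # e.g. "name", "role", "window"
    value: str  # e.g. "Seven", "Button", "Calculator"


# Only the prefixes shown in this README; the real set may be larger.
KNOWN_KINDS = {"name", "role", "window"}


def parse_selector(text: str) -> Selector:
    """Split a "kind:value" string at the first colon and validate it."""
    kind, separator, value = text.partition(":")
    kind = kind.strip().lower()
    value = value.strip()
    if not separator or not value:
        raise ValueError(f"expected 'kind:value', got {text!r}")
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown selector kind {kind!r} in {text!r}")
    return Selector(kind, value)


if __name__ == "__main__":
    for raw in ("name:Seven", "role:Button", "window:Calculator"):
        print(parse_selector(raw))
```

The value is taken verbatim from what the inspection tool reports, so a name copied from Accessibility Insights, Accessibility Inspector, or Accerciser goes after the colon unchanged.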

Platform Support

Platform CLI MCP Agent Automation Installation Method
Windows npm/bunx
macOS npm/bunx (requires Accessibility permissions)
Linux npm/bunx (requires AT-SPI2, wmctrl/xdotool)

Note:

  • macOS requires Accessibility permissions for your terminal/app
  • Linux requires AT-SPI2 (default on GNOME/KDE) and wmctrl/xdotool for X11 window management

Troubleshooting

For detailed troubleshooting, debugging, and MCP server logs, send us a message.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests. Many parts are experimental, and help is appreciated.