docs.rs failed to build tauri-plugin-agent-control-0.1.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.

tauri-plugin-agent-control

Dev-only HTTP bridge for Tauri webviews — like Chrome DevTools Protocol, but for Tauri.

What it does

tauri-plugin-agent-control exposes your Tauri webview over a local HTTP API at http://localhost:9876. You can snapshot the DOM, click buttons, fill inputs, evaluate JavaScript, intercept network requests, take screenshots, and more — all via simple curl commands or any HTTP client. The server only runs in debug builds, so it's completely stripped from production.

Installation

1. Add the Rust dependency

From crates.io:

[dependencies]
tauri-plugin-agent-control = "0.1"

Or from git:

[dependencies]
tauri-plugin-agent-control = { git = "https://github.com/palanik1/tauri-plugin-agent-control" }

2. Register the plugin

In your src-tauri/src/main.rs (or lib.rs):

fn main() {
    tauri::Builder::default()
        .plugin(tauri_plugin_agent_control::init())
        .run(tauri::generate_context!())
        .expect("error while running tauri application");
}

3. Add capability permissions

In your src-tauri/capabilities/default.json (create it if it doesn't exist):

{
  "identifier": "default",
  "windows": ["main"],
  "permissions": [
    "agent-control:default"
  ]
}

agent-control:default grants allow-agent-control-respond — the only Tauri command the plugin registers. It allows the JS shim to send eval results back to the Rust HTTP server.

4. Run in dev mode

cargo tauri dev

The HTTP bridge starts automatically on http://localhost:9876 in debug builds. Verify it's running:

curl -s http://localhost:9876/health
# {"ok":true}

Production builds: The server is completely stripped — it's gated behind #[cfg(debug_assertions)]. No code, no port, no surface area.

AI Agent Setup

Install the Amp/AI agent skill

Copy the SKILL.md from this package into your project's skill directory:

# Project-level (recommended)
mkdir -p .agents/skills/tauri-agent-control
cp node_modules/tauri-plugin-agent-control/SKILL.md .agents/skills/tauri-agent-control/SKILL.md

# Or user-level (available across all projects)
mkdir -p ~/.config/agents/skills/tauri-agent-control
cp node_modules/tauri-plugin-agent-control/SKILL.md ~/.config/agents/skills/tauri-agent-control/SKILL.md

Or install via npm and copy:

npm install tauri-plugin-agent-control
cp node_modules/tauri-plugin-agent-control/SKILL.md .agents/skills/tauri-agent-control/SKILL.md

Once installed, your AI agent (Amp, Claude Code, etc.) can interact with your running Tauri app via natural language:

"Take a screenshot of the app"
"Click the Submit button"
"Fill in the email field with test@example.com"
"Check if the sidebar is visible"

The agent uses the skill to translate these into the appropriate curl commands to the HTTP bridge.

Quick Start

1. Snapshot the page to see what's on screen:

curl http://localhost:9876/snapshot?format=compact

[page] My App — http://localhost:1420/
  @e1 [button] "Submit" role="button"
  @e2 [input type="text"] placeholder="Enter name" name="username"
  @e3 [a] "Home" href="/"

2. Interact with elements using their refs:

# Fill an input
curl -X POST http://localhost:9876/fill \
  -d '{"ref":"@e2","text":"hello world"}'

# Click a button
curl -X POST http://localhost:9876/click \
  -d '{"ref":"@e1"}'

3. Re-snapshot to see the updated state:

curl http://localhost:9876/snapshot?format=compact

4. Evaluate arbitrary JavaScript:

curl -X POST http://localhost:9876/eval \
  -d '{"code":"document.title"}'

API Reference

Snapshot

Method	Path	Query / Body	Description
`GET`	`/snapshot`	`?format=compact\|json` `&scope=<selector>` `&depth=<n>` `&interactive=true\|false`	Snapshot visible elements. `compact` returns a text tree, `json` returns full element data. `interactive=false` includes all elements, not just interactive ones.

Interactions

Method	Path	Body	Description
`POST`	`/click`	`{"ref":"@e1"}`	Click an element
`POST`	`/dblclick`	`{"ref":"@e1"}`	Double-click an element
`POST`	`/hover`	`{"ref":"@e1"}`	Hover over an element (mouseenter + mouseover)
`POST`	`/focus`	`{"ref":"@e1"}`	Focus an element
`POST`	`/fill`	`{"ref":"@e1","text":"value"}`	Set input/textarea value directly (fires input + change)
`POST`	`/type`	`{"ref":"@e1","text":"value"}`	Type text character-by-character (fires keydown/keypress/input/keyup per char)
`POST`	`/press`	`{"key":"Enter"}`	Press a key. Supports modifiers: `"Control+a"`, `"Meta+Shift+z"`
`POST`	`/check`	`{"ref":"@e1"}`	Check a checkbox
`POST`	`/uncheck`	`{"ref":"@e1"}`	Uncheck a checkbox
`POST`	`/select`	`{"ref":"@e1","values":["opt1","opt2"]}`	Select options in a `<select>` element
`POST`	`/scroll`	`{"direction":"down","amount":500}`	Scroll the page. Optional `"selector"` to scroll a container. Directions: `up`, `down`, `left`, `right`
`POST`	`/scrollintoview`	`{"ref":"@e1"}`	Scroll an element into view (smooth, centered)
`POST`	`/drag`	`{"from":"@e1","to":"@e2"}`	Drag and drop between two elements
`POST`	`/upload`	`{"ref":"@e1","dataUrl":"data:..."}`	Upload a file to a file input via base64 data URL

Getters

Method	Path	Description
`GET`	`/get/title`	Get `document.title`
`GET`	`/get/url`	Get `location.href`
`GET`	`/get/text/{ref}`	Get `textContent` of an element
`GET`	`/get/html/{ref}`	Get `innerHTML` of an element
`GET`	`/get/value/{ref}`	Get `value` of an input/textarea
`GET`	`/get/attr/{ref}/{attr}`	Get an attribute value
`GET`	`/get/styles/{ref}`	Get computed styles (font, color, display, position, etc.)
`GET`	`/get/box/{ref}`	Get bounding box `{x, y, width, height}`
`GET`	`/get/count/{selector}`	Count elements matching a CSS selector (URL-encode the selector)

State Checks

Method	Path	Description
`GET`	`/is/visible/{ref}`	Check if element is visible
`GET`	`/is/enabled/{ref}`	Check if element is enabled (not disabled)
`GET`	`/is/checked/{ref}`	Check if checkbox/radio is checked

Wait

Method	Path	Body	Description
`POST`	`/wait`	`{"ms":1000}`	Wait for a fixed delay
`POST`	`/wait`	`{"selector":".loaded"}`	Wait for a CSS selector to appear
`POST`	`/wait`	`{"text":"Success"}`	Wait for text to appear on the page
`POST`	`/wait`	`{"url":"**/dashboard"}`	Wait for URL to match (glob patterns supported)
`POST`	`/wait`	`{"fn":"document.querySelector('.x')"}`	Wait for a JS expression to be truthy
`POST`	`/wait`	`{"selector":".x","timeout":10000}`	Custom timeout (default: 5000ms)

Eval

Method	Path	Body	Description
`POST`	`/eval`	`{"code":"document.title"}`	Evaluate arbitrary JavaScript and return the result

Console / Errors

Method	Path	Description
`GET`	`/console`	Get and drain captured console logs (log, warn, error, info, debug)
`GET`	`/errors`	Get and drain captured console errors only

Cookies

Method	Path	Body	Description
`GET`	`/cookies`	—	Get all cookies as `{name: value}`
`POST`	`/cookies/set`	`{"name":"x","value":"y"}`	Set a cookie
`POST`	`/cookies/clear`	—	Clear all cookies

Storage

Method	Path	Body	Description
`GET`	`/storage/local`	—	Get all localStorage entries
`GET`	`/storage/session`	—	Get all sessionStorage entries
`GET`	`/storage/local/{key}`	—	Get a single localStorage key
`GET`	`/storage/session/{key}`	—	Get a single sessionStorage key
`POST`	`/storage/set`	`{"type":"local","key":"k","value":"v"}`	Set a storage entry. `type` is `"local"` or `"session"`
`POST`	`/storage/clear`	`{"type":"local"}`	Clear all entries for a storage type

Navigation

Method	Path	Description
`POST`	`/back`	Navigate back (`history.back()`)
`POST`	`/forward`	Navigate forward (`history.forward()`)
`POST`	`/reload`	Reload the page (`location.reload()`)

Find (Semantic Locators)

Method	Path	Body	Description
`POST`	`/find`	`{"by":"role","value":"button","name":"Submit"}`	Find by ARIA role with optional accessible name
`POST`	`/find`	`{"by":"text","value":"Hello"}`	Find by visible text content
`POST`	`/find`	`{"by":"label","value":"Email"}`	Find by associated label text or `aria-label`
`POST`	`/find`	`{"by":"placeholder","value":"Search..."}`	Find by placeholder attribute
`POST`	`/find`	`{"by":"testid","value":"submit-btn"}`	Find by `data-testid`
`POST`	`/find`	`{"by":"alt","value":"Logo"}`	Find by `alt` attribute
`POST`	`/find`	`{"by":"title","value":"Close"}`	Find by `title` attribute
`POST`	`/find`	`{"by":"first","value":"button"}`	First element matching selector
`POST`	`/find`	`{"by":"last","value":".item"}`	Last element matching selector
`POST`	`/find`	`{"by":"nth","value":"li","index":2}`	Nth element matching selector (0-based)

All /find calls return {"ref":"@eN","tag":"...","text":"..."} and accept optional "action" (click, fill, type, hover, focus), "text" (for fill/type), and "exact":true for exact text matching.

Mouse / Keyboard

Method	Path	Body	Description
`POST`	`/mouse`	`{"action":"move","x":100,"y":200}`	Move mouse to coordinates
`POST`	`/mouse`	`{"action":"down","x":100,"y":200,"button":"left"}`	Mouse button down
`POST`	`/mouse`	`{"action":"up","x":100,"y":200}`	Mouse button up
`POST`	`/mouse`	`{"action":"wheel","deltaY":100}`	Mouse wheel scroll
`POST`	`/keydown`	`{"key":"Shift"}`	Dispatch keydown event
`POST`	`/keyup`	`{"key":"Shift"}`	Dispatch keyup event

Highlight

Method	Path	Body	Description
`POST`	`/highlight`	`{"ref":"@e1"}`	Visually highlight an element with a red overlay (2s duration)

Dialogs

Method	Path	Body	Description
`POST`	`/dialog`	`{"action":"accept"}`	Auto-accept future alert/confirm/prompt dialogs
`POST`	`/dialog`	`{"action":"accept","text":"input"}`	Accept with custom prompt text
`POST`	`/dialog`	`{"action":"dismiss"}`	Auto-dismiss future dialogs
`GET`	`/dialogs`	—	Get and drain captured dialog events

Viewport

Method	Path	Body	Description
`POST`	`/viewport`	`{"width":375,"height":812}`	Resize the webview window

State

Method	Path	Body	Description
`POST`	`/state/save`	—	Save current localStorage, sessionStorage, and cookies
`POST`	`/state/load`	`{"localStorage":{...},"sessionStorage":{...},"cookies":{...}}`	Restore previously saved state

Screenshot

Method	Path	Query	Description
`GET`	`/screenshot`	`?path=/tmp/shot.png` `&full=true`	Take a screenshot. Default path: `/tmp/agent-control-screenshot-<ts>.png`. `full=true` resizes to full page dimensions first.

Recording

Method	Path	Description
`POST`	`/record/start`	Start screen recording (macOS `screencapture -v`). Returns `{pid, path}`
`POST`	`/record/stop`	Stop recording (sends SIGINT to screencapture process). Returns `{path}`

Download

Method	Path	Body	Description
`POST`	`/download`	`{"url":"https://...","path":"/tmp/file.zip"}`	Download a file using `curl`

Network

Method	Path	Body	Description
`POST`	`/network/intercept`	—	Start intercepting fetch/XHR requests
`POST`	`/network/reset`	—	Stop intercepting and restore original fetch/XHR
`GET`	`/network/requests`	—	Get captured network request log
`POST`	`/network/route`	`{"url":"/api/data","body":"{\"mock\":true}"}`	Mock responses for matching URLs
`POST`	`/network/route`	`{"url":"/api/data","abort":true}`	Block requests to matching URLs

Geolocation

Method	Path	Body	Description
`POST`	`/geo`	`{"lat":37.7749,"lng":-122.4194}`	Override `navigator.geolocation`

Offline

Method	Path	Body	Description
`POST`	`/offline`	`{"enabled":true}`	Simulate offline mode (`navigator.onLine` → false)

Headers

Method	Path	Body	Description
`POST`	`/headers`	`{"headers":{"Authorization":"Bearer token"}}`	Inject custom headers into all future fetch requests

Trace

Method	Path	Description
`POST`	`/trace/start`	Start a `PerformanceObserver` trace (resource, paint, LCP, layout-shift, etc.)
`POST`	`/trace/stop`	Stop tracing and return collected performance entries

Health

Method	Path	Description
`GET`	`/health`	Returns `{"ok":true}` — useful to check if the agent-control server is running

Ref Lifecycle

Every call to /snapshot or /find assigns elements a ref like @e1, @e2, etc. These refs are invalidated whenever you take a new snapshot — the ref map is cleared and rebuilt from scratch. If the page navigates or the DOM changes significantly, take a fresh snapshot before interacting.

# 1. Snapshot → get refs
curl http://localhost:9876/snapshot?format=compact
# @e1 [button] "Save"

# 2. Use the ref
curl -X POST http://localhost:9876/click -d '{"ref":"@e1"}'

# 3. After DOM changes, snapshot again for fresh refs
curl http://localhost:9876/snapshot?format=compact
# @e1 [button] "Saved ✓"   ← refs are reassigned

Semantic Locators (Find)

The /find endpoint lets you locate elements without needing a snapshot first. It assigns a new ref to the found element and optionally performs an action in one call.

# Find a button by role and click it
curl -X POST http://localhost:9876/find \
  -d '{"by":"role","value":"button","name":"Submit","action":"click"}'

# Find an input by label and fill it
curl -X POST http://localhost:9876/find \
  -d '{"by":"label","value":"Email","action":"fill","text":"user@example.com"}'

# Find by test ID
curl -X POST http://localhost:9876/find \
  -d '{"by":"testid","value":"search-input","action":"type","text":"query"}'

# Find the 3rd list item (0-based index)
curl -X POST http://localhost:9876/find \
  -d '{"by":"nth","value":"li","index":2}'

Supported locator strategies: role, text, label, placeholder, testid, alt, title, first, last, nth.

Use "exact":true for exact text/name matching instead of substring matching.

Network Interception

Intercept, mock, and block network requests made by the webview:

# Start intercepting
curl -X POST http://localhost:9876/network/intercept

# Mock an API response
curl -X POST http://localhost:9876/network/route \
  -d '{"url":"/api/users","body":"{\"users\":[{\"name\":\"Alice\"}]}"}'

# Block requests to analytics
curl -X POST http://localhost:9876/network/route \
  -d '{"url":"analytics.example.com","abort":true}'

# View captured requests
curl http://localhost:9876/network/requests

# Reset (restore original fetch/XHR)
curl -X POST http://localhost:9876/network/reset

The interceptor patches window.fetch and XMLHttpRequest. Tauri IPC calls (ipc:// URLs) are automatically excluded to avoid breaking invoke().

Security

Debug-only: The HTTP server is gated behind #[cfg(debug_assertions)] and is completely compiled out of release builds.
Localhost-only: The server listens on localhost:9876. It is not accessible from other machines.
Not for production: This plugin gives full control over the webview — eval, DOM manipulation, cookie/storage access. Never ship it in a release build.

Platform Notes

Feature	Platform	Dependency
Screenshot (`/screenshot`)	macOS only	`screencapture` (built-in)
Recording (`/record/start`, `/record/stop`)	macOS only	`screencapture -v` (built-in)
Download (`/download`)	Cross-platform	`curl` (must be on PATH)

Architecture

The plugin is composed of three parts:

Rust HTTP server — A raw TCP server built with tokio::net::TcpListener that manually parses HTTP/1.1 requests. It runs as a spawned async task during plugin setup (debug builds only).
JS shim (guest-js/index.js) — Embedded at compile time via include_str!("../guest-js/index.js"). It's injected into the webview on every navigation using Tauri's on_navigation hook with a 200ms delay to let the new page settle. The shim exposes window.__DEVTOOLS__ with all DOM interaction methods.
Eval bridge — The communication channel between Rust and JS:
- Rust generates a unique 16-hex-char request ID
- Rust calls webview.eval() with wrapped JS that executes the expression
- The JS result is sent back via window.__TAURI_INTERNALS__.invoke('plugin:agent-control|agent_control_respond', {requestId, data})
- Rust receives the result through a tokio::sync::oneshot channel
- A configurable timeout (default 5s) cleans up pending requests

┌──────────┐   HTTP    ┌────────────┐  eval()   ┌───────────┐
│  Client  │ ───────── │ Rust TCP   │ ────────── │  Webview  │
│ (curl)   │           │ Server     │            │  (JS)     │
└──────────┘           └────────────┘  invoke()  └───────────┘
                             ▲ ──────────────────────┘
                             │  agent_control_respond(id, data)
                             │  via oneshot channel

License

MIT

tauri-plugin-agent-control 0.1.0