# tauri-connector
[](https://crates.io/crates/tauri-plugin-connector)
[](LICENSE)
A Tauri v2 plugin with **embedded MCP server** + Rust CLI for deep inspection and interaction with Tauri desktop applications. Drop-in replacement for `tauri-plugin-mcp-bridge` that **fixes the `__TAURI__ not available` bug** on macOS.
## The Problem
`tauri-plugin-mcp-bridge` injects JavaScript into the webview that relies on `window.__TAURI__` to send execution results back to Rust. On macOS with WKWebView, the injected scripts run in an isolated content world where `window.__TAURI__` doesn't exist -- causing all JS-based tools (execute_js, dom_snapshot, console logs) to time out.
## The Fix
tauri-connector uses a **dual-path JS execution** strategy:
1. **WS Bridge (primary)** -- A small JS client injected into the webview connects back to the plugin via `ws://127.0.0.1:{port}`. Scripts and results flow through this dedicated WebSocket channel.
2. **Eval+Event fallback** -- If the WS bridge times out (2s), the plugin falls back to injecting JS via Tauri's `window.eval()` and receiving results through Tauri's event system. This path requires `withGlobalTauri: true`.
The fallback is transparent -- callers get the same result regardless of which path succeeds. The **MCP server runs inside the plugin** -- when your Tauri app starts, it starts automatically.
```
Frontend JS (app context)
'-- WebSocket ws://127.0.0.1:9300 --------> Bridge (JS execution, path 1)
Plugin (Rust)
|-- bridge.execute_js()
|-- xcap native capture (cross-platform) --> PNG/JPEG/WebP with resize
'-- snapdom fallback ---------------------> DOM-to-image via @zumer/snapdom
Claude Code -------- SSE http://host:9556/sse -----> Embedded MCP server
|-- handlers (direct, in-process)
|-- bridge.execute_js() -> JS result
'-- state.get_dom() -> cached DOM
CLI (Rust) -------- WebSocket ws://host:9555 -----> Plugin WS server
```
## Components
| `plugin/` | Rust Tauri v2 plugin with **embedded MCP server** (`tauri-plugin-connector` on crates.io) |
| `crates/cli/` | Rust CLI binary with ref-based element addressing |
| `crates/mcp-server/` | Standalone Rust MCP server (alternative to embedded, connects via WebSocket) |
| `crates/client/` | Shared Rust WebSocket client library |
## Features
### 25 Tools (MCP + CLI)
Every tool is available via both the embedded MCP server (for Claude Code) and the Rust CLI (for terminal use). The CLI uses ref-based element addressing inspired by [vercel-labs/agent-browser](https://github.com/vercel-labs/agent-browser).
| JavaScript | `webview_execute_js` | `eval <script>` |
| DOM | `webview_dom_snapshot` | `snapshot [-i] [-c] [--mode ai\|accessibility\|structure]` |
| DOM (cached) | `get_cached_dom` | `dom` |
| Elements | `webview_find_element` | `find <selector> [-s css\|xpath\|text]` |
| Styles | `webview_get_styles` | `get styles <@ref\|selector>` |
| Picker | `webview_get_pointed_element` | `pointed` |
| Select | `webview_select_element` | *(visual picker, not yet implemented)* |
| Interact | `webview_interact` | `click`, `dblclick`, `hover`, `focus`, `fill`, `type`, `check`, `uncheck`, `select`, `scroll`, `scrollintoview` |
| Keyboard | `webview_keyboard` | `press <key>` |
| Wait | `webview_wait_for` | `wait <selector> [--text] [--timeout]` |
| Screenshot | `webview_screenshot` | `screenshot <path> [-f png\|jpeg\|webp] [-m maxWidth]` |
| Windows | `manage_window` | `windows`, `resize <w> <h>` |
| State | `ipc_get_backend_state` | `state` |
| IPC | `ipc_execute_command` | `ipc exec <cmd> [-a '{...}']` |
| Monitor | `ipc_monitor` | `ipc monitor` / `ipc unmonitor` |
| Captured | `ipc_get_captured` | `ipc captured [-f filter]` |
| Events | `ipc_emit_event` | `emit <event> [-p '{...}']` |
| Logs | `read_logs` | `logs [-n 20] [-f filter]` |
| Logs | `clear_logs` | `clear logs\|ipc\|events\|all` |
| Logs | `read_log_file` | *(via MCP only)* |
| Events | `ipc_listen` | `events listen\|captured\|stop` |
| Events | `event_get_captured` | `events captured [-p regex]` |
| DOM | `webview_search_snapshot` | *(via MCP only)* |
| Setup | `get_setup_instructions` | `examples` |
| Devices | `list_devices` | *(info only)* |
### CLI Ref-Based Addressing
Take a DOM snapshot with stable ref IDs, then interact with elements using those refs:
```bash
# Take snapshot -- assigns ref IDs, enriches with React component names, stitches portals
$ tauri-connector snapshot -i
- complementary [component=MainMenu]
- menu [ref=e6, component=InheritableContextProvider]
- menuitem [ref=e7, component=LegacyMenuItem]
- img "calendar" [ref=e21]
- main [component=MainMenu]
- heading "Task Centre" [level=1, ref=e103]
- textbox "Search..." [ref=e16]
- combobox "Status" [ref=e50, component=InternalSelect, expanded=true]:
- listbox "Status options" [portal]: # stitched from document.body
- option "Active" [selected, ref=e51]
- option "Inactive" [ref=e52]
# Interact using refs (persist across CLI invocations)
$ tauri-connector click @e51 # Click "Active" option in portal
$ tauri-connector fill @e16 "aspirin" # Fill search box
$ tauri-connector hover @e7 # Hover menu item
$ tauri-connector get text @e103 # Get "Task Centre"
$ tauri-connector press Enter # Press key
$ tauri-connector logs -n 5 # Last 5 console logs
```
### Unified Snapshot Engine (v0.5)
The DOM snapshot engine uses a single `window.__CONNECTOR_SNAPSHOT__()` function with three modes:
| `ai` (default) | Role, name, ARIA states, `ref=eN` IDs, `component=Name`, portal stitching | Claude interaction |
| `accessibility` | Role, accessible name, ARIA states | Semantic understanding |
| `structure` | Tag, id, classes, data-testid | Layout debugging |
Key capabilities:
- **TreeWalker API** for ~78x faster traversal (no stack overflow on deep React trees)
- **Portal stitching** -- Ant Design modals/drawers/dropdowns are logically re-parented under their trigger via `aria-controls`/`aria-owns`
- **React fiber enrichment** -- Component names (`component=AppointmentModal`) via `__reactFiber$` on DOM nodes
- **Visibility pruning** -- `aria-hidden`, `display:none`, `visibility:hidden`, `role=presentation/none` automatically excluded
- **Virtual scroll detection** -- `rc-virtual-list-holder` containers annotated with visible item count
- **Token budgeting** -- `maxDepth` and `maxElements` for graceful truncation on large DOMs
The plugin also auto-pushes DOM snapshots via Tauri IPC. The `get_cached_dom` tool returns this pre-cached snapshot instantly.
## Quick Start
> **Using Claude Code?** Install the skill for automated setup -- see [Claude Code Skill](#claude-code-skill-recommended) below.
### 1. Add the plugin
```toml
# src-tauri/Cargo.toml
[dependencies]
tauri-plugin-connector = "0.6"
```
### 2. Register it (debug-only)
```rust
// src-tauri/src/lib.rs -- place BEFORE .invoke_handler()
#[cfg(debug_assertions)]
{
builder = builder.plugin(tauri_plugin_connector::init());
}
```
### 3. Add permission
```json
// src-tauri/capabilities/default.json -- add to permissions array
"connector:default"
```
### 4. Set `withGlobalTauri` (required)
```json
// src-tauri/tauri.conf.json
{ "app": { "withGlobalTauri": true } }
```
### 5. Install snapdom (screenshot fallback)
```bash
# In your frontend project
npm install @zumer/snapdom # or: bun add @zumer/snapdom
```
If your project uses Vite/webpack, no extra setup needed. Otherwise expose on window:
```typescript
import { snapdom } from '@zumer/snapdom';
window.snapdom = snapdom;
```
### 6. Configure Claude Code
```json
// .mcp.json -- the MCP server starts automatically with the app
{
"mcpServers": {
"tauri-connector": {
"url": "http://127.0.0.1:9556/sse"
}
}
}
```
### 7. Run
```bash
bun run tauri dev
```
Look for:
```
[connector][mcp] MCP ready for 'MyApp' -- url: http://0.0.0.0:9556/sse
[connector] Plugin ready for 'MyApp' (com.example.app) -- WS on 0.0.0.0:9555
```
The MCP server is now live. Claude Code connects automatically via the URL in `.mcp.json`.
## WebSocket API via Bun
Connect directly to the plugin WebSocket on port 9555 using `bun -e`. No build step or extra dependencies -- bun has native WebSocket support.
### Execute JavaScript
```bash
bun -e "
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({
id: '1', type: 'execute_js',
script: '(() => ({ title: document.title, url: location.href }))()',
window_id: 'main'
}));
ws.onmessage = (e) => { console.log(JSON.parse(e.data)); ws.close(); };
setTimeout(() => process.exit(1), 15000);
"
```
### Take Screenshot
```bash
bun -e "
const fs = require('fs');
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({
id: '1', type: 'screenshot',
format: 'png', quality: 80, max_width: 1280, window_id: 'main'
}));
ws.onmessage = (e) => {
const r = JSON.parse(e.data);
if (r.result?.base64) {
fs.writeFileSync('/tmp/screenshot.png', Buffer.from(r.result.base64, 'base64'));
console.log('Saved /tmp/screenshot.png', r.result.width + 'x' + r.result.height);
} else { console.log(r); }
ws.close();
};
setTimeout(() => process.exit(1), 60000);
"
```
### DOM Snapshot / Click / Type
```bash
# AI snapshot (default mode -- includes refs, component names, portal stitching)
bun -e "
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({
id: '1', type: 'dom_snapshot', mode: 'ai', window_id: 'main'
}));
ws.onmessage = (e) => { console.log(JSON.parse(e.data).result); ws.close(); };
setTimeout(() => process.exit(1), 15000);
"
# Click an element
bun -e "
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({
id: '1', type: 'interact', action: 'click', selector: 'button.submit', window_id: 'main'
}));
ws.onmessage = (e) => { console.log(JSON.parse(e.data)); ws.close(); };
setTimeout(() => process.exit(1), 15000);
"
# Type text into focused element
bun -e "
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({
id: '1', type: 'keyboard', action: 'type', text: 'hello', window_id: 'main'
}));
ws.onmessage = (e) => { console.log(JSON.parse(e.data)); ws.close(); };
setTimeout(() => process.exit(1), 15000);
"
```
### App State / Logs / Windows
```bash
# App metadata (no bridge needed)
bun -e "
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({ id: '1', type: 'backend_state' }));
ws.onmessage = (e) => { console.log(JSON.parse(e.data)); ws.close(); };
setTimeout(() => process.exit(1), 5000);
"
# Console logs
bun -e "
const ws = new WebSocket('ws://127.0.0.1:9555');
ws.onopen = () => ws.send(JSON.stringify({
id: '1', type: 'console_logs', lines: 20, window_id: 'main'
}));
ws.onmessage = (e) => { console.log(JSON.parse(e.data)); ws.close(); };
setTimeout(() => process.exit(1), 5000);
"
```
### WS Command Reference
All commands use `{ id, type, ...params }` with snake_case types:
| `ping` | -- |
| `execute_js` | `script`, `window_id` |
| `screenshot` | `format`, `quality`, `max_width`, `window_id` |
| `dom_snapshot` | `mode` (ai/accessibility/structure), `selector`, `max_depth`, `max_elements`, `react_enrich`, `follow_portals`, `shadow_dom`, `window_id` |
| `find_element` | `selector`, `strategy`, `window_id` |
| `get_styles` | `selector`, `properties`, `window_id` |
| `interact` | `action`, `selector`, `strategy`, `x`, `y`, `window_id` |
| `keyboard` | `action`, `text`, `key`, `modifiers`, `window_id` |
| `wait_for` | `selector`, `strategy`, `text`, `timeout`, `window_id` |
| `window_list` / `window_info` / `window_resize` | `window_id`, `width`, `height` |
| `backend_state` | -- |
| `ipc_execute_command` | `command`, `args` |
| `ipc_monitor` | `action` |
| `ipc_get_captured` | `filter`, `limit`, `pattern`, `since` |
| `ipc_emit_event` | `event_name`, `payload` |
| `console_logs` | `lines`, `filter`, `level`, `pattern`, `since`, `window_id` |
| `clear_logs` | `source` |
| `read_log_file` | `source`, `lines`, `level`, `pattern`, `since`, `window_id` |
| `ipc_listen` | `action`, `events` |
| `event_get_captured` | `event`, `pattern`, `limit`, `since` |
| `search_snapshot` | `pattern`, `context`, `mode`, `window_id` |
## Rust CLI (Alternative)
A Rust CLI with ref-based element addressing is also available:
```bash
# Homebrew (macOS/Linux)
brew install dickwu/tap/tauri-connector
# Or build from source
cargo build -p connector-cli --release
# Binary at target/release/tauri-connector
```
```bash
tauri-connector snapshot -i # AI snapshot with refs + component names
tauri-connector snapshot -i --mode accessibility # Accessibility tree only
tauri-connector snapshot -i --no-react # Skip React enrichment
tauri-connector snapshot -i --no-portals # Skip portal stitching
tauri-connector snapshot -i --max-elements 2000 # Limit output size
tauri-connector click @e5 # Click by ref
tauri-connector fill @e3 "query" # Fill input
tauri-connector get text @e7 # Get text
tauri-connector press Enter # Press key
tauri-connector screenshot /tmp/s.png -m 1280 # Screenshot
tauri-connector find "Submit" -s text # Find elements
tauri-connector dom # Cached DOM from frontend
tauri-connector logs -n 10 # Console logs
tauri-connector state # App metadata
tauri-connector resize 1024 768 # Resize window
tauri-connector ipc exec greet -a '{"name":"world"}' # IPC command
tauri-connector ipc monitor # Start IPC monitoring
tauri-connector ipc captured -f greet # Get captured IPC
tauri-connector emit my-event -p '{"foo":42}' # Emit event
tauri-connector pointed # Alt+Shift+Click element info
tauri-connector logs -n 10 -l error # Error logs only
tauri-connector logs -p "user_\\d+" # Regex filter
tauri-connector events listen user:login # Listen for events
tauri-connector events captured # Get captured events
tauri-connector events stop # Stop listening
tauri-connector clear all # Clear all log files
```
Environment: `TAURI_CONNECTOR_HOST` (default `127.0.0.1`), `TAURI_CONNECTOR_PORT` (default `9555`).
## Claude Code Skill (Recommended)
Install the included skill to let Claude Code automatically set up and use tauri-connector.
### Install
```bash
mkdir -p ~/.claude/skills/tauri-connector
cp skill/SKILL.md ~/.claude/skills/tauri-connector/SKILL.md
cp skill/SETUP.md ~/.claude/skills/tauri-connector/SETUP.md
cp -r skill/scripts ~/.claude/skills/tauri-connector/scripts
```
### What It Does
Once installed, Claude will automatically:
- **Set up the plugin** in any Tauri project when asked
- **Use the CLI** for DOM snapshots and element interactions
- **Debug issues** using console logs, app state, and JS execution
- **Automate testing** with snapshot -> click/fill/verify workflows
> **For contributors:** The release workflow skill is at `.claude/skills/tauri-connector-release/SKILL.md` — it triggers automatically when you say "release" or "bump version" inside this repo.
## MCP Server
### Embedded (Default)
The MCP server starts automatically inside the Tauri plugin when the app runs. Configure Claude Code with:
```json
{
"mcpServers": {
"tauri-connector": {
"url": "http://127.0.0.1:9556/sse"
}
}
}
```
No separate process, no Node.js, no install step. Just run your Tauri app.
### Standalone (Alternative)
A standalone Rust MCP binary is also available for cases where you can't modify the Tauri app:
```bash
cargo build -p connector-mcp-server --release
# Binary at target/release/tauri-connector-mcp
```
```json
{
"mcpServers": {
"tauri-connector": {
"command": "tauri-connector-mcp",
"env": {
"TAURI_CONNECTOR_HOST": "127.0.0.1",
"TAURI_CONNECTOR_PORT": "9555"
}
}
}
}
```
## Plugin Configuration
```rust
use tauri_plugin_connector::ConnectorBuilder;
#[cfg(debug_assertions)]
{
builder = builder.plugin(
ConnectorBuilder::new()
.bind_address("127.0.0.1") // localhost only (default: 0.0.0.0)
.port_range(8000, 8100) // WS port range (default: 9555-9655)
.mcp_port_range(8100, 8200) // MCP port range (default: 9556-9656)
.build()
);
}
// Or disable the embedded MCP server:
ConnectorBuilder::new()
.disable_mcp()
.build()
```
## Frontend Integration (Optional)
Push DOM snapshots from your frontend for instant LLM access:
```typescript
import { invoke } from '@tauri-apps/api/core';
// The bridge auto-pushes DOM snapshots on page load and significant mutations.
// For manual push (e.g. after a custom state change):
const result = window.__CONNECTOR_SNAPSHOT__({ mode: 'ai', maxElements: 5000 });
windowId: 'main',
html: document.body.innerHTML.substring(0, 500000),
textContent: document.body.innerText.substring(0, 200000),
snapshot: result.snapshot,
snapshotMode: 'ai',
refs: JSON.stringify(result.refs),
meta: JSON.stringify(result.meta),
}
});
```
The bridge JS auto-pushes DOM on page load and significant mutations (5s debounce) when `window.__TAURI_INTERNALS__` is available.
### Alt+Shift+Click Element Picker
Alt+Shift+Click any element in the app to capture its metadata. Retrieve via `webview_get_pointed_element` MCP tool.
## Project Structure
```
tauri-connector/
|-- Cargo.toml # Workspace root
|-- plugin/ # Rust Tauri v2 plugin (crates.io)
| |-- lib.rs # Plugin entry + Tauri IPC commands
| |-- bridge.rs # Internal WebSocket bridge (the fix)
| |-- server.rs # External WebSocket server (for CLI)
| |-- mcp.rs # Embedded MCP SSE server
| |-- mcp_tools.rs # MCP tool definitions + dispatch
| |-- handlers.rs # All command handlers
| |-- protocol.rs # Message types
| '-- state.rs # Shared state (DOM cache, logs, IPC)
|-- crates/
| |-- client/ # Shared Rust WebSocket client
| | '-- src/lib.rs
| |-- mcp-server/ # Standalone MCP server (alternative)
| | '-- src/
| | |-- main.rs # Stdio JSON-RPC loop
| | |-- protocol.rs # JSON-RPC types
| | '-- tools.rs # Tool definitions + dispatch
| '-- cli/ # Rust CLI binary
| '-- src/
| |-- main.rs # Clap CLI entry point
| |-- commands.rs # Command implementations
| '-- snapshot.rs # Ref system + DOM snapshot builder
|-- skill/ # Claude Code skill + bun scripts
| |-- SKILL.md # Usage guide (loaded by Claude Code)
| |-- SETUP.md # Setup instructions
| '-- scripts/ # Bun scripts for WS interaction
| |-- connector.ts # Shared helper (auto-discovers ports via PID file)
| |-- state.ts, eval.ts, screenshot.ts, snapshot.ts
| |-- click.ts, fill.ts, find.ts, wait.ts
| '-- logs.ts, windows.ts
|-- LICENSE
'-- README.md
```
## How It Works
### JS Execution (Dual Path)
The bridge uses two execution paths for maximum reliability:
1. **WS Bridge (primary, 2s timeout)**: Internal WebSocket on `127.0.0.1:9300-9400`. Bridge JS injected into the webview connects back, executes scripts via `AsyncFunction`, and returns results through the WebSocket. Uses `tokio::select!` for multiplexed read/write on a single stream.
2. **Eval+Event fallback**: If the WS bridge times out, the plugin injects JS via Tauri's `window.eval()` and receives results through Tauri's event system (`plugin:event|emit`). Requires `withGlobalTauri: true`. Handles double-serialized event payloads automatically.
The fallback is transparent -- `bridge.execute_js()` returns the same result regardless of which path succeeded.
### Screenshot
The `webview_screenshot` tool uses a tiered approach:
1. **xcap native capture** (cross-platform): Uses the [xcap](https://github.com/nashaofu/xcap) crate for pixel-accurate window capture on Windows, macOS, and Linux. Matches the window by title, captures via native OS APIs, then resizes (`maxWidth`) and encodes to PNG/JPEG/WebP via the `image` crate. Runs on a blocking thread to avoid stalling the Tokio runtime.
2. **snapdom fallback**: When xcap is unavailable (e.g. Wayland without permissions, CI environments), falls back to [snapdom](https://github.com/zumerlab/snapdom) (`@zumer/snapdom`) — a fast DOM-to-image library that captures exactly what the web engine renders. Loaded via dynamic `import()` or `window.snapdom` global. No CDN dependency, works fully offline.
### PID File Auto-Discovery
When the plugin starts, it writes `target/.connector.json` with all port info:
```json
{ "pid": 12345, "ws_port": 9555, "mcp_port": 9556, "bridge_port": 9300, "app_name": "MyApp", "app_id": "com.example.app" }
```
The bun scripts in `skill/scripts/` auto-discover this file, verify the PID is alive, and connect without any env vars. If the Tauri app is already running in another terminal, the scripts connect directly -- no need to start a new instance.
### Embedded MCP Server
1. Plugin starts an SSE HTTP server on port 9556 (configurable)
2. Claude Code connects via `GET /sse` and receives an SSE event stream
3. Tool calls arrive via `POST /message` with JSON-RPC bodies
4. Handlers call the bridge and plugin state directly -- zero WebSocket overhead
### Console Log Capture
The bridge intercepts `console.log/warn/error/info/debug`, storing entries in file-backed JSONL storage at `{app_data_dir}/.tauri-connector/console.log`. Logs persist across app restarts. Accessible via `read_logs`, `read_log_file`, or auto-pushed to Rust via `invoke()`.
### Ref System
The unified snapshot engine assigns sequential ref IDs (`e0`, `e1`, ...) to interactive elements (buttons, links, inputs, checkboxes, etc.) and elements with `onclick`, `tabindex`, or `cursor:pointer`. Three ref formats are accepted: `@e1`, `ref=e1`, or `e1`. Refs are persisted to disk and used across subsequent CLI invocations until the next `snapshot` refreshes them. The ref resolution uses a three-strategy fallback: CSS selector, then role+name text matching, then `[role="..."]` attribute matching.
## Requirements
- Tauri v2.x
- Rust 2024 edition
- [Bun](https://bun.sh/) (for skill scripts / WebSocket API examples)
- `@zumer/snapdom` in frontend (optional, for screenshot fallback when xcap is unavailable)
## License
MIT