tauri-webdriver-automation-0.1.1 is not a library.

tauri-webdriver

Open-source macOS WebDriver for Tauri apps.

Enables automated end-to-end testing of Tauri desktop applications on macOS, where no native WKWebView WebDriver exists.

The Problem

Tauri apps use WKWebView on macOS. Unlike Linux (WebKitWebDriver) and Windows (Edge WebDriver), Apple does not provide a WebDriver implementation for WKWebView. This means Tauri developers cannot run automated e2e tests on macOS using standard WebDriver tools like WebDriverIO or Selenium.

This is a blocker for any Tauri app with platform-specific code (e.g., deep links, native menus, file associations) that must be tested on every platform.

The Solution

tauri-webdriver provides two crates that together bridge the gap:

tauri-plugin-webdriver-automation -- A Tauri plugin that runs inside your app (debug builds only). It starts a local HTTP server that can interact with your app's webview: find elements, click buttons, read text, manage windows, and execute JavaScript.
tauri-webdriver-automation (CLI binary: tauri-wd) -- A standalone CLI binary that implements the W3C WebDriver protocol. It launches your Tauri app, connects to the plugin's HTTP server, and translates standard WebDriver commands into plugin API calls. WebDriverIO, Selenium, or any W3C-compatible client can connect to it.

WebDriverIO/Selenium                 tauri-wd CLI                Your Tauri App
  (test runner)        ──HTTP──>    (W3C WebDriver)   ──HTTP──>  (plugin server)
                        :4444                                     :{dynamic port}

Who Is This For?

Tauri app developers who need automated e2e tests on macOS
CI/CD pipelines that run tests across macOS, Linux, and Windows
Anyone with platform-specific Tauri code that must be verified on macOS (deep links, native APIs, system integrations)

Quick Start

1. Add the plugin to your Tauri app

cd src-tauri
cargo add tauri-plugin-webdriver-automation

let mut builder = tauri::Builder::default();
#[cfg(debug_assertions)]
{
    builder = builder.plugin(tauri_plugin_webdriver_automation::init());
}

2. Install the CLI

cargo install tauri-webdriver-automation

3. Configure WebDriverIO

// wdio.conf.mjs
export const config = {
    port: 4444,
    capabilities: [{
        'tauri:options': {
            binary: './src-tauri/target/debug/my-app',
        }
    }],
    // ... your test config
};

4. Run tests

# Terminal 1: Start the WebDriver server
tauri-wd --port 4444

# Terminal 2: Run your tests
npx wdio run wdio.conf.mjs

Supported W3C WebDriver Operations

All operations follow the W3C WebDriver specification. See the full technical specification for detailed request/response formats and plugin API documentation.

Sessions

W3C Endpoint	Method	Description
`/status`	GET	Server readiness status
`/session`	POST	Create a new session with `tauri:options` capabilities
`/session/{id}`	DELETE	Delete session and terminate the app
`/session/{id}/timeouts`	GET	Get current timeout configuration
`/session/{id}/timeouts`	POST	Set implicit, page load, and script timeouts

Navigation

W3C Endpoint	Method	Description
`/session/{id}/url`	POST	Navigate to URL
`/session/{id}/url`	GET	Get current page URL
`/session/{id}/title`	GET	Get page title
`/session/{id}/source`	GET	Get full page HTML source
`/session/{id}/back`	POST	Navigate back in history
`/session/{id}/forward`	POST	Navigate forward in history
`/session/{id}/refresh`	POST	Refresh the current page

Windows

W3C Endpoint	Method	Description
`/session/{id}/window`	GET	Get current window handle
`/session/{id}/window`	POST	Switch to window by handle
`/session/{id}/window`	DELETE	Close current window
`/session/{id}/window/handles`	GET	Get all window handles
`/session/{id}/window/rect`	GET	Get window position and size
`/session/{id}/window/rect`	POST	Set window position and size
`/session/{id}/window/maximize`	POST	Maximize window
`/session/{id}/window/minimize`	POST	Minimize window
`/session/{id}/window/fullscreen`	POST	Make window fullscreen

Elements

W3C Endpoint	Method	Description
`/session/{id}/element`	POST	Find element (CSS, XPath, tag name, link text, partial link text)
`/session/{id}/elements`	POST	Find all matching elements
`/session/{id}/element/active`	GET	Get the currently focused element
`/session/{id}/element/{eid}/element`	POST	Find element scoped to a parent element
`/session/{id}/element/{eid}/elements`	POST	Find all elements scoped to a parent
`/session/{id}/element/{eid}/click`	POST	Click an element
`/session/{id}/element/{eid}/clear`	POST	Clear an input element
`/session/{id}/element/{eid}/value`	POST	Send keystrokes to an element
`/session/{id}/element/{eid}/text`	GET	Get element's visible text
`/session/{id}/element/{eid}/name`	GET	Get element's tag name
`/session/{id}/element/{eid}/attribute/{name}`	GET	Get an HTML attribute value
`/session/{id}/element/{eid}/property/{name}`	GET	Get a JavaScript property value
`/session/{id}/element/{eid}/css/{name}`	GET	Get a computed CSS property value
`/session/{id}/element/{eid}/rect`	GET	Get element's bounding rectangle
`/session/{id}/element/{eid}/enabled`	GET	Check if element is enabled
`/session/{id}/element/{eid}/selected`	GET	Check if element is selected
`/session/{id}/element/{eid}/displayed`	GET	Check if element is visible
`/session/{id}/element/{eid}/computedrole`	GET	Get computed ARIA role
`/session/{id}/element/{eid}/computedlabel`	GET	Get computed ARIA label

Shadow DOM

W3C Endpoint	Method	Description
`/session/{id}/element/{eid}/shadow`	GET	Get shadow root of a web component
`/session/{id}/shadow/{sid}/element`	POST	Find element inside a shadow root
`/session/{id}/shadow/{sid}/elements`	POST	Find all elements inside a shadow root

Frames

W3C Endpoint	Method	Description
`/session/{id}/frame`	POST	Switch to frame by index, element reference, or `null` for top
`/session/{id}/frame/parent`	POST	Switch to parent frame

Script Execution

W3C Endpoint	Method	Description
`/session/{id}/execute/sync`	POST	Execute synchronous JavaScript
`/session/{id}/execute/async`	POST	Execute asynchronous JavaScript with callback

Screenshots

W3C Endpoint	Method	Description
`/session/{id}/screenshot`	GET	Full page screenshot (base64 PNG)
`/session/{id}/element/{eid}/screenshot`	GET	Element screenshot (base64 PNG)

Cookies

W3C Endpoint	Method	Description
`/session/{id}/cookie`	GET	Get all cookies
`/session/{id}/cookie/{name}`	GET	Get a cookie by name
`/session/{id}/cookie`	POST	Add a cookie
`/session/{id}/cookie/{name}`	DELETE	Delete a cookie by name
`/session/{id}/cookie`	DELETE	Delete all cookies

Actions

W3C Endpoint	Method	Description
`/session/{id}/actions`	POST	Perform actions: key (keyDown/keyUp), pointer (move/down/up), wheel (scroll)
`/session/{id}/actions`	DELETE	Release all actions

MCP Integration

tauri-webdriver works with mcp-tauri-automation to enable AI-driven automation of Tauri apps via the Model Context Protocol. This lets AI agents (like Claude Code) launch, inspect, interact with, and screenshot your Tauri app.

# 1. Start tauri-wd
tauri-wd --port 4444

# 2. Configure mcp-tauri-automation in your MCP client
# The MCP server connects to tauri-wd on port 4444
# and exposes tools like launch_app, click_element, capture_screenshot, etc.

Note: This project recommends the danielraffel/mcp-tauri-automation fork which includes additional tools (execute_script, get_page_title, get_page_url, multi-strategy selectors, configurable screenshot timeouts, wait_for_navigation). PRs have been submitted upstream to Radek44/mcp-tauri-automation.

Architecture

See the full technical specification (SPEC.md) for:

Plugin HTTP API (all endpoints, request/response formats)
CLI W3C WebDriver endpoint mapping
JavaScript bridge internals (element identity, shadow DOM cache, frame context)
Session creation flow
Element state management
Error handling

License

MIT OR Apache-2.0 (dual-licensed, same as Tauri)

tauri-webdriver-automation 0.1.1