eoka-agent 0.1.2

AI agent interaction layer for browser automation — MCP server, observe/act loop

eoka-agent

AI agent interaction layer for browser automation. Use directly in Rust or via MCP server.

Part of the eoka-tools workspace.

Quick Start

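use eoka_agent::Session;

// Assumes a Tokio runtime; the error type is boxed here for brevity.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut session = Session::launch().await?;
    session.goto("https://example.com").await?;

    // Observe → act by index → repeat
    session.observe().await?;
    println!("{}", session.element_list());
    session.click(0).await?;

    session.close().await?;
    Ok(())
}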

Features

  • observe() — enumerate all interactive elements with Shadow DOM support
  • element_list() — compact text format for LLM consumption: [0] <button> "Submit"
  • screenshot() — annotated PNG with numbered red boxes on each element
  • Index-based actions: click(i), fill(i, text), select(i, value), hover(i)
  • Auto-wait — actions wait for network idle and DOM stability
  • Stale detection — detects moved/removed elements with helpful error messages
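
To make the stale-detection idea concrete, here is a minimal sketch of one way such a check could work: compare what was recorded at observe() time with what is on the page now. The `Snapshot` and `Staleness` types and the `check` function are hypothetical names invented for this illustration; they are not eoka-agent's API, and the crate's actual logic may differ.

```rust
/// What was recorded about an element at observe() time (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    tag: String,
    text: String,
    x: i32,
    y: i32,
}

/// Outcome of comparing the recorded element with the live page (hypothetical type).
#[derive(Debug, PartialEq)]
enum Staleness {
    Fresh,
    Moved { dx: i32, dy: i32 },
    Removed,
}

/// `current` is the element found at the same index right now, if any.
fn check(observed: &Snapshot, current: Option<&Snapshot>) -> Staleness {
    match current {
        None => Staleness::Removed,
        // A different tag or text means the original element was replaced.
        Some(now) if now.tag != observed.tag || now.text != observed.text => Staleness::Removed,
        // Same element, different position: report how far it moved.
        Some(now) if now.x != observed.x || now.y != observed.y => Staleness::Moved {
            dx: now.x - observed.x,
            dy: now.y - observed.y,
        },
        Some(_) => Staleness::Fresh,
    }
}

fn main() {
    let seen = Snapshot { tag: "button".into(), text: "Submit".into(), x: 10, y: 200 };
    let shifted = Snapshot { y: 260, ..seen.clone() };
    println!("{:?}", check(&seen, Some(&shifted)));
    println!("{:?}", check(&seen, None));
}
```

A caller could turn `Moved` or `Removed` into an error message that suggests calling observe() again before retrying the action.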

Element List Format

[0] <input type="text"> placeholder="Customer name"
[1] <input type="tel"> placeholder="Telephone"
[2] <button> "Submit"
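
If the listing needs to be consumed by code rather than by an LLM, it can be split back into index, tag, and trailing label. The sketch below assumes each line has the shape `[index] <tag ...> rest` as in the sample above; `ListedElement` and `parse_line` are hypothetical helpers, not part of eoka-agent, and the parser does not handle a `>` inside an attribute value.

```rust
/// One entry of the compact element list, e.g. `[2] <button> "Submit"`.
#[derive(Debug, PartialEq)]
struct ListedElement {
    index: usize,
    tag: String,  // text inside the angle brackets, e.g. `input type="text"`
    rest: String, // whatever follows the tag: a quoted label or attributes
}

/// Parse a single line; returns None if the line does not match the shape.
fn parse_line(line: &str) -> Option<ListedElement> {
    let after_open = line.trim().strip_prefix('[')?;
    let (index_str, after_index) = after_open.split_once(']')?;
    let index = index_str.parse::<usize>().ok()?;
    let after_lt = after_index.trim_start().strip_prefix('<')?;
    let (tag, rest) = after_lt.split_once('>')?;
    Some(ListedElement {
        index,
        tag: tag.to_string(),
        rest: rest.trim().to_string(),
    })
}

fn main() {
    let listing = r#"[0] <input type="text"> placeholder="Customer name"
[1] <input type="tel"> placeholder="Telephone"
[2] <button> "Submit""#;

    // Lines that do not parse are skipped.
    for el in listing.lines().filter_map(parse_line) {
        println!("{} | {} | {}", el.index, el.tag, el.rest);
    }
}
```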

Live Targeting

Actions support both index-based and live targeting:

click(target: "2")              // By index from observe()
click(target: "text:Submit")    // By visible text
click(target: "css:button.primary")  // By CSS selector
click(target: "id:submit-btn")  // By element ID
click(target: "placeholder:Email")   // By placeholder text
click(target: "role:button")    // By tag or ARIA role

Live targets resolve at action time via JS injection, avoiding stale element issues in dynamic pages.
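
The target strings above follow a simple `prefix:value` grammar, with a bare number meaning an index. As an illustration of that grammar only, the sketch below classifies a target string; the `Target` enum and `parse_target` function are hypothetical and say nothing about how eoka-agent parses targets internally.

```rust
/// The target kinds listed above (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Target {
    Index(usize),
    Text(String),
    Css(String),
    Id(String),
    Placeholder(String),
    Role(String),
}

/// Classify a target string; returns None for an unknown prefix.
fn parse_target(raw: &str) -> Option<Target> {
    // A bare number is an index from the last observe().
    if let Ok(i) = raw.trim().parse::<usize>() {
        return Some(Target::Index(i));
    }
    // Split at the first colon only, so CSS selectors may contain colons.
    let (prefix, value) = raw.split_once(':')?;
    let value = value.to_string();
    match prefix {
        "text" => Some(Target::Text(value)),
        "css" => Some(Target::Css(value)),
        "id" => Some(Target::Id(value)),
        "placeholder" => Some(Target::Placeholder(value)),
        "role" => Some(Target::Role(value)),
        _ => None,
    }
}

fn main() {
    for raw in ["2", "text:Submit", "css:button.primary", "id:submit-btn"] {
        println!("{:?}", parse_target(raw));
    }
}
```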

MCP Server

The crate includes an MCP server binary for use with Claude Desktop, Claude Code, etc.

Setup

# Install
cargo install eoka-agent

# Add to Claude Code
claude mcp add eoka-agent -- eoka-agent

Tools

Tab Management:

Tool          Description
list_tabs     List all open tabs with IDs, titles, URLs
new_tab       Open new tab (optionally with URL)
switch_tab    Switch to tab by ID
close_tab     Close tab by ID

Navigation:

Tool              Description
navigate          Go to URL (launches browser on first call)
back / forward    Browser history navigation
spa_info          Detect SPA router (React, Next.js, Vue, etc.)
spa_navigate      Navigate SPA without page reload

Observation:

Tool          Description
observe       List interactive elements (filter by type, limit count)
screenshot    Annotated screenshot with numbered elements
find_text     Search elements by text content
page_text     Get visible text content
page_info     Get current URL and title

Actions (support live targeting: text:Submit, css:button, id:btn):

Tool        Description
click       Click element by index or live target
fill        Type into input field
select      Select dropdown option
hover       Hover over element
scroll      Scroll page or element into view
type_key    Press keyboard key (Enter, Tab, etc.)
batch       Execute multiple actions in one call
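
An MCP client invokes these tools with a JSON-RPC 2.0 `tools/call` request whose `params` carry the tool `name` and an `arguments` object. The sketch below builds such a request for `click` using only the standard library. The `target` argument name is taken from the Live Targeting examples; `tool_call` and `json_escape` are hypothetical helpers for illustration, and argument names for the other tools are not shown because this document does not list them.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a `tools/call` request with string-valued arguments.
fn tool_call(id: u64, tool: &str, args: &[(&str, &str)]) -> String {
    let arguments: Vec<String> = args
        .iter()
        .map(|(k, v)| format!("\"{}\":\"{}\"", json_escape(k), json_escape(v)))
        .collect();
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"tools/call\",\"params\":{{\"name\":\"{}\",\"arguments\":{{{}}}}}}}",
        id,
        json_escape(tool),
        arguments.join(",")
    )
}

fn main() {
    // Click the element whose visible text is "Submit".
    println!("{}", tool_call(1, "click", &[("target", "text:Submit")]));
}
```

In practice a JSON library would replace the hand-written escaping; it is kept here so the example has no dependencies.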

Other:

Tool          Description
extract       Run JavaScript and return result
cookies       Get all cookies
set_cookie    Set a cookie
close         Close browser

Examples

cargo run -p eoka-agent --example demo