Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
uzor-agent-api
Local HTTP control plane for any uzor app. Sits next to the
window manager (uzor-desktop, future uzor-web,
uzor-window-mobile) as a peer L4 helper, not part of it.
External agents (LLMs, QA tooling, scripts, IDE plugins) read live LM state and drive every UI operation LM understands — without screenshots when possible, with screenshots as the escape hatch when not.
The trait + types live in uzor::layout::agent::* (core).
This crate is the HTTP transport. Mobile / web window managers
implement the same AgentControl trait and reuse the routes.
Enable
new
.agent_api // local HTTP server on 127.0.0.1:17480
.window
.run?;
The server runs on its own tokio runtime in a dedicated thread. Bind failure (port taken) is non-fatal — logged to stderr, app keeps running.
Routes
| Method | Path | Body / query | Returns |
|---|---|---|---|
| GET | /health |
— | {"ok":true} |
| GET | /state/tree |
— | AgentSnapshot |
| GET | /state/widgets |
— | Vec<WidgetSnapshot> |
| GET | /log?since=N&limit=M&prefix=X |
— | Vec<AgentLogEntry> |
| GET | /log/tail?n=N |
— | Vec<AgentLogEntry> |
| GET | /screenshot/:window |
— | image/png |
| POST | /cmd |
Command (tagged JSON) |
CommandReply |
| POST | /input/click |
{window, x, y, button?} |
CommandReply |
| POST | /input/hover |
{window, x, y} |
CommandReply |
| POST | /input/scroll |
{window, dx, dy} |
CommandReply |
| POST | /lm/click_widget |
{window, widget_id} |
CommandReply |
| POST | /lm/hover_widget |
{window, widget_id} |
CommandReply |
| POST | /lm/modal/open |
{window, modal_id} |
CommandReply |
| POST | /lm/modal/close |
{window, modal_id} |
CommandReply |
| POST | /lm/popup/open |
{window, popup_id} |
CommandReply |
| POST | /lm/popup/close |
{window, popup_id} |
CommandReply |
| POST | /lm/dropdown/open |
{window, dropdown_id} |
CommandReply |
| POST | /lm/dropdown/close |
{window, dropdown_id} |
CommandReply |
| POST | /lm/sidebar/toggle |
{window, sidebar_id} |
CommandReply |
| POST | /window/spawn |
{key, title, width, height, ...} |
CommandReply |
| POST | /window/close |
{key} |
CommandReply |
| POST | /lm/sync_mode |
{node_id, mode, group_id?} |
CommandReply |
| POST | /lm/style_preset |
{name} |
CommandReply |
| POST | /lm/panel/resize_edge |
{window, panel_id, edge, delta_px} |
CommandReply |
| POST | /lm/panel/drag_separator |
{window, sep_idx, delta_px} |
CommandReply |
| POST | /lm/panel/set_rect |
{window, panel_id, x, y, width, height} |
CommandReply |
| GET | /blackboxes |
— | Vec<String> |
| GET | /blackbox/:slot/widgets |
— | Vec<AgentWidget> |
| GET | /blackbox/:slot/state |
— | free-form JSON |
| POST | /blackbox/:slot/action |
{name, args?} (AgentAction) |
AgentActionReply |
| POST | /blackbox/:slot/click_widget |
{sub_id, window?} |
CommandReply |
All POSTs apply on the WM event-loop thread (winit) within one tick of the request — replies are blocking.
Two operating modes
A. Semantic / direct LM ops (default)
Address things by stable id. No coordinates, robust to layout changes.
POST /lm/click_widget {window:"main", widget_id:"settings:theme_dark"}
POST /lm/modal/open {window:"main", modal_id:"settings"}
POST /lm/sidebar/toggle {window:"main", sidebar_id:"left"}
POST /lm/style_preset {name:"mirage_dark"}
POST /window/spawn {key:"side", title:"...", width:800, height:600}
POST /blackbox/chart-btc/action {name:"set_symbol", args:{symbol:"ETH"}}
B. Pixel + screenshot (escape hatch)
For blackbox bodies (charts, drawing) where the widget tree doesn't describe clickable regions.
POST /input/click {window:"main", x:420, y:110}
GET /screenshot/main → image/png
Screenshot first call patches the vello surface with COPY_SRC
- requests redraw and returns
503; the next call returns PNG.
Snapshot shape (GET /state/tree)
Merged event log (GET /log)
Anyone holding &mut LayoutManager writes into one feed —
LM internals (lm.*), the app (app.*), and blackbox handlers
(<slot_id>.*).
seq 2 lm.click settings:theme_light @ (206.5, 127)
seq 3 lm.dispatch Unhandled(theme_light)
seq 4 lm.agent_command ClickWidget ok=true
seq 5 lm.style.preset mirage_light
seq 6 app.theme.changed theme=light, preset=mirage_light
seq 7 tree-debug.toggle_synced_root {show_synced_root:false}
Filter by category prefix:
GET /log?prefix=lm.style.
GET /log?prefix=app.
GET /log?prefix=tree-debug.
GET /log?since=42&limit=200
GET /log/tail?n=50
Categories agreed by convention (no enforcement):
lm.*—lm.click,lm.dispatch,lm.style.preset,lm.sync_mode,lm.window.attach,lm.window.detach,lm.overlay.modal,lm.overlay.popup,lm.overlay.dropdown,lm.overlay.sidebar,lm.agent_command,lm.blackbox.register,lm.blackbox.unregisterapp.*— pushed by app code vialm.agent_log_push(category, payload)orlm.agent_log_note(message)<blackbox-slot-id>.*— pushed by the framework after a successful/blackbox/<slot>/actionreturns alog_payload
Blackboxes
A blackbox is a panel whose body uses custom rendering / coordinate-clicks (chart canvas, drawing tool, DOM ladder). LM sees only its outer rect; everything inside is opaque.
To make it agent-controllable, the blackbox state struct
implements BlackboxAgentSurface:
use ;
Then register it from App::init:
let chart_state = new;
layout.register_blackbox_agent;
From now on:
GET /blackboxes→["chart-btc", ...]GET /blackbox/chart-btc/widgets→ published hot-zonesGET /blackbox/chart-btc/state→{symbol, crosshair, bars}POST /blackbox/chart-btc/action {name:"set_symbol", args:{symbol:"ETH"}}POST /blackbox/chart-btc/click_widget {sub_id:"btn:zoom_in"}GET /log?prefix=chart-btc.→ every action'slog_payload
LM sees nothing chart-specific. Each blackbox is a self-contained sandbox; an agent can drive several simultaneously without them stepping on each other.
AgentActionReply::ok_with_log(payload) makes the framework
push a <slot>.<action_name> entry into /log. Use
AgentActionReply::ok() if you don't want a log entry.
Layering
uzor (core)
└── layout::agent
├── AgentControl trait ← what every WM implements
├── Command / Reply ← write vocabulary
├── Snapshot / Widget ← read shape
├── AgentLog (ring buf) ← merged feed
├── BlackboxAgentSurface ← per-blackbox sandbox contract
└── LmAgent<P> ← default implementation of every op
routable through `LayoutManager`
uzor-agent-api (this crate)
└── axum + tokio HTTP shim over `Arc<dyn AgentControl>`
uzor-desktop::Manager
└── implements AgentControl via:
├── Arc<RwLock<AgentSnapshot>> rebuilt every tick
├── Arc<RwLock<Vec<WidgetSnapshot>>>
├── Arc<RwLock<Vec<AgentLogEntry>>>
├── Arc<RwLock<HashMap<slot, Arc<Mutex<dyn ...>>>>>
├── std::sync::mpsc<(Command, Sender<Reply>)> drained in about_to_wait
└── std::sync::mpsc<ScreenshotRequest> drained in about_to_wait
Threading
- HTTP server lives on its own tokio multi-thread runtime.
dispatch(Command)is sync — server handlers run it onspawn_blockingso a blocked WM tick doesn't pin a tokio worker.- Commands apply on the winit thread inside
about_to_waitbefore the next solve, so the nextGET /state/treealready reflects the change. - Blackbox locks are short — a single
Mutexper blackbox, taken by HTTP handlers for the duration ofwidgets()/state()/apply_agent_action().
Recipe: agent-driven smoke test
# 1. Discover what's on screen
|
|
# 2. Find the theme buttons
|
# 3. Switch theme by id (no pixel coords)
# 4. Replay the cause-and-effect
|
# 5. Drive a blackbox without knowing its internals
Status
First-pass. API is stable enough for internal agents but will grow. Planned:
POST /lm/dock/split,POST /lm/edge/add— direct dock/edge manipulation (currently only via app layer).POST /input/key//input/text— keyboard injection.POST /lm/sync_modewrite semantics — currently flips the classification tag; actual storage relocation between shared/per-window is a future pass.- WebSocket subscription on
/logfor push-mode instead of polling. - Authentication (bearer token) before binding to non-loopback.
Why a peer L4 helper, not a window-manager feature
WMs (uzor-desktop, uzor-window-web, uzor-window-mobile)
host the LM and own the event loop. The agent API is
strictly one more consumer of LM — it implements
AgentControl against the same LM the WM owns. The same
trait will work on iOS/Android (HTTP listening on
127.0.0.1) and in browsers (extension talking to
AgentControl exposed on the JS side).
LM is the boss. WMs and the agent API are tools that read / write LM by contract.