macos-agent
macos-agent is a macOS-oriented CLI for agent desktop automation. It provides parseable primitives for discovery, observation, and input
actions: window/app listing, window activation, click, type, hotkey, AX (Accessibility) actions, input-source switching, screenshot, and
wait helpers.
Quick Start
# readiness check
# list targets
# activate + input
# ax-first interaction
# observation
# stabilization waits
# gate + postcondition for mutating AX actions
# selector-frame screenshot
# diff-aware screenshot publish
# one-shot debug bundle
Command Surface
- preflight
  - macos-agent preflight [--strict] [--include-probes]
  - JSON output includes result.permissions with unified fields: screen_recording, accessibility, automation, ready, hints.
- windows
  - macos-agent windows list [--app <name>] [--window-title-contains <name>] [--on-screen-only]
- apps
  - macos-agent apps list
- window
  - macos-agent window activate (--window-id <id> | --active-window | --app <name> [--window-title-contains <name>] | --bundle-id <bundle_id>) [--wait-ms <ms>] [--reopen-on-fail]
- input
  - macos-agent input click --x <px> --y <px> [--button <left|right|middle>] [--count <n>] [--pre-wait-ms <ms>] [--post-wait-ms <ms>]
  - macos-agent input type --text <text> [--delay-ms <ms>] [--submit]
  - macos-agent input hotkey --mods <cmd,ctrl,alt,shift,fn> --key <key>
- input-source
  - macos-agent input-source current
  - macos-agent input-source switch --id <source_id|abc|us>
- ax
  - macos-agent ax list [--session-id <id> | --app <name> | --bundle-id <bundle_id>]
    [--window-title-contains <text>] [--role <AXRole>] [--title-contains <text>] [--identifier-contains <text>]
    [--value-contains <text>] [--subrole <AXSubrole>] [--focused <bool>] [--enabled <bool>] [--max-depth <n>] [--limit <n>]
  - macos-agent ax click [selector flags...] [target flags...] [--match-strategy <contains|exact|prefix|suffix|regex>]
    [--selector-explain] [--reselect-before-click] [--allow-coordinate-fallback]
    [--fallback-order <ax-press,ax-confirm,frame-center,coordinate>] [--gate-app-active] [--gate-window-present]
    [--gate-ax-present] [--gate-ax-unique] [--wait-timeout-ms <ms>] [--wait-poll-ms <ms>]
    [--gate-timeout-ms <ms>] [--gate-poll-ms <ms>] [--postcondition-focused <bool>]
    [--postcondition-attribute <AXAttr>] [--postcondition-attribute-value <value>]
    [--postcondition-timeout-ms <ms>] [--postcondition-poll-ms <ms>]
  - macos-agent ax type [selector flags...] [target flags...] --text <text>
    [--match-strategy <contains|exact|prefix|suffix|regex>] [--selector-explain] [--clear-first] [--submit] [--paste]
    [--allow-keyboard-fallback] [--gate-app-active] [--gate-window-present] [--gate-ax-present] [--gate-ax-unique]
    [--wait-timeout-ms <ms>] [--wait-poll-ms <ms>] [--gate-timeout-ms <ms>] [--gate-poll-ms <ms>]
    [--postcondition-focused <bool>] [--postcondition-attribute <AXAttr>] [--postcondition-attribute-value <value>]
    [--postcondition-timeout-ms <ms>] [--postcondition-poll-ms <ms>]
  - macos-agent ax attr get [selector flags...] [target flags...] --name <AXAttribute>
  - macos-agent ax attr set [selector flags...] [target flags...] --name <AXAttribute> --value <value> [--value-type <string|number|bool|json|null>]
  - macos-agent ax action perform [selector flags...] [target flags...] --name <AXAction>
  - macos-agent ax session start [--session-id <id>] [--app <name> | --bundle-id <bundle_id>] [--window-title-contains <text>]
  - macos-agent ax session list
  - macos-agent ax session stop --session-id <id>
  - macos-agent ax watch start --session-id <id> [--watch-id <id>] [--events <comma-separated-AX-notifications>] [--max-buffer <n>]
  - macos-agent ax watch poll --watch-id <id> [--limit <n>] [--drain|--no-drain]
  - macos-agent ax watch stop --watch-id <id>
- observe
  - macos-agent observe screenshot (--window-id <id> | --active-window | --app <name> [--window-title-contains <name>])
    [--path <file>] [--image-format <png|jpg|webp>] [--if-changed] [--if-changed-baseline <path>]
    [--if-changed-threshold <bits>] [selector flags...] [--selector-padding <px>]
- debug
  - macos-agent debug bundle [--window-id <id> | --active-window | --app <name> [--window-title-contains <name>]] [--output-dir <path>]
- wait
  - macos-agent wait sleep --ms <ms>
  - macos-agent wait app-active (--app <name> | --bundle-id <bundle_id>) [--timeout-ms <ms>] [--poll-ms <ms>]
  - macos-agent wait window-present (--window-id <id> | --active-window | --app <name> [--window-title-contains <name>]) [--timeout-ms <ms>] [--poll-ms <ms>]
  - macos-agent wait ax-present [selector flags...] [target flags...] [--timeout-ms <ms>] [--poll-ms <ms>]
  - macos-agent wait ax-unique [selector flags...] [target flags...] [--timeout-ms <ms>] [--poll-ms <ms>]
- scenario
  - macos-agent scenario run --file <scenario.json>
- profile
  - macos-agent profile validate --file <profile.json>
  - macos-agent profile init [--name <profile-name>] [--path <output.json>]
- completion
  - macos-agent completion <bash|zsh|fish|powershell|elvish> (prints shell completion script to stdout)
Global Flags
- --format <text|json|tsv>
- --error-format <text|json>
- --dry-run
- --retries <n>
- --retry-delay-ms <ms>
- --timeout-ms <ms>
- --trace
- --trace-dir <path>
Notes:
- --format tsv is only supported by windows list and apps list.
- Canonical flags: use --window-title-contains and input type --submit.
- --dry-run guarantees no OS automation command execution for mutating actions.
- --error-format json emits machine-parseable error payloads on stderr.
- --trace writes per-command trace artifacts to AGENT_HOME/out/macos-agent-trace/.
- --trace-dir overrides the trace artifact output directory.
- When trace mode is enabled, macos-agent verifies trace directory writability before running actions.
Output Contract
- Success:
  - Writes payload to stdout only.
  - stderr remains empty.
- Error:
  - Writes message to stderr only.
  - stdout remains empty.
  - Messages start with error:.
JSON envelope (--format json):
Preflight permission contract (macos-agent --format json preflight):
Mutating action commands (window activate, input click, input type, input hotkey, ax click, ax type) always include
result.policy in JSON output so agent-side retry and timeout policy can be parsed without guessing defaults. These action results also
include result.meta.attempts_used so flaky steps can be detected quickly.
Exit codes:
- 0: success
- 1: runtime failure
- 2: usage error
Platform guard: macos-agent is macOS-only. On non-macOS hosts every subcommand short-circuits with exit code 2 and the
message error: macos-agent is only supported on macOS. The deterministic test mode (AGENTS_MACOS_AGENT_TEST_MODE=1) bypasses
this guard so CI-safe integration tests can run on Linux.
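The stdout/stderr split and the three exit codes above lend themselves to a small wrapper. The following is a minimal sketch, not part of macos-agent: the AGENT_BIN variable and the describe_exit and run_agent helpers are hypothetical names introduced only for illustration.

```shell
# AGENT_BIN is an assumption for this sketch (not a documented variable);
# it only lets the binary be swapped out, e.g. for a stub in tests.
AGENT_BIN="${AGENT_BIN:-macos-agent}"

# Map an exit code to the meaning listed under "Exit codes".
describe_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "runtime failure" ;;
    2) echo "usage error" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Run one command, keeping the payload (stdout) and the error message
# (stderr) in separate files, and return the command's exit code.
# Usage: run_agent <payload_file> <error_file> <macos-agent args...>
run_agent() {
  payload_file=$1
  error_file=$2
  shift 2
  rc=0
  "$AGENT_BIN" "$@" >"$payload_file" 2>"$error_file" || rc=$?
  return "$rc"
}
```

A caller might run `run_agent out.json err.txt --format json windows list`, then pass `$?` to describe_exit and read only the file that matches the outcome.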
Error envelope (--error-format json):
Permission Matrix
| Capability | Required setup | Typical failure symptom | Mitigation |
|---|---|---|---|
| Accessibility | Terminal host allowed in System Settings > Privacy & Security > Accessibility | click/type/hotkey fail | Enable the shell host app (Terminal/iTerm/etc.) and retry |
| Automation (Apple Events) | Terminal host allowed in System Settings > Privacy & Security > Automation | activation / System Events probe fails | Allow the terminal app to control System Events |
| Screen Recording | Terminal host allowed in System Settings > Privacy & Security > Screen Recording | observe screenshot fails | Enable Screen Recording for terminal host |
| osascript binary | Preinstalled on macOS; required for AppleScript backend + preflight probes | preflight reports missing osascript | Reinstall macOS command-line tools if missing (xcode-select --install) |
| cliclick binary | Installed and on PATH | preflight reports missing cliclick | brew install cliclick |
| hs (Hammerspoon CLI) | Required for Hammerspoon AX backend; install Hammerspoon and enable hs.ipc | ax attr/action/session/watch fail with backend-unavailable hint | brew install --cask hammerspoon, then add require('hs.ipc') to ~/.hammerspoon/init.lua |
| im-select binary | Required by input-source current and input-source switch | input-source commands fail with missing dependency im-select | brew install im-select |
See the workspace BINARY_DEPENDENCIES.md for the canonical install matrix (rows for hs, cliclick, im-select, and osascript).
AX Backend Capability Matrix
| Backend preference | ax list/click/type | ax attr/action/session/watch | Notes |
|---|---|---|---|
| auto (default) | Hammerspoon hs first; falls back to AppleScript (JXA) only when the Hammerspoon path is missing | Hammerspoon-only | Best default for resilience; fallback does not apply to extended AX commands and surfaces a "fallback" hint |
| hammerspoon / hs | Hammerspoon-only (no JXA fallback) | Supported | Full AX surface; requires the hs CLI on PATH and require('hs.ipc') enabled in ~/.hammerspoon/init.lua |
| applescript / jxa | AppleScript (JXA via osascript) only | Not supported directly | Extended AX commands still require Hammerspoon; selecting this preference does not avoid the dependency |
Preflight emits an ax_backend_capabilities row so operators can verify backend mode and fallback expectations before failures.
The row carries the AX backend preference=<auto|hammerspoon|applescript> message; when the preference is anything other than
hammerspoon, the row also adds a hint pointing operators back to AGENTS_MACOS_AGENT_AX_BACKEND and
preflight --include-probes.
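To compare backend modes before a failure occurs, preflight can be run once per preference. The sketch below assumes only the AGENTS_MACOS_AGENT_AX_BACKEND variable and its accepted values as described in this document; AGENT_BIN and preflight_with_backend are hypothetical names for illustration.

```shell
# AGENT_BIN is an assumption for this sketch (not a documented variable).
AGENT_BIN="${AGENT_BIN:-macos-agent}"

# Run preflight with an explicit AX backend preference so that the
# ax_backend_capabilities row reflects the mode being checked.
# Accepted values follow this document: auto, hammerspoon (hs), applescript (jxa).
preflight_with_backend() {
  backend=$1
  case "$backend" in
    auto|hammerspoon|hs|applescript|jxa) ;;
    *)
      echo "error: unknown AX backend preference: $backend" >&2
      return 2
      ;;
  esac
  AGENTS_MACOS_AGENT_AX_BACKEND="$backend" \
    "$AGENT_BIN" --format json preflight --include-probes
}
```

For example, `preflight_with_backend hammerspoon` followed by `preflight_with_backend auto` gives two JSON payloads whose backend rows can be compared side by side.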
Reliability Boundaries and Practices
Desktop UI automation is inherently brittle due to animation timing, focus drift, and app responsiveness. Use these defaults for better stability:
- Always activate context before input: window activate ... --wait-ms 1000
- Add small waits around click chains: input click ... --pre-wait-ms 100 --post-wait-ms 100
- Enable retries for transient failures: --retries 2 --retry-delay-ms 150
- Keep timeouts explicit for slow apps: --timeout-ms 5000
- Use wait app-active / wait window-present before mutating actions.
- Prefer ax click/type first, then opt in to fallback flags when app AX trees are unstable.
- AX backend selection defaults to auto (Hammerspoon hs CLI first, AppleScript JXA fallback for ax list/click/type).
  - Override with AGENTS_MACOS_AGENT_AX_BACKEND=hammerspoon|applescript|auto.
  - Aliases hs and jxa are also accepted (mapped to hammerspoon and applescript respectively).
  - Extended AX commands (attr, action, session, watch) always require Hammerspoon; the applescript and auto paths do not provide a JXA fallback for them.
  - In deterministic test mode (AGENTS_MACOS_AGENT_TEST_MODE=1) without an explicit override, the resolver picks applescript to keep the JXA stub deterministic.
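The first five practices above can be combined into one gated click chain. This is a rough sketch using only flags listed under Command Surface and Global Flags; AGENT_BIN and click_in_app are hypothetical names, and the wait and retry values are the ones suggested above rather than tuned defaults.

```shell
# AGENT_BIN is an assumption for this sketch (not a documented variable).
AGENT_BIN="${AGENT_BIN:-macos-agent}"

# Activate the app, wait until it is active, then click with small
# pre/post waits. The chain stops at the first step that fails.
# Usage: click_in_app <app name> <x> <y>
click_in_app() {
  app=$1
  x=$2
  y=$3
  "$AGENT_BIN" --retries 2 --retry-delay-ms 150 --timeout-ms 5000 \
    window activate --app "$app" --wait-ms 1000 || return $?
  "$AGENT_BIN" wait app-active --app "$app" --timeout-ms 5000 || return $?
  "$AGENT_BIN" --retries 2 --retry-delay-ms 150 \
    input click --x "$x" --y "$y" --pre-wait-ms 100 --post-wait-ms 100
}
```

Calling `click_in_app Finder 200 120` would then run activation, the app-active gate, and the click in order, returning the exit code of whichever step failed.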
Command Decision Matrix (AX/Input/Wait/Fallback/Backend)
Use this matrix to pick commands consistently. Start from the decision row, then use the mapped troubleshooting row on failure.
| Decision ID | When | Command choice (ax/input/wait) | Fallback policy | Backend policy | Troubleshooting row |
|---|---|---|---|---|---|
| D1 | Target element is discoverable in AX tree | ax list -> ax click / ax type; gate with wait app-active and (if needed) wait window-present | Keep fallback flags off first | auto default is preferred; see AX Backend Capability Matrix | T3, T5 |
| D2 | AX selector exists but can be unstable across reruns | Same as D1, plus --allow-coordinate-fallback or --allow-keyboard-fallback; keep wait gates explicit | Opt in per command (ax click/type only) | Keep auto so ax click/type can fall back to JXA when Hammerspoon is unavailable | T4, T5 |
| D3 | AX path is unavailable for the target app | window activate + input click / input type / input hotkey; use wait app-active/window-present before mutation | No AX fallback path; use coordinate/keyboard input directly | Backend-independent for pure input flow | T1, T2 |
| D4 | Need extended AX operations (attr, action, session, watch) | Use ax attr/action/session/watch commands; add wait gate before mutating action | No fallback support for extended AX commands | Requires Hammerspoon runtime support (see AX Backend Capability Matrix) | T5 |
| D5 | Text entry depends on deterministic keyboard layout | input-source current -> input-source switch --id <id> -> ax type or input type | Prefer paste/submit flow when IME variance is high | Backend-independent for input-source; AX typing still follows D1/D2 backend rules | T6 |
This AX-first + fallback policy avoids brittle coordinate-only flows while keeping a reliable escape hatch.
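The D1-then-D2 progression can be sketched as a single helper: try the gated AX click with fallback flags off, and opt in to coordinate fallback only if that attempt fails. AGENT_BIN and ax_click_with_fallback are hypothetical names, and the sketch assumes that --app and --title-contains are accepted by ax click as target and selector flags in the same way they are listed for ax list.

```shell
# AGENT_BIN is an assumption for this sketch (not a documented variable).
AGENT_BIN="${AGENT_BIN:-macos-agent}"

# D1: gated AX click with fallback flags off.
# D2: rerun once with --allow-coordinate-fallback if the first attempt fails.
# Usage: ax_click_with_fallback <app name> <title substring>
ax_click_with_fallback() {
  app=$1
  title=$2
  if "$AGENT_BIN" --format json ax click --app "$app" --title-contains "$title" \
       --gate-app-active --gate-ax-present; then
    return 0
  fi
  "$AGENT_BIN" --format json ax click --app "$app" --title-contains "$title" \
    --gate-app-active --gate-ax-present --allow-coordinate-fallback
}
```

Keeping the fallback in a second, explicit invocation means the JSON result of the rerun can be checked for used_coordinate_fallback (see row T4) without changing the default behavior of the first attempt.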
Debug Bundle Triage Flow
Copy-paste triage flow to collect deterministic artifacts after a flaky or failed run:
OUT="/out/macos-agent-debug-"
# 1) capture debug bundle + artifact index
# 2) inspect artifact index and partial failures
# 3) optional selector-frame screenshot for visual targeting proof
Artifact index notes:
- result.artifact_index_path points to the canonical artifact index JSON.
- result.partial_failure=true means some artifacts failed but bundle capture still completed.
- Each artifact entry records id, ok, path, and error for fast triage routing.
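As one possible way to act on result.partial_failure, the sketch below captures a bundle for the active window and reports whether it was complete. AGENT_BIN, capture_debug_bundle, and the bundle-result.json file name are hypothetical; the grep is a crude text match on the JSON payload, and a JSON-aware tool would be more robust.

```shell
# AGENT_BIN is an assumption for this sketch (not a documented variable).
AGENT_BIN="${AGENT_BIN:-macos-agent}"

# Capture a debug bundle into <out_dir>, save the JSON payload next to it,
# and print "partial" or "complete" based on result.partial_failure.
# Usage: capture_debug_bundle <out_dir>
capture_debug_bundle() {
  out_dir=$1
  mkdir -p "$out_dir" || return 1
  # bundle-result.json is a file name chosen for this sketch only.
  "$AGENT_BIN" --format json debug bundle --active-window --output-dir "$out_dir" \
    >"$out_dir/bundle-result.json" || return $?
  if grep -Eq '"partial_failure"[[:space:]]*:[[:space:]]*true' "$out_dir/bundle-result.json"; then
    echo "partial"
  else
    echo "complete"
  fi
}
```

When the helper prints partial, the per-artifact ok and error fields in the artifact index are the next place to look.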
Deterministic Test Mode
Set AGENTS_MACOS_AGENT_TEST_MODE=1 to run with deterministic fixtures and without controlling the real desktop. This mode is used by
CI-safe integration tests.
Opt-in Real macOS E2E Checks
crates/macos-agent/tests/e2e_real_macos.rs contains real-desktop checks for:
- TCC signal quality in preflight (Accessibility/Automation statuses + hints)
- focus drift detection path for activation + wait app-active
crates/macos-agent/tests/e2e_real_apps.rs contains app workflow checks for:
- Finder activation + window presence + navigation hotkeys + screenshot evidence
- Arc YouTube flow (open home, click 3 videos, play/pause, comment checkpoint)
- Spotify flow (UI track click, play/pause toggles, player-state probe)
- Cross-app Arc↔Spotify focus recovery and matrix artifact index output
These checks are disabled by default and require explicit opt-in:
MACOS_AGENT_REAL_E2E=1
MACOS_AGENT_REAL_E2E=1 MACOS_AGENT_REAL_E2E_MUTATING=1 MACOS_AGENT_REAL_E2E_APP=Finder \
MACOS_AGENT_REAL_E2E=1 MACOS_AGENT_REAL_E2E_MUTATING=1 MACOS_AGENT_REAL_E2E_APPS=finder \
MACOS_AGENT_REAL_E2E=1 MACOS_AGENT_REAL_E2E_MUTATING=1 MACOS_AGENT_REAL_E2E_APPS=arc,spotify,finder \
MACOS_AGENT_REAL_E2E_PROFILE=default-1440p \
Real-app E2E environment variables:
- MACOS_AGENT_REAL_E2E=1: enable real desktop tests.
- MACOS_AGENT_REAL_E2E_MUTATING=1: allow mutating desktop actions (click/type/hotkey).
- MACOS_AGENT_REAL_E2E_APPS=arc,spotify,finder: select app subset in deterministic order.
  - Unsupported app names are treated as configuration errors (fail fast).
- MACOS_AGENT_REAL_E2E_PROFILE=default-1440p: choose coordinate profile fixture.
- MACOS_AGENT_REAL_E2E_INPUT_SOURCE=com.apple.keylayout.ABC (or abc): optional; if set, tests switch to the target input source once via im-select before text-entry flows.
- MACOS_AGENT_REAL_E2E_STEP_TIMEOUT_MS=15000: optional per-step timeout guard for real-app helper commands.
- MACOS_AGENT_REAL_E2E_ITERATIONS=5: optional short-loop repetition count for matrix runs.
Input-method notes for reliability:
- Arc YouTube navigation uses address-bar focus + clipboard paste + Return (not per-key character typing), then verifies the active URL contains youtube.com and is not a Google search URL.
- Spotify search input uses clipboard paste (Cmd+A + Cmd+V) and then Return, avoiding IME-dependent character typing.
- If you want deterministic keyboard layout, install im-select (brew install im-select) and set MACOS_AGENT_REAL_E2E_INPUT_SOURCE=abc.
- You can verify/switch layout directly with:
  - macos-agent --format json input-source current
  - macos-agent --format json input-source switch --id abc
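Outside the E2E suite, the same layout-then-type order (decision row D5) might be scripted as below. This is a minimal sketch: AGENT_BIN and type_with_layout are hypothetical names, and it assumes im-select is installed so that input-source switch can succeed.

```shell
# AGENT_BIN is an assumption for this sketch (not a documented variable).
AGENT_BIN="${AGENT_BIN:-macos-agent}"

# Switch to a known input source first, then type and submit the text.
# Typing is skipped if the switch fails, so text is never entered under
# an unexpected layout.
# Usage: type_with_layout <source id> <text>
type_with_layout() {
  source_id=$1
  text=$2
  "$AGENT_BIN" --format json input-source switch --id "$source_id" || return $?
  "$AGENT_BIN" --format json input type --text "$text" --submit
}
```

For instance, `type_with_layout abc "hello"` switches to the abc source and then sends the text followed by submit.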
Real-app artifact notes:
- Every real-app scenario writes steps.jsonl and step-summary.json under its artifact directory.
- artifact-index.json includes per-scenario step_ledger_path, failing_step_id, and last_successful_step_id.
- Real-app checks are manual/local validation flows and should not be included in default CI jobs.
Immediate Feedback Loop
Workflow 1: readiness then action probe
Workflow 2: machine-parseable failure triage
# Read latest trace in AGENT_HOME/out/macos-agent-trace/
Workflow 3: iterate with scenario file + profile checks
Troubleshooting matrix
Use the Decision ID from Command Decision Matrix to choose the row quickly.
| ID | Symptom | Next command | What to inspect | Decision row |
|---|---|---|---|---|
| T1 | not authorized or Apple Events failures | macos-agent --format json preflight --include-probes | error.hints, Automation/Accessibility rows | D3 |
| T2 | Flaky click/input behavior | macos-agent --trace --error-format json input click ... | latest trace JSON (attempts_used, timeout/retry policy) | D3 |
| T3 | AX selector no match / ambiguous match | macos-agent --format json ax list --app <name> --role <AXRole> --title-contains <text> | node candidates (node_id, role, title, identifier) and refine selector / --nth | D1 |
| T4 | AX press/type fails but coordinate/keyboard path should continue | rerun with ax click --allow-coordinate-fallback or ax type --allow-keyboard-fallback | whether used_coordinate_fallback / used_keyboard_fallback is true in JSON result | D2 |
| T5 | Hammerspoon AX backend unavailable | hs -t 1 -q -c 'return "ok"' | ensure Hammerspoon is running and require('hs.ipc') is enabled, or keep backend auto for JXA fallback | D1, D2, D4 |
| T6 | Input source mismatch before typing | macos-agent --format json input-source current then ... switch --id abc | current source id and switch result (switched=true) | D5 |
| T7 | Trace enabled but command does not start | macos-agent --trace --trace-dir <path> --error-format json preflight | trace.write error and writable-path hint | D3 |
| T8 | Real-app scenario failed mid-flow | run target e2e_real_apps command with --nocapture | steps.jsonl, step-summary.json, artifact-index.json | D1, D2, D3 |
| T9 | Profile coordinate drift | macos-agent profile validate --file <profile.json> | key-path validation errors and bounds issues | D3 |