rsclaw-computer 0.1.0

Computer crate for RsClaw — internal workspace crate, not for direct use

Computer-use subsystem — RsClaw's GUI agent core.

Architecture (5 layers, top-down):

Layer A. Third-party primitives (enigo + xcap): Native cross-platform input synthesis and screen capture. Replaces the previous shell-based path (cliclick / PowerShell / sips / screencapture) with ~100x faster native APIs.

Layer B. Operator trait (operators/): Platform abstraction. Implementations:
- NativeOperator: desktop (Mac/Win/Linux) via enigo+xcap
- BrowserOperator: bridge to the web_browser subsystem (CDP)
- (future) AdbOperator: Android

Each operator self-describes its capabilities via action_spaces() so the system prompt is built dynamically.
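To illustrate how self-described capabilities can drive prompt construction, here is a minimal sketch. Only the names NativeOperator, BrowserOperator and action_spaces() come from the description above; the trait shape, the name() method, the action strings and build_action_prompt are assumptions made for illustration and may not match the crate's actual API.

```rust
// Hypothetical sketch: the trait shape and action strings are assumptions,
// not the crate's real definitions.

/// Platform abstraction: each operator reports the actions it supports.
trait Operator {
    /// Short label used in the prompt header (assumed, not from the crate).
    fn name(&self) -> &'static str;
    /// Action signatures this operator is able to execute.
    fn action_spaces(&self) -> Vec<&'static str>;
}

struct NativeOperator;
struct BrowserOperator;

impl Operator for NativeOperator {
    fn name(&self) -> &'static str {
        "native"
    }
    fn action_spaces(&self) -> Vec<&'static str> {
        vec!["click(point)", "type(content)", "hotkey(key)"]
    }
}

impl Operator for BrowserOperator {
    fn name(&self) -> &'static str {
        "browser"
    }
    fn action_spaces(&self) -> Vec<&'static str> {
        vec!["click(point)", "type(content)", "navigate(url)"]
    }
}

/// Build the action section of a system prompt from whatever the
/// operator advertises, so no per-platform prompt text is hard-coded.
fn build_action_prompt(op: &dyn Operator) -> String {
    let mut out = format!("Operator: {}\nAvailable actions:\n", op.name());
    for action in op.action_spaces() {
        out.push_str("- ");
        out.push_str(action);
        out.push('\n');
    }
    out
}
```

With this shape, adding a new operator only requires a new impl block; the prompt builder picks up its actions without further changes.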

Layer C. Driver (driver.rs, parser.rs, prompt.rs): Model-agnostic AI loop. VlmDriver works with any vision model that follows the Thought/Action format (UI-TARS 1.0/1.5, Doubao, GPT-4o, Claude vision, Qwen-VL, ...). Coordinate parser is format-tolerant (4 formats: <|box_start|>, , (x,y), [x1,y1,x2,y2]).
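One way a format-tolerant coordinate parser could work is sketched below. It is not the crate's parser.rs: the function name parse_point is hypothetical, it assumes non-negative integer pixel coordinates, and it only covers the <|box_start|>, (x,y) and [x1,y1,x2,y2] shapes listed above. The idea is to ignore the wrapper syntax entirely, collect the integers, and treat two numbers as a point and four as a box whose centre is the target.

```rust
/// Hypothetical helper: extract a click target from a model's action text.
/// Two integers are read as a point; four are read as a box and reduced
/// to its centre. Anything else yields None.
fn parse_point(raw: &str) -> Option<(i32, i32)> {
    // Split on every non-digit so wrappers such as "<|box_start|>",
    // parentheses and brackets are all ignored.
    let nums: Vec<i32> = raw
        .split(|c: char| !c.is_ascii_digit())
        .filter(|s| !s.is_empty())
        .map(|s| s.parse::<i32>().ok())
        .collect::<Option<Vec<i32>>>()?; // reject values that overflow i32

    match nums.as_slice() {
        [x, y] => Some((*x, *y)),
        // Midpoint written as a + (b - a) / 2 to avoid overflow on a + b.
        [x1, y1, x2, y2] => Some((x1 + (x2 - x1) / 2, y1 + (y2 - y1) / 2)),
        _ => None,
    }
}
```

A real parser would also need to handle fractional or normalized coordinates and actions that carry other numeric arguments, which this sketch deliberately leaves out.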

Layer D. Permission gate (permission.rs): Pre-execution consent flow. Before any UI loop runs, the backend emits a PermissionRequest event; the desktop UI surfaces a modal ("RsClaw is about to control WeChat, ~10 steps") and the user chooses one of four responses: grant once, grant for the session, grant always (per app), or deny. Decisions persist in redb.
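The four consent outcomes can be modelled roughly as follows. This is a sketch under stated assumptions: the Decision and PermissionGate types and their methods are hypothetical, and an in-memory HashSet stands in for the redb-backed store, so nothing here reflects the actual contents of permission.rs.

```rust
use std::collections::HashSet;

/// The four answers the user can give to a permission modal (names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Once,
    Session,
    AlwaysForApp,
    Deny,
}

/// Hypothetical gate. `persistent_grants` is an in-memory stand-in for
/// the persistent store described in the text.
#[derive(Default)]
struct PermissionGate {
    session_grants: HashSet<String>,
    persistent_grants: HashSet<String>,
}

impl PermissionGate {
    /// True if an earlier grant already covers this app, so no modal is needed.
    fn is_pre_authorized(&self, app: &str) -> bool {
        self.session_grants.contains(app) || self.persistent_grants.contains(app)
    }

    /// Record the user's answer and report whether the pending run may proceed.
    fn record(&mut self, app: &str, decision: Decision) -> bool {
        match decision {
            // A one-off grant allows this run but stores nothing.
            Decision::Once => true,
            Decision::Session => {
                self.session_grants.insert(app.to_string());
                true
            }
            Decision::AlwaysForApp => {
                self.persistent_grants.insert(app.to_string());
                true
            }
            Decision::Deny => false,
        }
    }

    /// Session-scoped grants are dropped when the session ends.
    fn end_session(&mut self) {
        self.session_grants.clear();
    }
}
```

The caller would check is_pre_authorized first and only emit a permission request when it returns false.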

Layer E. App rules (app_rules.rs, runtime data): Plain markdown files in tools/computer_use/app-rules/. Loaded at runtime, matched by keyword to the user's instruction, and injected into the system prompt. Adding a new app's automation knowledge does NOT require Rust changes.
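The keyword matching and prompt injection described above might look like the sketch below. The AppRule struct, its fields, and both function names are assumptions for illustration; reading the markdown files from tools/computer_use/app-rules/ is omitted, and how keywords are actually declared in those files is not specified in the source.

```rust
/// Hypothetical in-memory form of one app-rules markdown file.
struct AppRule {
    app: String,
    keywords: Vec<String>,
    markdown: String,
}

/// Return the rules whose keywords appear in the instruction
/// (case-insensitive substring match; empty keywords never match).
fn matching_rules<'a>(rules: &'a [AppRule], instruction: &str) -> Vec<&'a AppRule> {
    let haystack = instruction.to_lowercase();
    rules
        .iter()
        .filter(|rule| {
            rule.keywords
                .iter()
                .any(|k| !k.is_empty() && haystack.contains(&k.to_lowercase()))
        })
        .collect()
}

/// Append every matching rule's markdown to the base system prompt.
fn inject_rules(base_prompt: &str, rules: &[AppRule], instruction: &str) -> String {
    let mut prompt = base_prompt.to_string();
    for rule in matching_rules(rules, instruction) {
        prompt.push_str("\n\n## App rules: ");
        prompt.push_str(&rule.app);
        prompt.push('\n');
        prompt.push_str(&rule.markdown);
    }
    prompt
}
```

Because the rules are plain data, supporting a new app in this sketch means adding another AppRule value (in practice, another markdown file) rather than changing any matching code.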