x0x 0.19.46 - Docs.rs

# x0x exec — remote command execution over the mesh (Tier 1)

**Status:** Tier-1 implementation in progress — protocol, ACL, daemon routing, REST/CLI, diagnostics, and JSONL audit are wired behind an opt-in ACL.
**Filed:** 2026-05-01
**Owner:** dev team (assigned by lead)
**Trigger:** Replace SSH-per-call in `e2e_vps.sh` with a daemon-native execution path. Generalise the existing `tests/runners/x0x_test_runner.py` control-plane primitive into a first-class daemon feature.
**Companion doc:** [`x0x-terminal.md`](x0x-terminal.md) (Tier 2, deferred)

---

## 1. Goal

Allow a local x0x agent to run a strictly-allowlisted, non-interactive command on a remote x0x daemon, with stdout/stderr/exit-code streamed back over the existing direct-message channel.

**In scope (this doc):** non-interactive command execution. No PTY, no shell interpolation, no signal forwarding from the client beyond cancel.

**Out of scope:** interactive terminal, full shell, file editing, port forwarding. See `x0x-terminal.md` for the Tier 2 design that may or may not follow.

**Why now:** the existing `e2e_vps.sh` is dominated by SSH RTT to Singapore/Sydney (~4 s per call). The mesh-driven `e2e_vps_mesh.py` solved this for DM testing by running through x0x's own pubsub. `x0x exec` is the same trick generalised — drop the SSH dependency from the diagnostic and management paths too.

## 2. Constraints (locked by lead, 2026-05-01)

These are not open for negotiation in v1:

1. **ACL keys on (AgentId, MachineId) pairs.** A stolen agent key on a different machine is rejected at the trust layer before reaching the ACL check.
2. **Strict argv allowlist.** Each ACL entry specifies the exact argv vector the agent may execute. Limited templating (see §6.2). No regex, no shell globs.
3. **No shell interpolation, ever.** The remote daemon calls `tokio::process::Command::new(argv[0]).args(&argv[1..])`. Never `/bin/sh -c`. Never `bash -c`.
4. **Hard caps on output, duration, concurrency.** Cap breaches emit a `Warning` frame to the requester *and* are logged to `/diagnostics/exec` so an operator interrogating the remote machine sees them.
5. **Audit trail in a CRDT TaskList** (in addition to a local log file). Append-only, signed, replicated to the requesting agent.
6. **Client disconnect → SIGTERM** the remote process within 5 s, then SIGKILL.
7. **ACL lives at `/etc/x0x/exec-acl.toml`** and changes require a full daemon restart. No hot-reload.

## 3. Architecture

```text
local x0x CLI
  ↓ HTTP (POST /exec/run)
local x0xd
  ↓ signed/encrypted gossip-DM frames (ML-KEM + ChaCha20-Poly1305, ACK'd)
remote x0xd
  ↓ trust check: verified sender + TrustDecision::Accept
  ↓ ACL check: AgentId + MachineId + argv match
  ↓ tokio::process::Command::new(argv[0]).args(&argv[1..])
remote child process
```

No new transport. No new crypto. Reuses the existing direct-message envelope, which already gives us:

- ML-DSA-65 signature verification of the requester's `AgentId`.
- Trust-evaluated delivery (`TrustEvaluator`).
- ACK'd, in-order delivery on a per-peer channel.
- `/diagnostics/dm` visibility for the underlying transport.

The exec service is a new direct-message *kind* that the existing DM dispatcher routes to a new handler.

### 3.1 Why direct DMs and not a new QUIC stream

For Tier 1, the volume is small (≤16 MB stdout per session, ≤32 concurrent sessions) and latency is dominated by command execution time, not framing overhead. The existing DM path works. If profiling later shows the ACK-per-frame overhead is the bottleneck for streaming stdout, we can split: control frames over DMs, bulk stdout over a side QUIC unidirectional stream (mirroring how file transfer works). Don't pre-optimise.

The Tier 2 interactive path *will* need bidirectional QUIC streams for backpressure correctness — that's the whole point of separating the two tiers.

## 4. Wire protocol

Exec uses a stable payload prefix (`x0x-exec-v1\\0`) carried inside the existing encrypted DM plaintext. The prefix lets `dm_inbox` route exec frames before generic `/direct/events` fan-out without a backwards-incompatible envelope change. The bytes after the prefix are a bincode-encoded `ExecFrame`:

```rust
#[derive(Serialize, Deserialize)]
pub enum ExecFrame {
    /// Client → server: kick off a session.
    Request {
        request_id: Uuid,           // client-allocated, unique per local agent
        argv: Vec<String>,          // tokens, no shell metacharacters
        stdin: Option<Vec<u8>>,     // ≤ max_stdin_bytes (server cap)
        timeout_ms: u32,            // clamped to max_duration_secs
        cwd: Option<String>,        // v1 rejects requester-controlled cwd
    },

    /// Server → client: process spawned, here is its OS pid.
    Started { request_id: Uuid, pid: u32 },

    /// Server → client: stdout/stderr chunk. seq is monotonic per stream.
    Stdout { request_id: Uuid, seq: u32, data: Vec<u8> },
    Stderr { request_id: Uuid, seq: u32, data: Vec<u8> },

    /// Server → client: a soft event the operator should see (cap warning,
    /// truncation, etc). Does not terminate the session.
    Warning { request_id: Uuid, kind: WarningKind, message: String },

    /// Server → client: terminal frame. Always the last frame for a session.
    Exit {
        request_id: Uuid,
        code: Option<i32>,          // None if killed by signal
        signal: Option<i32>,        // Unix signal number, or None
        duration_ms: u64,
        stdout_bytes_total: u64,    // including any truncated bytes
        stderr_bytes_total: u64,
        truncated: bool,            // true if any cap was hit
        denial_reason: Option<DenialReason>,  // Some(_) → request never ran
    },

    /// Client → server: renew the short session lease.
    LeaseRenew { request_id: Uuid },

    /// Client → server: cancel an in-flight session.
    Cancel { request_id: Uuid },
}

#[derive(Serialize, Deserialize)]
pub enum WarningKind {
    StdoutCapHit,                   // bytes-per-stream limit reached
    StderrCapHit,
    DurationApproachingCap,         // emitted at warn_duration_secs
    StdoutApproachingCap,           // emitted at warn_stdout_bytes
}

#[derive(Serialize, Deserialize)]
pub enum DenialReason {
    ExecDisabled,                   // [exec].enabled = false
    AgentMachineNotInAcl,           // (agent_id, machine_id) pair has no entry
    ArgvNotAllowed,                 // argv didn't match any allowlist entry
    StdinTooLarge,
    TimeoutTooLarge,
    CwdNotAllowed,
    ConcurrencyLimitReached,
    ShellMetacharInArgv,            // an argv token contained a forbidden char
}
```

Frame ordering on a single direct channel is preserved by the existing DM path (per-peer mpsc), so `seq` exists only for client-side correlation across a possible reconnect (future-proofing — not needed for v1 but free to include).

## 5. Authorization flow

Server-side, for every received `ExecFrame::Request`:

1. **MachineId check.** The DM was delivered over a QUIC connection; `network.rs` already binds the connection to a verified peer `MachineId`. If the connection's MachineId mismatches the agent's announced MachineId, reject at the DM layer (existing `TrustEvaluator::RejectMachineMismatch` path) — never reaches exec.
2. **AgentId check.** The DM envelope is ML-DSA-65-signed by the sender's agent key; this is verified in the existing DM path. If signature fails → existing path rejects. The exec handler trusts that the AgentId on a delivered DM is authentic.
3. **ACL lookup.** Find the first `[[exec.allow]]` entry matching `(agent_id, machine_id)`. If none → respond with `Exit { denial_reason: Some(AgentMachineNotInAcl), ... }`. Tier 1 returns structured denial reasons on-wire for operator/testability; every denial is also recorded in the remote JSONL audit log and `/diagnostics/exec`.
4. **argv match.** Walk the matched entry's commands, accept the first match (§6.2). On no match → `ArgvNotAllowed`.
5. **Metachar reject.** Even though we don't shell out, every argv token is checked for forbidden characters: `;`, `|`, `&`, `>`, `<`, `` ` ``, `$`, newline, null byte. Catches operator confusion and gives one extra layer if the allowlist is mis-authored. On hit → `ShellMetacharInArgv`.
6. **Caps.** stdin size, timeout, and v1 cwd rejection. On breach → corresponding `DenialReason`.
7. **Concurrency check.** If active sessions for this AgentId ≥ `max_concurrent_per_agent`, or total ≥ `max_concurrent_total`, deny.
8. **Spawn.** `tokio::process::Command` with `kill_on_drop(true)`, `stdin/stdout/stderr` piped, `current_dir(default_cwd)` when configured, environment scrubbed to a minimal allowlist (see §6.3).
9. **Stream.** Read stdout/stderr concurrently, send `Stdout`/`Stderr` frames as data arrives. Watch caps. Emit `Warning` at the warn thresholds.
10. **Terminate.** On exit (or cancel, or duration cap), send `Exit` with full stats. Audit-log the close.

The on-wire denial response is **always** the same shape (an `Exit` frame with `denial_reason: Some(_)`); only the local audit log distinguishes the cases.

## 6. ACL file format

`/etc/x0x/exec-acl.toml` (Linux) / `/usr/local/etc/x0x/exec-acl.toml` (macOS). Override path with `x0xd --exec-acl <path>` (intended for tests).

If file is missing → exec is disabled (`enabled = false` is the implicit default).
If file is present but malformed → daemon **refuses to start**. Fail-closed.

### 6.1 Schema

```toml
[exec]
enabled = false                    # default; must be explicit-true to enable

# Hard caps. Server-side enforced. Requests exceeding any cap are denied
# (for stdin/timeout) or truncated (for stdout/stderr).
max_stdout_bytes        = 16_777_216     # 16 MB
max_stderr_bytes        = 16_777_216     # 16 MB
max_stdin_bytes         = 1_048_576      # 1 MB
max_duration_secs       = 300            # 5 min
max_concurrent_per_agent = 4
max_concurrent_total     = 32

# Warning thresholds. Emit a Warning frame and log to /diagnostics/exec
# when crossed. Process keeps running.
warn_stdout_bytes       = 8_388_608      # 8 MB (50% of cap)
warn_duration_secs      = 60             # 20% of default cap

# Default working directory if a request omits cwd. If unset and request
# omits cwd, the process inherits the daemon's cwd.
default_cwd             = "/var/lib/x0x"

# Audit log path. Always written. Required.
audit_log_path          = "/var/log/x0x/exec.log"

# Optional: CRDT TaskList ID to mirror audit entries to. If set, every
# exec event (Request, Started, Exit, Denial) appends a TaskItem to this
# list, replicated to the requesting agent automatically.
audit_tasklist_id       = "01HZX..."     # ULID of an existing TaskList

# --- ACL entries below ---

[[exec.allow]]
description  = "ops-laptop@admin (David)"   # human label, appears in audit
agent_id     = "abc123...64hex"
machine_id   = "def456...64hex"
# Optional: per-entry overrides for caps. None → use [exec] caps.
# max_duration_secs = 30

# Strict argv allowlist for THIS (agent, machine) pair.
# Each [[exec.allow.commands]] entry is one allowed call shape.
[[exec.allow.commands]]
argv = ["systemctl", "status", "x0xd"]      # exact match only

[[exec.allow.commands]]
argv = ["journalctl", "-u", "x0xd", "-n", "<INT>"]
# <INT> is a hardcoded template token: matches a positive integer
# (regex equivalent: ^[1-9][0-9]{0,5}$, max 999_999). No user-supplied
# regexes. Other tokens: <URL_PATH> = ^/[A-Za-z0-9/_.-]{0,256}$ (no "..").

[[exec.allow.commands]]
argv = ["curl", "-s", "http://127.0.0.1:12600<URL_PATH>"]

[[exec.allow.commands]]
argv = ["cat", "/etc/x0x/config.toml"]

# Multiple entries per agent are fine; first match wins.
[[exec.allow]]
description = "ci-runner@github (read-only)"
agent_id    = "ghi789..."
machine_id  = "jkl012..."

[[exec.allow.commands]]
argv = ["x0x", "diagnostics", "dm"]

[[exec.allow.commands]]
argv = ["x0x", "diagnostics", "gossip"]
```

### 6.2 argv matching algorithm

1. Lengths must match. `argv = ["a", "b"]` does not match a 3-token request.
2. Walk left-to-right. For each `(allow_token, request_token)`:
   - If `allow_token` is a literal string → must equal `request_token` byte-for-byte.
   - If `allow_token` is `<INT>` → `request_token` must match `^[1-9][0-9]{0,5}$` (1–999_999).
   - If `allow_token` is `<URL_PATH>` → `request_token` must match the URL_PATH regex above. If `<URL_PATH>` appears as a *suffix* of a literal token (e.g. `"http://127.0.0.1:12600<URL_PATH>"`), the request token must start with the literal prefix and the suffix must match the URL_PATH regex.
3. Only `<INT>` and `<URL_PATH>` exist as templates in v1. Any other `<...>` token in the ACL is a parse error (refuses daemon startup).
4. After argv match, every request token is **independently** checked against the metachar blacklist (§5.5). Defence in depth: even if `<URL_PATH>` regex has a bug, metachars are rejected separately.

This is deliberately restrictive. Operators wanting parameterised behaviour beyond `<INT>` / `<URL_PATH>` should write a wrapper script on the remote and allowlist *that* script, not extend the templating.

### 6.3 Environment scrubbing

The child process inherits a fixed minimal environment:

```text
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOME=<daemon's home>
LANG=C.UTF-8
LC_ALL=C.UTF-8
```

No other env vars from the daemon's environment leak through. The request cannot supply env vars in v1.

## 7. Hard cap behaviour

| Cap | Behaviour |
|---|---|
| `max_stdout_bytes` | Stop forwarding once total reaches cap. Send `Warning { kind: StdoutCapHit }`. Continue counting `stdout_bytes_total` for the final `Exit` frame, but discard data. Process keeps running. |
| `max_stderr_bytes` | Symmetric to stdout. |
| `max_stdin_bytes` | Request denied with `StdinTooLarge` before spawn. |
| `max_duration_secs` | At T=cap-5s, send SIGTERM. At T=cap, send SIGKILL. Send `Exit` with `signal: Some(SIGKILL)`. |
| `max_concurrent_per_agent` | Request denied with `ConcurrencyLimitReached`. |
| `max_concurrent_total` | Same. |
| `warn_stdout_bytes` | At first crossing, emit `Warning { kind: StdoutApproachingCap }`. Increment a counter visible in `/diagnostics/exec`. |
| `warn_duration_secs` | Same shape. |

Cap counters and last-N warning events are exposed at `GET /diagnostics/exec` (§9) so an operator interrogating a remote node sees a steady stream of these even if the requester ignores `Warning` frames.

## 8. Cancellation

Three triggers, one path:

1. **Client sends `ExecFrame::Cancel`** — the local CLI emits this on Ctrl-C, on `x0x exec --cancel <id>`, or on local timeout.
2. **Local x0xd loses its API client** — the HTTP/SSE/WebSocket connection from the CLI to local x0xd drops. Local x0xd auto-emits `Cancel` for every in-flight request from that client.
3. **Lease expiry** — the local daemon sends `LeaseRenew` while the API caller is connected. If renewals stop for the lease window, the remote session is cancelled. This covers gossip-DM delivery paths where no stable direct QUIC lifecycle event is visible.

On cancel, server: SIGTERM the child, wait 5 s, SIGKILL if still alive, send `Exit` (which the requester may not see if the channel is dead; that's fine — the request_id is locally garbage-collected after a 60 s grace period).

`tokio::process::Child::kill_on_drop(true)` is set as a final backstop in case the session task panics.

## 9. Diagnostics

`GET /diagnostics/exec` (mirrors `/diagnostics/dm`, `/diagnostics/gossip`). Bearer-token-protected. Returns:

```json
{
  "enabled": true,
  "active_sessions": 2,
  "active_per_agent": { "abc123...": 2 },
  "totals": {
    "requests_received": 18421,
    "requests_allowed": 18402,
    "requests_denied": 19,
    "denial_breakdown": {
      "agent_machine_not_in_acl": 4,
      "argv_not_allowed": 11,
      "shell_metachar_in_argv": 1,
      "concurrency_limit_reached": 3
    },
    "cap_breaches": {
      "stdout": 2,
      "stderr": 0,
      "duration": 1
    },
    "cap_warnings": {
      "stdout_approaching": 7,
      "duration_approaching": 14
    }
  },
  "recent_warnings": [
    {
      "ts": "2026-05-01T14:22:11Z",
      "kind": "StdoutApproachingCap",
      "agent_id": "abc123...",
      "request_id": "...",
      "argv_summary": "journalctl -u x0xd -n 999999",
      "bytes_at_warn": 8_400_000
    }
  ],
  "acl_summary": {
    "loaded_from": "/etc/x0x/exec-acl.toml",
    "loaded_at": "2026-05-01T09:00:00Z",
    "allow_entry_count": 4,
    "command_entry_count": 17
  }
}
```

The full ACL contents — agent IDs, machine IDs, full argvs — are deliberately *not* in the diagnostics output. They live in the file the operator owns.

CLI: `x0x diagnostics exec`.

## 10. Audit trail

Two layers, both required:

### 10.1 Local file audit

Append-only JSONL at `audit_log_path`. One line per event. Events:

```json
{"ts":"...","event":"request","request_id":"...","agent_id":"...","machine_id":"...","argv":["..."],"matched_acl":"ops-laptop@admin (David)","stdin_bytes":0,"timeout_ms":30000}
{"ts":"...","event":"started","request_id":"...","pid":12345}
{"ts":"...","event":"warning","request_id":"...","kind":"StdoutApproachingCap","bytes":8400000}
{"ts":"...","event":"exit","request_id":"...","code":0,"signal":null,"duration_ms":423,"stdout_bytes":1024,"stderr_bytes":0,"truncated":false}
{"ts":"...","event":"denial","request_id":"...","agent_id":"...","machine_id":"...","argv":["..."],"reason":"ArgvNotAllowed"}
```

File is opened with `O_APPEND`, fsynced on each entry. Operator-rotatable via standard `logrotate`. The daemon does not rotate.

### 10.2 CRDT TaskList audit (v1.1 waiver)

Tier 1 makes the local JSONL file the authoritative audit. The ACL field `audit_tasklist_id` is parsed and retained so deployed configs will not need another schema change, but the CRDT TaskList mirror is explicitly waived from v1 acceptance and deferred to v1.1.

Planned `TaskItem` schema for the v1.1 mirror:

```text
title: "exec [allowed|denied] <argv_summary> ← <agent_short>@<machine_short>"
state: Done (denials never enter Empty/Claimed; they are born Done)
metadata (LWW-Register):
  request_id: ULID
  ts_iso: "..."
  argv: ["..."]
  exit_code: i32 | null
  duration_ms: u64
  stdout_bytes: u64
  stderr_bytes: u64
  truncated: bool
  denial_reason: string | null
  warnings: ["StdoutApproachingCap", ...]
```

When implemented, the TaskList will provide a queryable, replicated, signed audit timeline. If the configured TaskList does not exist or is unreachable, exec will keep working — the local file remains authoritative and the CRDT mirror is best-effort. A future `audit_tasklist_unreachable` counter should show up in `/diagnostics/exec`.

For v1, operators must use `audit_log_path` and `/diagnostics/exec`; `audit_tasklist_id` has no runtime effect beyond being exposed through config parsing.

## 11. CLI

```bash
# Synchronous one-shot. Stdout/stderr stream to local terminal.
# Exit code = remote exit code (or 255 on transport / denial / cap kill).
x0x exec <agent> -- <argv...>

# With timeout (clamped to remote max_duration_secs).
x0x exec <agent> --timeout 30 -- journalctl -u x0xd -n 100

# With stdin from a file.
x0x exec <agent> --stdin-file payload.bin -- some-tool --consume-stdin

# Cancel an in-flight request.
x0x exec <agent> --cancel <request_id>

# List local in-flight requests (this client only).
x0x exec sessions

# Server-side diagnostics on a remote machine.
x0x diagnostics exec                 # local daemon's exec stats
x0x exec <agent> -- x0x diagnostics exec   # remote daemon's exec stats
                                          # (requires that argv to be allowlisted)
```

Agent identifier accepts: full hex `agent_id`, contact short-name (existing contact-store lookup), or VPS host short-name (e.g. `saorsa-7` → resolved via existing contact-name mapping).

## 12. REST API

| Method | Path | CLI | Notes |
|---|---|---|---|
| POST | `/exec/run` | `x0x exec` | Body: `{ agent_id, argv, stdin_b64?, timeout_ms?, cwd? }`. Current implementation blocks and returns the aggregated result `{ code, signal, stdout_b64, stderr_b64, duration_ms, denial_reason, ... }`; SSE streaming can be added later without changing the wire frames. |
| POST | `/exec/cancel` | `x0x exec --cancel` | Body: `{ request_id }`. |
| GET | `/exec/sessions` | `x0x exec sessions` | List sessions originated by this local daemon. |
| GET | `/diagnostics/exec` | `x0x diagnostics exec` | Server-side diagnostics. |

All four are added to `src/api/mod.rs` endpoint registry. `x0x routes` reflects them automatically. API manifest version bumps; total endpoint count rises from 124 to 128.

## 13. Daemon flags & config

```text
x0xd --exec-acl <PATH>     Override default /etc/x0x/exec-acl.toml.
                           Intended for tests; production should use the default.
```

No new config knobs — exec is gated entirely by the ACL file's presence and `[exec].enabled`.

`x0xd --check` is extended to validate the ACL file (parse, schema, agent/machine ID hex format) without starting the daemon.

`x0xd --doctor` reports exec status: enabled/disabled, ACL path, allow-entry count.

## 14. Files to create

```text
src/exec/mod.rs                    # public API: run_remote_exec(...) on Agent
src/exec/protocol.rs               # ExecFrame enum, bincode coding
src/exec/acl.rs                    # TOML schema, parser, argv matcher, fail-closed loader
src/exec/service.rs                # server: dispatcher, spawn, stream, cap enforcement
src/exec/client.rs                 # client: send Request, collect frames, surface to caller
src/exec/audit.rs                  # local JSONL log + optional CRDT TaskList bridge
src/exec/diagnostics.rs            # ExecDiagnostics counter struct + recent_warnings ring buffer
src/api/exec_handlers.rs           # POST /exec/run (sync + SSE), /exec/cancel, /exec/sessions, /diagnostics/exec
src/cli/commands/exec.rs           # x0x exec, x0x exec --cancel, x0x exec sessions
                                   # x0x diagnostics exec
tests/exec_acl_unit.rs             # ACL parsing, argv matching (positive + negative)
tests/exec_caps_unit.rs            # cap-enforcement state machine
tests/exec_integration.rs          # local 2-daemon end-to-end, denial paths, cancel
tests/e2e_exec.sh                  # local 3-daemon: alice→bob allowed, charlie→bob denied
docs/exec.md                       # operator guide: deploying ACL, examples, troubleshooting
```

## 15. Files to modify

```text
src/dm.rs                          # add DmKind::Exec variant; route to exec::service
src/api/mod.rs                     # register 4 new endpoints in EndpointRegistry
src/cli/mod.rs                     # register exec subcommand
src/lib.rs                         # re-export Agent::run_remote_exec
src/bin/x0xd.rs                    # parse --exec-acl flag; load ACL on startup;
                                   # fail-closed if malformed; wire ExecService
docs/design/api-manifest.json      # add 4 endpoints; bump endpoint count
CLAUDE.md                          # mention /etc/x0x/exec-acl.toml + 4 endpoints
TEST_SUITE_GUIDE.md                # document e2e_exec.sh
```

Optional follow-up (separate PR after exec lands):

```text
tests/e2e_vps.sh                   # convert SSH-based curl probes to x0x exec
                                   # to demonstrate the testing-ergonomics win
```

## 16. Test plan

Every PR landing this feature must include all of:

- **Unit, ACL** (`exec_acl_unit.rs`):
  - well-formed file → loads
  - missing required field → daemon refuses to start
  - invalid hex in agent_id/machine_id → refuses to start
  - unknown `<TEMPLATE>` token → refuses to start
  - exact-match argv accepts only exact match
  - `<INT>` accepts `1`, `12345`, rejects `0`, `-1`, `1e5`, `1.0`, ``5`a``
  - `<URL_PATH>` accepts `/health`, `/foo/bar`, rejects `/..`, `/foo bar`, `/foo;ls`
  - shell metachar in any token rejects regardless of allowlist
  - length mismatch rejects

- **Unit, caps** (`exec_caps_unit.rs`):
  - stdout cap → truncates, emits `Warning`, sets `truncated: true`, process unkilled
  - duration cap → SIGTERM sent at T-5s, SIGKILL at T, `Exit { signal: Some(SIGKILL) }`
  - concurrent cap → 5th request from same agent denied with `ConcurrencyLimitReached`
  - warn threshold crossed once → counter increments by 1, even if multiple Stdout chunks straddle it

- **Integration** (`exec_integration.rs`):
  - alice → bob allowed argv: stdout/stderr captured, exit_code matches, audit log line written
  - alice → bob denied argv: `Exit { denial_reason: Some(ArgvNotAllowed) }`, no process spawned
  - alice → charlie denied (no ACL entry): `AgentMachineNotInAcl`
  - bob client disconnect mid-exec: bob's child receives SIGTERM within 5s
  - alice cancel: child SIGTERMed, `Exit` frame received with `signal: Some(SIGTERM)`
  - audit TaskList: configured ID receives a TaskItem on every event; alice (the requester) sees it via existing CRDT sync

- **E2E** (`tests/e2e_exec.sh`, local, no VPS):
  - 3-daemon (alice, bob, charlie). Bob has ACL allowing alice for `["echo", "<INT>"]` and `["printenv", "PATH"]`.
  - `x0x exec bob -- echo 42` → stdout `42`, exit 0
  - `x0x exec bob -- echo hello` → denied (template mismatch)
  - `x0x exec bob -- echo 99 ; ls` → denied (metachar in token, even though we don't shell)
  - `x0x exec bob -- rm -rf /` → denied (not in allowlist)
  - `x0x exec bob -- printenv PATH` → returns `/usr/local/sbin:...` (env scrub verified)
  - charlie tries same allowed argv → denied (`AgentMachineNotInAcl`)
  - `x0x diagnostics exec` on bob shows the right counter increments

- **Build-validator**: 0 warnings, fmt clean, clippy `-D warnings` clean.

- **Test-runner**: full nextest suite remains 100 % passing; new tests add ≥ 30 assertions.

## 17. Acceptance criteria

Tier 1 ships when **all** of these hold:

1. `cargo nextest run --all-features --workspace` green, with the new tests counted.
2. `tests/e2e_exec.sh` green on a local mesh, two consecutive runs.
3. `x0xd --check --exec-acl /tmp/bad.toml` exits non-zero with a specific parse error for each of: missing required field, invalid hex, unknown template token.
4. `x0xd` with no `/etc/x0x/exec-acl.toml` starts cleanly with `enabled=false` reflected in `/diagnostics/exec`.
5. A request that breaches `max_stdout_bytes` is reflected in `/diagnostics/exec` `cap_breaches.stdout` counter on the *remote* node.
6. Killing the local CLI mid-exec causes the remote child process to receive SIGTERM within 5 s (verified by inspecting `ps` on the remote).
7. CRDT TaskList audit mirroring is either implemented or explicitly waived to v1.1. For v1 this document records the waiver and JSONL remains authoritative.
8. `docs/exec.md` exists with a working "deploy this ACL on saorsa-7, run this command from your laptop" example using real but redacted IDs.

## 18. Non-goals (explicit)

- **No PTY, no interactivity.** Use Tier 2 if it ever ships.
- **No environment variables in requests.** Add later if a concrete need appears.
- **No file upload.** Use existing `/files/send`.
- **No port forwarding.** Out of scope; possibly never in scope.
- **No PAM / OS password auth.** The trust boundary is the x0x identity + machine pin, not OS credentials.
- **No hot-reload of the ACL.** Restart the daemon. This is intentional — hot-reload of a security policy is a footgun.
- **No "any" wildcard in argv.** If an operator wants flexible commands, they wrap them in a script and allowlist that.
- **No regex in the ACL.** `<INT>` and `<URL_PATH>` are the only templates. New templates require a code change and a test.
- **No SSH replacement for end users.** This feature is for the testing fleet and the operator. End-user shell access is a separate product question.

## 19. Open questions (defer until after first PR lands)

- Is local-file JSONL audit + optional CRDT TaskList the right shape, or should the CRDT be mandatory? Land file-only first, see how operators use it.
- Should `Exit` frames carry the *first 1 KiB of stdout* even when the channel was healthy, so the audit log has self-contained context without a second roundtrip? Possibly yes. Needs a `first_stdout_preview` field; mark for v1.1.
- Does the CRDT audit TaskList need a separate MLS group per (requester ↔ remote) pair, or does the existing daemon-wide group suffice? Existing group is fine for v1; revisit if multiple operators share a remote.

## 20. Cross-references

- Existing precedent: `tests/runners/x0x_test_runner.py` is a primitive form of this. The systemd service file `tests/runners/x0x-test-runner.service` already lives on every VPS.
- Trust model: `docs/primers/trust.md`, `src/trust.rs::TrustEvaluator`.
- DM pipeline this rides on: `docs/design/dm-over-gossip.md`, `src/dm.rs`, `src/direct.rs`.
- Diagnostics pattern: `docs/diagnostics.md`, `src/api/handlers.rs::diagnostics_*`.
- Tier 2 deferred design: [`x0x-terminal.md`](x0x-terminal.md).