secure-exec-sidecar 0.3.1

Native Secure Exec sidecar runtime
See `../CLAUDE.md` for crate-wide runtime and testing rules.

## env vs BARE wire — channel classification

Spawned language-host engines are configured two ways: the **BARE wire/structured request** (`protocol.rs` payloads, `CreateVmConfig`, and the per-engine `Start{Javascript,Wasm,Python}ExecutionRequest` structs in `crates/execution`) and the **`AGENTOS_*` env channel** (assembled in `prepare_guest_runtime_env` / `apply_wasm_limit_env` in `src/execution.rs`, read back by the engines/bridge). Every setting belongs to exactly one of three buckets:

1. **Process-wide / host / build / test → env.** Shared across all VMs, not per-VM configurable (e.g. `SECURE_EXEC_NODE`, `*_V8_BRIDGE_BUILD_SCRIPT`, pyodide index/cache URLs, all test/debug knobs). Leave on env.
2. **Per-VM bootstrap-before-wire → env (explicit carve-out).** Must exist at `exec` time *before* the wire/sync-RPC bridge is up: `AGENTOS_SANDBOX_ROOT`, the sync-RPC bridge fds (`AGENTOS_NODE_SYNC_RPC_ENABLE`/`_REQUEST_FD`/`_RESPONSE_FD`/`_DATA_BYTES`/`_WAIT_TIMEOUT_MS`), entrypoint/argv/payload (`AGENTOS_ENTRYPOINT`/`_GUEST_ENTRYPOINT`/`_GUEST_ARGV`/`_BOOTSTRAP_MODULE`, `_PYTHON_CODE`/`_PYTHON_FILE`, `_WASM_MODULE_PATH`). Keep on env; keep scrubbed from guest `process.env` and from `child_process` spawns.
3. **Per-VM runtime config → BARE wire.** Anything per-VM the established wire/bridge could carry: resource limits, isolation policy, virtualized identity. MUST ride the wire (a typed field on `CreateVmConfig`/`VmLimits` or the per-execution request), read by the engine from that field. New per-VM settings default here.
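
The three buckets above reduce to two questions: is the setting per-VM, and must it exist before the wire is up? The following is a minimal sketch of that decision, not code from this crate; `Channel`, `Scope`, `SettingTraits`, and `classify` are hypothetical names introduced only for illustration.

```rust
/// Channel a setting travels on. Illustrative type, not part of the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Channel {
    /// The `AGENTOS_*` env channel.
    Env,
    /// A typed field on the BARE wire / structured request.
    BareWire,
}

/// Whether a setting is shared by all VMs or configurable per VM.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    ProcessWide,
    PerVm,
}

/// The two properties that decide the bucket (hypothetical struct).
#[derive(Debug, Clone, Copy)]
pub struct SettingTraits {
    pub scope: Scope,
    /// True if the value must exist at `exec` time, before the
    /// wire/sync-RPC bridge is up.
    pub needed_before_wire: bool,
}

/// Returns `(bucket number, channel)` following the three-bucket rule.
pub fn classify(traits: SettingTraits) -> (u8, Channel) {
    match (traits.scope, traits.needed_before_wire) {
        // Bucket 1: process-wide / host / build / test settings stay on env.
        (Scope::ProcessWide, _) => (1, Channel::Env),
        // Bucket 2: per-VM bootstrap values are the explicit env carve-out.
        (Scope::PerVm, true) => (2, Channel::Env),
        // Bucket 3: every other per-VM setting rides the wire.
        (Scope::PerVm, false) => (3, Channel::BareWire),
    }
}
```

Under this sketch, a new per-VM resource limit (`Scope::PerVm`, not needed before the wire) lands in bucket 3, matching the rule that new per-VM settings default to the wire.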

**Dead-cap anti-pattern.** A value set on the wire and then silently re-emitted as an `AGENTOS_*` env knob is the dead-cap failure mode: the env knob can be wrong, stale, or never read while the wire value is ignored. This is exactly how `AGENTOS_WASM_MAX_STACK_BYTES` was set into env but never read. If a setting is on the wire, the engine reads it from the wire path; do not duplicate it onto env "just in case", and, per the versionless-lockstep rule, do not keep an env knob as a fallback.
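
As a rough illustration of avoiding a dead cap, the sketch below resolves a stack limit from a typed wire field only, with no env lookup anywhere in the path. `WasmExecutionLimitsSketch`, `DEFAULT_MAX_STACK_BYTES`, and `effective_max_stack_bytes` are hypothetical names, and the default value is an assumption; the crate's real `*ExecutionLimits` types may be shaped differently.

```rust
/// Hypothetical stand-in for a typed limits struct carried on the
/// per-execution request. Not the crate's actual type.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct WasmExecutionLimitsSketch {
    /// `None` means the host did not set a per-VM cap.
    pub max_stack_bytes: Option<u64>,
}

/// Assumed engine default, used only when the wire field is absent.
pub const DEFAULT_MAX_STACK_BYTES: u64 = 1 << 20;

/// Resolve the effective cap from the wire value alone.
///
/// There is deliberately no `std::env::var(...)` fallback here: a second
/// source of truth is what lets the env knob and the wire value drift apart.
pub fn effective_max_stack_bytes(limits: &WasmExecutionLimitsSketch) -> u64 {
    limits.max_stack_bytes.unwrap_or(DEFAULT_MAX_STACK_BYTES)
}
```

The design point is that the engine has exactly one input for the limit, so a cap set on the request cannot be shadowed or silently dropped by a stale environment variable.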

Migration status: **resource limits** (typed `*ExecutionLimits` on the execution request) and **virtualized identity** are migrated to the wire. For virtualized identity, `process.{pid,ppid,uid,gid}` is interpolated into the runtime shim, and `os.{cpus,totalmem,freemem,homedir,userInfo,…}` is served via the `__agentOSVirtualOs` global that the shim sets. **Isolation policy** (`AGENTOS_GUEST_PATH_MAPPINGS`, `EXTRA_FS_*_PATHS`, `ALLOWED_NODE_BUILTINS`, `LOOPBACK_EXEMPT_PORTS`, `WASM_PERMISSION_TIER`) is bucket 3 but still on env; migrating it is future work. Note: the guest module loader and the `os` module (`guestOs`) read their inputs at module-evaluation time; the runtime shim sets `__agentOSVirtualOs` early enough that this works. This is verified by `os_resource_limits_are_vm_scoped` in `tests/builtin_conformance.rs`, which asserts that the guest's `os.*` identity reflects the configured VM.

## Local Patterns

- `RequestPayload::Ext`, `ResponsePayload::ExtResult`, `EventPayload::Ext`, and sidecar callback `Ext` payloads are opaque to core sidecar code; dispatch only by namespace and leave inner payload decoding to the registered extension.
- `ExtensionContext` primitives should delegate to existing `NativeSidecar` ownership, process, event, and callback paths instead of giving extensions direct access to internal maps such as VM tables or ACP session state.
- Extension callbacks and events must stay transport-agnostic: do not expose stdio, socket, or browser `postMessage` details through the `Extension` trait or `ExtensionContext`.
- Stdio blocking-request interruption must stay extension-owned. Core stdio may call generic `Extension` hooks, but production secure-exec-sidecar code must not decode ACP payloads or depend on `agentos-protocol`.
- Sidecar-to-host callback protocol must stay agent-agnostic: use `HostCallback{callback_key}` for generic host callbacks, and keep toolkit-specific naming and schemas out of the core callback frame.
- Legacy ACP helpers under `tests/acp_legacy/` are fixtures only; production ACP behavior belongs in `crates/agentos-sidecar`, not `crates/sidecar/src`.
- Tool CLI `--json` and `--json-file` payloads in `src/tools.rs` must be validated against the registered host callback `input_schema` before building `HostCallbackRequest`; relying on the host callback to fail closed leaves non-TypeScript hosts and any pre-dispatch checks exposed to raw, unvalidated payload shapes.
- `net.poll` waits in `src/execution.rs` must stay explicitly bounded. The sync-RPC handler runs on the sidecar's main sync-RPC thread, so guest `wait_ms` values must be clamped via `clamp_javascript_net_poll_wait(...)` to the 50 ms ceiling; longer waits should return the currently observed socket state after the ceiling expires instead of blocking dispose/shutdown or unrelated VM work.
- `kill_process` signal parsing in `src/execution.rs` must stay aligned with the guest `child_process.kill(...)` bridge contract: accept the full 1..31 signal table plus common aliases (`SIGIOT` -> `SIGABRT`, `SIGPOLL` -> `SIGIO`), and terminate shared-V8 child executions directly for non-streamed signals so child polls still observe prompt exits.
- `child_process.poll` in `src/execution.rs` must reserve `Value::Null` for the real "no events pending" case. If a tracked child disappears after descendant-event drain, return a typed `ECHILD` execution error so guest poll loops stop rather than spin on a ghost child.
- Child stdin plumbing in `src/execution.rs` must mirror the root-process path for nested `child_process` children too: always call `child.execution.write_stdin()` / `close_stdin()` and the kernel pipe helpers together. Kernel-only writes or closes leave shared-V8 WASM children stuck behind the local `kernel_stdin` bridge, so pipelines like `echo hello | wc -c` never observe EOF.
- Child JavaScript executions use `service_javascript_sync_rpc(...)` in `src/execution.rs`, not the top-level `src/service.rs`, for `process.*` bridge calls. When changing guest self-signal behavior (`process.kill`, `process.abort()`, signal-shape exit reporting), mirror the top-level bookkeeping there too or spawned Node children will regress to `unsupported JavaScript sync RPC method process.kill` and report plain exit codes.
- Nested JavaScript child signal registration currently arrives through the sync-RPC method `process.signal_state`, not just `ActiveExecutionEvent::SignalState`. When fixing descendant `SIGCHLD` or job-control behavior in `src/execution.rs`, keep the nested `process.signal_state` bookkeeping aligned with the top-level `src/service.rs` handler or grandchildren will silently lose their registered signal handlers.
- The sidecar protocol `Authenticate` handshake must carry `secure_exec_bridge::bridge_contract().version`; `src/service.rs` should reject mismatches with `bridge_version_mismatch` before opening a connection so bridge-contract drift fails fast instead of crashing later on the first divergent RPC.
- `plugins/host_dir.rs` metadata writes must keep the old symlink-leaf safety contract for plain ops (`chmod`/`chown`/`utimes` still reject symlink leaves), while the richer timestamp path should only mutate symlink metadata when the caller explicitly requests nofollow semantics (`lutimes` / `utimes_spec(..., false)`).
- Mounted filesystem shutdown now happens explicitly during `src/vm.rs` disposal/reconfigure, not just in `MountTable::drop`. Bridge-backed mounts can therefore emit `SidecarRequestPayload::JsBridgeCall` during teardown, and host-visible flush failures should surface as the structured event `filesystem.mount.shutdown_failed` with the mount metadata plus the original error code/message.
- Guest runtime env setup in `src/execution.rs` must add writable `host_dir` / `module_access` mount roots to `AGENTOS_EXTRA_FS_WRITE_PATHS`, not just the VM shadow cwd. Without those extra write roots, guest `fs.*` bridge calls misclassify writable mounts as read-only (`EROFS`) and cross-mount rename tests never reach the intended `EXDEV` path.
- Python mapped host-path access in `src/filesystem.rs` must stay on the anchored-fd path: open the mapped root once, resolve descendants with `openat2(RESOLVE_BENEATH | RESOLVE_NO_MAGICLINKS)`, and perform the actual syscall through `/proc/self/fd/<fd>` or an anchored parent dir. Do not reintroduce resolve-then-use `PathBuf` opens, and when filtering `read_dir` results from a proc-fd directory, rebuild child host paths from the resolved directory host path plus `file_name()` instead of reusing `DirEntry::path()`, which points back into procfs.
- In `tests/builtin_conformance.rs`, isolated extra tests that open a host listener should finish sidecar/session/VM setup before starting any accept deadline. Those tests run in subprocesses that can queue behind the shared sidecar-runtime lock, so a server timeout that starts before VM creation will flake under the normal parallel `cargo test` harness.
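
The `net.poll` bounding rule in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's `clamp_javascript_net_poll_wait(...)`, whose real signature is not shown here; `NET_POLL_WAIT_CEILING` and `clamp_net_poll_wait_sketch` are hypothetical names.

```rust
use std::time::Duration;

/// The 50 ms ceiling described above. The constant name is hypothetical.
pub const NET_POLL_WAIT_CEILING: Duration = Duration::from_millis(50);

/// Clamp a guest-supplied `wait_ms` to the ceiling so that a poll on the
/// sync-RPC thread can never block longer than 50 ms.
pub fn clamp_net_poll_wait_sketch(wait_ms: u64) -> Duration {
    // `Duration` implements `Ord`, so `min` picks the shorter wait.
    Duration::from_millis(wait_ms).min(NET_POLL_WAIT_CEILING)
}
```

A guest asking for a 5-second wait gets back the socket state observed after at most 50 ms and is expected to poll again, which keeps dispose/shutdown and unrelated VM work from queuing behind it.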
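
The `kill_process` signal-parsing contract in the list above (full 1..31 table plus the `SIGIOT` and `SIGPOLL` aliases) might look roughly like the sketch below. It assumes Linux x86-64 signal numbering and assumes that numeric strings are accepted alongside names; `SIGNAL_NAMES` and `parse_signal_sketch` are hypothetical and do not reflect the actual code in `src/execution.rs`.

```rust
/// Signal names indexed by `number - 1`, using Linux x86-64 numbering
/// (an assumption; other platforms number some signals differently).
const SIGNAL_NAMES: [&str; 31] = [
    "SIGHUP", "SIGINT", "SIGQUIT", "SIGILL", "SIGTRAP", "SIGABRT", "SIGBUS",
    "SIGFPE", "SIGKILL", "SIGUSR1", "SIGSEGV", "SIGUSR2", "SIGPIPE",
    "SIGALRM", "SIGTERM", "SIGSTKFLT", "SIGCHLD", "SIGCONT", "SIGSTOP",
    "SIGTSTP", "SIGTTIN", "SIGTTOU", "SIGURG", "SIGXCPU", "SIGXFSZ",
    "SIGVTALRM", "SIGPROF", "SIGWINCH", "SIGIO", "SIGPWR", "SIGSYS",
];

/// Parse a signal given as a name (`"SIGTERM"`) or a number (`"15"`).
/// Returns `None` for anything outside the 1..=31 table.
pub fn parse_signal_sketch(input: &str) -> Option<i32> {
    let trimmed = input.trim();

    // Numeric form: accept only the 1..=31 range.
    if let Ok(number) = trimmed.parse::<i32>() {
        return (1..=31).contains(&number).then_some(number);
    }

    // Resolve the aliases named in the contract before the table lookup.
    let canonical = match trimmed {
        "SIGIOT" => "SIGABRT",
        "SIGPOLL" => "SIGIO",
        other => other,
    };

    SIGNAL_NAMES
        .iter()
        .position(|name| *name == canonical)
        .map(|index| index as i32 + 1)
}
```

With this table, `parse_signal_sketch("SIGIOT")` and `parse_signal_sketch("SIGABRT")` both resolve to 6, so the guest-side `child_process.kill(...)` bridge and the sidecar agree on one number per alias pair.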