# cellos-telemetry
The in-guest CellOS telemetry agent. Runs as PID 2 inside every
Firecracker microVM, declares process/network/capability events to the
host supervisor over vsock, and holds no signing key.
## What it is
`cellos-telemetry` is the runner-evidence wedge of ADR-0006: the
process the host trusts to *describe* what is happening inside a guest,
because the channel it speaks on (a vsock CID:port the supervisor
bound before the workload existed) is the authenticity primitive — not
a signature on the payload.
Layer L2 (host runtime / isolation) — guest-side. Forked by
[`cellos-init`](../cellos-init/README.md) as PID 2 *before* the
workload process (PID 3+) starts. The workload's seccomp profile
blocks `kill(2)`, `tgkill(2)`, and `ptrace(2)` against PIDs ≤ 2; the
agent is structurally unreachable from the workload.
What it isn't:
- **Not the host-side receiver.** `cellos-host-telemetry` (separate
crate) is the host-resident CBOR-over-vsock listener that host-stamps
arriving frames with `cell_id`, `run_id`, `host_received_at`, and
the ADG `output` block. This crate produces only the guest fields.
- **Not a signer.** ADR-0006 §5, Claim 5a: the guest agent never holds
a key. The crate's `Cargo.toml` carries a hard-coded DENY list (`ring`,
`ed25519-dalek`, `hmac`, `rustls`, `webpki`, `sha2-as-mac`) enforced by
`cargo-deny` in CI, and its only dependencies are `libc` plus value
types from `cellos-core`. The reasoning: a compromised guest could sign
anything, so a signature produced in the guest is theatre. The host
trusts the channel, not the payload.
- **Not a kernel tracer.** Probes are `/proc` deltas, one declared
`inotify` watch, and (stubbed) connect/capability surfaces — no
eBPF, no kprobes, no kernel modules.
## Public API surface
### Crate-level constants
| Constant | Value | Meaning |
|---|---|---|
| `WIRE_CONTENT_VERSION_MAJOR` | `1` | CBOR wire-format major. Must match `cellos_host_telemetry::WIRE_CONTENT_VERSION_MAJOR` or the host rejects the frame. |
| `VSOCK_TELEMETRY_PORT` | `9001` | Well-known guest→host vsock port. |
| `VMADDR_CID_HOST` | `2` | Host CID per the AF_VSOCK ABI. |
| `MAX_FRAME_BODY_BYTES` | `4096` | Per-frame body cap. |
### Probe identifiers (`probes::*`)
- `process.spawned`, `process.exited` (`/proc` delta walker — implemented)
- `capability.denied` (stub — kernel surface not yet wired)
- `fs.inotify_fired` (declared `inotify` watch)
- `net.connect_attempted` (stub)
`probes::ALL` enumerates every known probe id; `probes::is_known(s)`
gates declaration acceptance.
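As a rough illustration of how such a gate can work, the sketch below models the probe-id surface with plain string constants. It is not the crate's actual source: the constant names and the `first_unknown` helper are hypothetical, and only the five probe-id strings, `ALL`, and `is_known` come from the list above.

```rust
// Hypothetical sketch of the probe-id surface; the real definitions live in
// the crate's `probes` module and may be shaped differently.
pub const PROCESS_SPAWNED: &str = "process.spawned";
pub const PROCESS_EXITED: &str = "process.exited";
pub const CAPABILITY_DENIED: &str = "capability.denied";
pub const FS_INOTIFY_FIRED: &str = "fs.inotify_fired";
pub const NET_CONNECT_ATTEMPTED: &str = "net.connect_attempted";

/// Every known probe id, in one place.
pub const ALL: &[&str] = &[
    PROCESS_SPAWNED,
    PROCESS_EXITED,
    CAPABILITY_DENIED,
    FS_INOTIFY_FIRED,
    NET_CONNECT_ATTEMPTED,
];

/// True if `id` is one of the known probe ids.
pub fn is_known(id: &str) -> bool {
    ALL.contains(&id)
}

/// Hypothetical helper: the first declared id that is not known, if any.
/// A declaration would be rejected when this returns `Some`.
pub fn first_unknown<'a>(declared: &[&'a str]) -> Option<&'a str> {
    declared.iter().copied().find(|id| !is_known(id))
}
```

Keeping the ids in a single slice means the acceptance check and any enumeration of the surface cannot drift apart.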
### Wire types
- `GuestTelemetryDeclaration` — the agent's declared probe surface,
projected from `cellos_core::DeclaredAuthoritySurface` so the host
can subset-check `declared ⊆ authorized` at admission (F3
admission-path prep, 2026-05-16).
- `ProbeEvent` — a single emitted event (guest fields only:
`probe_source`, `guest_pid`, `guest_comm`, `guest_monotonic_ns`,
`content_version`).
- `WireError` — decode failure reasons.
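The `declared ⊆ authorized` check mentioned for `GuestTelemetryDeclaration` can be pictured as a set difference. The sketch below is an assumption-laden illustration: `Declaration` is a hypothetical stand-in that carries only probe ids, and `admit` is not a function of this crate or of the host.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for `GuestTelemetryDeclaration`, reduced to the
/// set of probe ids the agent declares.
pub struct Declaration {
    pub probes: BTreeSet<String>,
}

/// Admit the declaration only if every declared probe is authorized.
/// On failure, return the declared-but-unauthorized ids in sorted order.
pub fn admit(
    declared: &Declaration,
    authorized: &BTreeSet<String>,
) -> Result<(), Vec<String>> {
    let extra: Vec<String> = declared
        .probes
        .difference(authorized)
        .cloned()
        .collect();
    if extra.is_empty() {
        Ok(())
    } else {
        Err(extra)
    }
}
```

Returning the offending ids, rather than a bare boolean, lets an admission path report exactly which probes exceeded the authorized surface.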
### CBOR codec (`src/lib.rs`)
- `encode_event_body` / `decode_event_body` — bare CBOR map(5) body.
- `encode_frame` / `decode_frame` — `u32 LE length || body` framing.
The codec is hand-rolled and minimal: definite-length `map(5)`, uint
major (0), text major (3). No floats, no tags, no indefinite lengths.
`content_version` is always emitted first so the host can short-circuit
unknown majors before walking unknown probe-source strings.
### Probe modules
- `probes::process` — `/proc` delta walker for spawn/exit.
- `probes::inotify` — one declared `inotify` watch.
- `probes::capability` — capability-denied stub.
- `probes::net_connect` — connect-attempted stub.
The CBOR + framing core is pure-safe Rust (`#![deny(unsafe_code)]`).
Syscall surfaces under `probes/` opt out per-module — `libc::inotify_init1`,
`socket`, `connect`, `fork` have no safe wrapper at this layer.
## Architecture
The wire shape from `src/lib.rs`:
```text
{
  "content_version"    => wire-format version (always emitted first)
  "probe_source"       => text
  "guest_pid"          => u32
  "guest_comm"         => text
  "guest_monotonic_ns" => u64
}
```
The agent fills only those five fields. The supervisor host-stamps
`cell_id`, `run_id`, `host_received_at`, `spec_signature_hash`, and
the ADG `output` block on receipt; anything the agent puts in those
fields is overwritten. That asymmetry is intentional — it is what
makes the channel the authenticity primitive.
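The overwrite rule can be sketched as follows. This is an illustration only, not the receiver in `cellos-host-telemetry`: events are modelled as string maps, the ADG `output` block is omitted, and `HOST_OWNED` and `host_stamp` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Host-owned fields named in this README (the ADG `output` block is
/// omitted from this sketch).
pub const HOST_OWNED: &[&str] = &[
    "cell_id",
    "run_id",
    "host_received_at",
    "spec_signature_hash",
];

/// Hypothetical stamping step: discard whatever the guest supplied for a
/// host-owned field, then insert the host's own value if it has one.
pub fn host_stamp(
    mut event: BTreeMap<String, String>,
    stamps: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    for field in HOST_OWNED {
        event.remove(*field);
        if let Some(value) = stamps.get(*field) {
            event.insert((*field).to_string(), value.clone());
        }
    }
    event
}
```

Because guest-supplied values for host-owned fields are removed unconditionally, a forged `cell_id` never survives, even when the host has no replacement value for that field.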
**Back-pressure (ADR-0006 §5.3): drop-with-counter.** The agent
surfaces drops via the `cell.observability.guest.telemetry.dropped`
counter, never by blocking the workload. The workload's progress is
never coupled to the agent's I/O.
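One way to picture drop-with-counter is a bounded queue with a non-blocking send. The sketch below uses `std::sync::mpsc::sync_channel`, and `DroppingSender` is a hypothetical type. How the agent actually queues frames and exports the counter is not described in this README.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical bounded sender that drops frames instead of blocking.
pub struct DroppingSender {
    tx: SyncSender<Vec<u8>>,
    dropped: AtomicU64,
}

impl DroppingSender {
    /// Create a sender with room for `capacity` queued frames, plus the
    /// receiving end that a writer thread would drain.
    pub fn new(capacity: usize) -> (Self, Receiver<Vec<u8>>) {
        let (tx, rx) = sync_channel(capacity);
        let sender = Self {
            tx,
            dropped: AtomicU64::new(0),
        };
        (sender, rx)
    }

    /// Never blocks. Returns `true` if the frame was queued; otherwise the
    /// frame is discarded and the drop counter is incremented.
    pub fn offer(&self, frame: Vec<u8>) -> bool {
        match self.tx.try_send(frame) {
            Ok(()) => true,
            Err(_) => {
                // Queue full or receiver gone: count the drop and move on.
                self.dropped.fetch_add(1, Ordering::Relaxed);
                false
            }
        }
    }

    /// Number of frames dropped so far.
    pub fn dropped(&self) -> u64 {
        self.dropped.load(Ordering::Relaxed)
    }
}
```

The producer side pays at most one failed `try_send` per event, so a stalled consumer shows up as a rising counter rather than as latency in the observed workload.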
## Configuration
No env vars and no config file. The agent is parameterised entirely
by what `cellos-init` and the supervisor put on the kernel cmdline and
by the bound vsock channel. `VSOCK_TELEMETRY_PORT` is a compile-time
constant; the host binds the matching `(CID, port)` before the workload
runs.
## Examples
The agent is not invoked directly; it is forked by `cellos-init`. From
inside a supervisor integration test, the host-side counterpart looks
like:
```rust
// Pseudo-code: the actual host receiver lives in cellos-host-telemetry.
// `vsock_listener` and `cell_cid` are placeholders, not real APIs.
use cellos_telemetry::{decode_frame, MAX_FRAME_BODY_BYTES, VSOCK_TELEMETRY_PORT};

// One frame is a 4-byte u32 LE length prefix plus a body of at most
// MAX_FRAME_BODY_BYTES bytes.
let mut buf = [0u8; 4 + MAX_FRAME_BODY_BYTES];
let n = vsock_listener.accept(VSOCK_TELEMETRY_PORT, cell_cid, &mut buf)?;
let event = decode_frame(&buf[..n])?;
// host-stamp cell_id, run_id, host_received_at, ADG output …
```
## Testing
Unit tests live inline. Public-surface coverage focuses on the codec
round-trip and the probe-id surface:
```sh
cargo test -p cellos-telemetry
```
Integration tests for the in-guest agent live alongside
`cellos-init` and `cellos-host-firecracker` — the agent is exercised
end-to-end through a real (or mocked) vsock channel.
The "no signing primitive" claim is enforced at the workspace level by
`cargo-deny` (`deny.toml`): the `cellos-telemetry` musl target dep set
is asserted to be `libc + cellos-core value types only`. Session 19
ratified this; see ADR-0006 Claim 5a / Claim 7.
## Related crates
- [`cellos-init`](../cellos-init/README.md) — PID-1 init that forks
this agent as PID 2 before the workload starts.
- [`cellos-core`](../cellos-core/README.md) — `DeclaredAuthoritySurface`
+ value types projected into `GuestTelemetryDeclaration`.
- `cellos-host-telemetry` — the host-side receiver. It host-stamps
arriving frames and emits the typed CloudEvents downstream.
- [`cellos-host-firecracker`](../cellos-host-firecracker/) — binds the
per-cell vsock CID and the `VSOCK_TELEMETRY_PORT` channel before the
workload runs.
## ADRs
- [ADR-0006](../../docs/adr/0006-in-vm-observability-runner-evidence.md) —
in-VM observability as the runner-evidence wedge. This crate is the
concrete implementation. Key claims this crate enforces:
- **Claim 5** — no signing key in the guest; supervisor signs
outbound CloudEvents using its host-side key after host-stamping.
- **Claim 5a** — `cellos-telemetry` Cargo manifest forbids any
signing primitive; the CI `cargo-deny` gate enforces this.
- **§5.2** — agent runs as PID 2, forked by `cellos-init` before
the workload (PID 3+).
- **§5.3** — drop-with-counter back-pressure.
- **§12** — wire-schema versioning (`content_version` first).
- [ADR-0001](../../docs/adr/0001-rust-nats-jetstream-proprietary-host.md) —
the proprietary host backend decision that puts the guest-observation
surface inside the cell.