varta-client
Agent API — emit VLP frames over a Unix Domain Socket. One connect call
allocates a socket; every subsequent beat call is zero-allocation and
non-blocking.
Quick start
use ;
API summary
Varta
| Method | Signature | Description |
|---|---|---|
| connect | (path: impl AsRef<Path>) -> io::Result<Varta> | Open a non-blocking UnixDatagram to the observer. The only allocation point. |
| beat | (&mut self, status: Status, payload: u64) -> BeatOutcome | Emit one 32-byte VLP frame. Never blocks; never allocates. |
| reconnect | (&mut self) -> io::Result<()> | Re-bind the socket to the observer path (e.g. after an observer restart). |
| set_reconnect_after | (&mut self, n: u32) | Enable auto-reconnect after n consecutive Dropped outcomes. |
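The bookkeeping behind set_reconnect_after can be pictured as a small counter. The sketch below is illustrative only: DropCounter and record are hypothetical names, and the choices that a threshold of 0 disables the feature and that the counter resets once a reconnect fires are assumptions, not documented behaviour of the crate.

```rust
/// Hypothetical stand-in for the client's drop bookkeeping; not the crate's real type.
#[derive(Debug, Default)]
pub struct DropCounter {
    /// Threshold `n`; `None` means auto-reconnect is disabled.
    reconnect_after: Option<u32>,
    /// Dropped outcomes seen since the last Sent outcome or the last reconnect.
    consecutive_dropped: u32,
}

impl DropCounter {
    /// Assumption: a threshold of 0 disables auto-reconnect.
    pub fn set_reconnect_after(&mut self, n: u32) {
        self.reconnect_after = if n == 0 { None } else { Some(n) };
    }

    /// Records one beat outcome and returns true when a reconnect is due.
    pub fn record(&mut self, dropped: bool) -> bool {
        if !dropped {
            // Any delivered frame breaks the streak.
            self.consecutive_dropped = 0;
            return false;
        }
        self.consecutive_dropped = self.consecutive_dropped.saturating_add(1);
        match self.reconnect_after {
            Some(n) if self.consecutive_dropped >= n => {
                // Assumption: the streak restarts after a reconnect attempt.
                self.consecutive_dropped = 0;
                true
            }
            _ => false,
        }
    }
}
```

With a threshold of 3, the third consecutive drop reports that a reconnect is due, and a single delivered frame in between resets the streak.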
BeatOutcome
| Variant | Meaning |
|---|---|
| Sent | Kernel accepted the datagram. |
| Dropped(DropReason) | Datagram not delivered; treat as no-op. The DropReason identifies the underlying cause (see table below). |
| Failed(BeatError) | Unexpected I/O error; the inner error does not allocate. |
DropReason
| Variant | Source errors | Interpretation |
|---|---|---|
| KernelQueueFull | WouldBlock, ENOBUFS | Transient burst; observer is likely alive. Retry or rely on set_reconnect_after. |
| NoObserver | NotFound, ConnectionRefused | Observer not yet bound; expected during rolling restarts. |
| PeerGone | ConnectionReset, NotConnected, BrokenPipe | Channel was live and disappeared (crash or shutdown). Call reconnect to recover. |
| StorageFull | StorageFull | Host filesystem full; operator intervention required. |
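The mapping in the DropReason table can be read as a classification over std::io::ErrorKind. The sketch below mirrors the first three rows with a locally defined enum; classify is a hypothetical helper, not part of the crate. ENOBUFS is left out because it has to be matched on the raw OS error number, and StorageFull is left out because the matching ErrorKind variant is only available on recent toolchains.

```rust
use std::io::ErrorKind;

/// Local mirror of three DropReason variants, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DropReason {
    KernelQueueFull,
    NoObserver,
    PeerGone,
}

/// Hypothetical helper: maps an error kind to a drop reason.
/// Returns None for kinds this sketch does not treat as a drop.
pub fn classify(kind: ErrorKind) -> Option<DropReason> {
    match kind {
        // Transient burst; the observer is likely alive.
        ErrorKind::WouldBlock => Some(DropReason::KernelQueueFull),
        // Observer socket not bound yet.
        ErrorKind::NotFound | ErrorKind::ConnectionRefused => Some(DropReason::NoObserver),
        // Channel was live and then disappeared.
        ErrorKind::ConnectionReset | ErrorKind::NotConnected | ErrorKind::BrokenPipe => {
            Some(DropReason::PeerGone)
        }
        _ => None,
    }
}
```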
Status
| Variant | Wire value | Meaning |
|---|---|---|
| Ok | 0 | Healthy and making progress. |
| Degraded | 1 | Making progress with elevated trouble. |
| Critical | 2 | About to die; also emitted by the panic hook. |
| Stall | 3 | Synthesised by varta-watch on silence; agents do not send this. |
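For readers decoding frames by hand, the wire values above can be expressed as a plain enum. This is a local mirror for illustration; the one-byte width (u8) and the helper names to_wire, from_wire, and agent_may_send are assumptions rather than the crate's actual definitions.

```rust
/// Local mirror of the Status wire values; the u8 width is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Status {
    Ok = 0,
    Degraded = 1,
    Critical = 2,
    Stall = 3,
}

impl Status {
    /// Returns the numeric wire value of this status.
    pub fn to_wire(self) -> u8 {
        self as u8
    }

    /// Decodes a wire value; unknown values yield None.
    pub fn from_wire(value: u8) -> Option<Status> {
        match value {
            0 => Some(Status::Ok),
            1 => Some(Status::Degraded),
            2 => Some(Status::Critical),
            3 => Some(Status::Stall),
            _ => None,
        }
    }

    /// Stall is synthesised by the observer, so agents should not send it.
    pub fn agent_may_send(self) -> bool {
        !matches!(self, Status::Stall)
    }
}
```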
Payload encoding
The 64-bit payload field is application-defined. A common convention is to
pack two u32 values:
// high 32 bits = queue depth, low 32 bits = last error code
let payload = << 32 | ;
The observer carries the payload opaquely; decoding belongs to the agent and any downstream tool that reads the exported metrics file.
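A minimal sketch of the two-u32 convention follows. The helper names pack_payload and unpack_payload are hypothetical; only the bit layout (queue depth in the high 32 bits, last error code in the low 32 bits) comes from the convention described above.

```rust
/// Packs a queue depth (high 32 bits) and an error code (low 32 bits) into one payload.
pub fn pack_payload(queue_depth: u32, last_error: u32) -> u64 {
    (u64::from(queue_depth) << 32) | u64::from(last_error)
}

/// Splits a payload back into (queue_depth, last_error).
pub fn unpack_payload(payload: u64) -> (u32, u32) {
    // The casts truncate on purpose: each half is exactly 32 bits wide.
    ((payload >> 32) as u32, payload as u32)
}
```

Widening with u64::from before shifting matters: shifting a u32 left by 32 would overflow rather than move the value into the high half.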
panic-handler feature flag
Enable the optional panic hook to emit a Status::Critical frame before
normal unwinding:
# Cargo.toml
[]
= "../varta-client"
= ["panic-handler"]
// Call once at process start, before any other setup.
install_panic_handler;
The hook chains the previously installed hook (preserving the default panic
message and any user hooks). The sole heap allocation is the Box created by
std::panic::set_hook at install time; the hook closure itself is stack-only.
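The chaining pattern described above can be sketched with the standard library alone. This is not the crate's implementation: install_chained_hook is a hypothetical name, the atomic counter stands in for emitting the Status::Critical frame, and running the stand-in before the previous hook is an assumed ordering.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for "a Critical frame was emitted"; the real hook would send a beat.
pub static CRITICAL_BEATS: AtomicUsize = AtomicUsize::new(0);

/// Installs a hook that records the panic and then defers to the prior hook.
pub fn install_chained_hook() {
    // Take ownership of whatever hook is installed (default or user-provided).
    let previous = panic::take_hook();
    // The Box is the only heap allocation, made once at install time.
    panic::set_hook(Box::new(move |info| {
        CRITICAL_BEATS.fetch_add(1, Ordering::SeqCst);
        // Chain so the default panic message and any user hooks still run.
        previous(info);
    }));
}
```

Because the previous hook is moved into the closure, the usual panic message still reaches stderr after the stand-in runs.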
Fork recovery & tracker semantics
Varta snapshots the calling process's PID at connect() time and compares
std::process::id() against the snapshot on every beat(). If they differ
— i.e. the process executing beat() is a forked child that inherited the
parent's Varta — the client transparently recovers:
- transport.reconnect() runs (re-binds the underlying socket; on secure-UDP, refreshes the IV salt from OS entropy so AEAD nonce uniqueness is preserved across the fork boundary).
- The per-connection counters (nonce, start, last_timestamp, consecutive_dropped) reset, because the child's frame stream is logically a new connection from the observer's perspective: every wire field is keyed by frame.pid, which is now the child's PID.
- The fork_recoveries counter increments. Surface it as varta_client_fork_recoveries_total via Varta::fork_recoveries() if you publish client-side telemetry.
Once recovered, the child's first beat goes into a fresh tracker slot on the observer (different PID → different slot), so the child's frames never race the parent's frames at the protocol level.
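The PID-snapshot check can be sketched as follows. ForkGuard, with_pid, check, and next_nonce are hypothetical names; the transport reconnect is reduced to a comment, only the nonce counter is modelled, and a nonce that restarts at 0 is an assumption.

```rust
/// Hypothetical model of the fork check; not the crate's real type.
#[derive(Debug)]
pub struct ForkGuard {
    pid_at_connect: u32,
    nonce: u64,
    fork_recoveries: u64,
}

impl ForkGuard {
    /// Snapshots the calling process's PID, as connect() is described to do.
    pub fn new() -> Self {
        Self::with_pid(std::process::id())
    }

    /// Same as new(), with an explicit PID so the logic can be exercised without forking.
    pub fn with_pid(pid: u32) -> Self {
        ForkGuard { pid_at_connect: pid, nonce: 0, fork_recoveries: 0 }
    }

    /// Run at the top of every beat; returns true if fork recovery happened.
    pub fn check(&mut self, current_pid: u32) -> bool {
        if current_pid == self.pid_at_connect {
            return false;
        }
        // A real client would reconnect its transport here.
        self.pid_at_connect = current_pid;
        self.nonce = 0; // the child starts a logically new connection
        self.fork_recoveries += 1;
        true
    }

    /// Hands out the next nonce for an outgoing frame.
    pub fn next_nonce(&mut self) -> u64 {
        let n = self.nonce;
        self.nonce += 1;
        n
    }

    /// Number of recoveries so far, suitable for exporting as a counter.
    pub fn fork_recoveries(&self) -> u64 {
        self.fork_recoveries
    }
}
```

In a real agent the argument to check would be std::process::id(); passing the PID explicitly only keeps the sketch testable in a single process.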
The parent-pid stall window
The auto-recovery handles the child. The parent is harder: if the
parent process forks and then exits (a classic daemonise pattern), its PID
disappears from the kernel but the observer's tracker slot for that PID
keeps aging. Once --threshold-ms elapses without a beat, the slot stalls; if recovery is configured
for kernel-attested origins, the observer may fire a recovery command for
a PID that no longer exists.
The fix is on the agent side, not the observer side. Two patterns work:
- Preferred: emit a terminal frame before the parent exits. Send one last Status::Critical beat from the parent immediately before exit(0). The observer records the critical frame and treats subsequent silence as expected. The panic hook does this for free with nonce == NONCE_TERMINAL; for clean exits, hand-roll the call:

      let _ = agent.beat; // "I am leaving"
      exit;

- Alternative: widen the threshold. If the parent reliably exits within a few hundred milliseconds of fork, set --threshold-ms on the observer above that window so the parent's slot is collected (per EVICTION_MULTIPLIER × threshold_ns) before recovery would fire.
The child's slot is never affected by this concern: it has a different PID, its own slot, and its own monotonically resetting nonce stream. There is no within-PID nonce collision because the IV salt + counter rotate on secure-UDP and the plaintext transports do not key on continuity.
Constraints
- Zero production dependencies. [dependencies] is empty apart from the path dep on varta-vlp; no registry crate is pulled in.
- Zero steady-state allocation. After Varta::connect, beat() does not touch the heap. Verified by a guard-allocator test in varta-tests.
- Non-blocking. The socket is set to non-blocking mode at connect() time; WouldBlock is treated as Dropped(DropReason::KernelQueueFull), so the caller never stalls.
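The non-blocking constraint can be observed with the standard library alone. The sketch below is a hedged illustration, not the crate's code: fill_until_refused is a hypothetical helper that writes 32-byte datagrams into a socket pair nobody reads, and returns as soon as the kernel refuses one instead of stalling. On Linux the refusal is expected to be WouldBlock; other Unix platforms may report a different error such as ENOBUFS.

```rust
use std::io;
use std::os::unix::net::UnixDatagram;

/// Sends 32-byte datagrams into an unread socket pair until the kernel refuses one.
/// Returns how many sends were accepted, together with the first error.
pub fn fill_until_refused(max_sends: usize) -> io::Result<(usize, io::Error)> {
    // Keep the receiving end alive but never read from it.
    let (tx, _rx) = UnixDatagram::pair()?;
    // Without this call, send() would block once the queue is full.
    tx.set_nonblocking(true)?;
    let frame = [0u8; 32];
    for sent in 0..max_sends {
        if let Err(e) = tx.send(&frame) {
            // A client following the table above would report this as a drop.
            return Ok((sent, e));
        }
    }
    Err(io::Error::new(io::ErrorKind::Other, "queue never filled"))
}
```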
See also
- Protocol crate: crates/varta-vlp/README.md
- Examples: crates/varta-client/examples/
- Architecture: book/src/architecture/vlp-frame.md
Other languages
Official clients in non-Rust languages live under
clients/. Today: Python
(clients/python/, pip install varta).
Each port preserves the same wire-level contract and is verified
against the same tools/vlp-test-vectors.json conformance suite as
this crate.