# varta-client
[](https://crates.io/crates/varta-client)
← [Workspace root](../../README.md)
Agent API — emit VLP frames over a Unix Domain Socket. One `connect` call
allocates a socket; every subsequent `beat` call is zero-allocation and
non-blocking.
## Quick start
```rust,no_run
use varta_client::{BeatOutcome, Status, Varta};
fn main() -> std::io::Result<()> {
let mut agent = Varta::connect("/tmp/varta.sock")?;
loop {
match agent.beat(Status::Ok, 0) {
BeatOutcome::Sent => {}
BeatOutcome::Dropped(_) => { /* observer absent or queue full — safe to ignore */ }
BeatOutcome::Failed(e) => eprintln!("beat error: {e}"),
}
std::thread::sleep(std::time::Duration::from_millis(500));
}
}
```
## API summary
### `Varta`
| `connect` | `(path: impl AsRef<Path>) -> io::Result<Varta>` | Open a non-blocking `UnixDatagram` to the observer. The only allocation point. |
| `beat` | `(&mut self, status: Status, payload: u64) -> BeatOutcome` | Emit one 32-byte VLP frame. Never blocks; never allocates. |
| `reconnect` | `(&mut self) -> io::Result<()>` | Re-bind the socket to the observer path (e.g. after an observer restart). |
| `set_reconnect_after` | `(&mut self, n: u32)` | Enable auto-reconnect after `n` consecutive `Dropped` outcomes. |
### `BeatOutcome`
| `Sent` | Kernel accepted the datagram. |
| `Dropped(DropReason)` | Datagram not delivered — treat as no-op. The `DropReason` identifies the underlying cause (see table below). |
| `Failed(BeatError)` | Unexpected I/O error; the inner error does not allocate. |
### `DropReason`
| `KernelQueueFull` | `WouldBlock`, `ENOBUFS` | Transient burst; observer is likely alive. Retry or rely on `set_reconnect_after`. |
| `NoObserver` | `NotFound`, `ConnectionRefused` | Observer not yet bound — expected during rolling restarts. |
| `PeerGone` | `ConnectionReset`, `NotConnected`, `BrokenPipe` | Channel was live and disappeared (crash or shutdown). Call `reconnect` to recover. |
| `StorageFull` | `StorageFull` | Host filesystem full; operator intervention required. |
### `Status`
| `Ok` | `0` | Healthy and making progress. |
| `Degraded` | `1` | Making progress with elevated trouble. |
| `Critical` | `2` | About to die; also emitted by the panic hook. |
| `Stall` | `3` | Synthesised by `varta-watch` on silence; agents do not send this. |
## Payload encoding
The 64-bit `payload` field is application-defined. A common convention is to
pack two `u32` values:
```rust,no_run
// high 32 bits = queue depth, low 32 bits = last error code
let payload = (queue_depth as u64) << 32 | (last_error_code as u64);
```
The observer carries the payload opaquely; decoding belongs to the agent and
any downstream tool that reads the exported metrics file.
## `panic-handler` feature flag
Enable the optional panic hook to emit a `Status::Critical` frame before
normal unwinding:
```toml
# Cargo.toml
[dependencies.varta-client]
path = "../varta-client"
features = ["panic-handler"]
```
```rust,no_run
// Call once at process start, before any other setup.
varta_client::install_panic_handler("/tmp/varta.sock");
```
The hook chains the previously installed hook (preserving the default panic
message and any user hooks). The sole heap allocation is the `Box` created by
`std::panic::set_hook` at install time; the hook closure itself is stack-only.
## Fork recovery & tracker semantics
`Varta` snapshots the calling process's PID at `connect()` time and compares
`std::process::id()` against the snapshot on every `beat()`. If they differ
— i.e. the process executing `beat()` is a forked child that inherited the
parent's `Varta` — the client transparently recovers:
1. `transport.reconnect()` runs (re-binds the underlying socket; on
secure-UDP, refreshes the IV salt from OS entropy so AEAD nonce uniqueness
is preserved across the fork boundary).
2. The per-connection counters (`nonce`, `start`, `last_timestamp`,
`consecutive_dropped`) reset, because the child's frame stream is
logically a new connection from the observer's perspective — every wire
field is keyed by `frame.pid`, which is now the child's PID.
3. The `fork_recoveries` counter increments. Surface it as
`varta_client_fork_recoveries_total` via `Varta::fork_recoveries()` if
you publish client-side telemetry.
Once recovered, the child's first beat goes into a fresh tracker slot on
the observer (different PID → different slot), so the child's frames never
race the parent's frames at the protocol level.
### The parent-pid stall window
The auto-recovery handles the *child*. The *parent* is harder: if the
parent process forks and then exits (a classic daemonise pattern), its PID
disappears from the kernel but the observer's tracker slot for that PID
keeps aging. After `--threshold-ms` it stalls; if recovery is configured
for kernel-attested origins, the observer may fire a recovery command for
a PID that no longer exists.
The fix is on the agent side, not the observer side. Two patterns work:
* **Preferred — emit a terminal frame before the parent exits.** Send one
last `Status::Critical` beat from the parent immediately before
`exit(0)`. The observer records the critical frame and treats subsequent
silence as expected. The panic hook does this for free with
`nonce == NONCE_TERMINAL`; for clean exits, hand-roll the call:
```rust,ignore
let _ = agent.beat(Status::Critical, 0); // "I am leaving"
std::process::exit(0);
```
* **Alternative — widen the threshold.** If the parent reliably exits
within a few hundred milliseconds of fork, set `--threshold-ms` on the
observer above that window so the parent's slot is collected (per
`EVICTION_MULTIPLIER × threshold_ns`) before recovery would fire.
The child's slot is never affected by this concern: it has a different PID,
its own slot, and its own monotonically resetting nonce stream. There is no
within-PID nonce collision because the IV salt + counter rotate on
secure-UDP and the plaintext transports do not key on continuity.
## Constraints
- **Zero production dependencies.** `[dependencies]` is empty (plus the
path dep on `varta-vlp`); no registry crate is pulled in.
- **Zero steady-state allocation.** After `Varta::connect`, `beat()` does not
touch the heap. Verified by a guard-allocator test in `varta-tests`.
- **Non-blocking.** The socket is set to non-blocking mode at `connect()` time; `WouldBlock` is treated as `Dropped(DropReason::KernelQueueFull)` — the caller never stalls.
## See also
- Protocol crate: [`crates/varta-vlp/README.md`](../varta-vlp/README.md)
- Examples: [`crates/varta-client/examples/`](examples/)
- Architecture: [`book/src/architecture/vlp-frame.md`](../../book/src/architecture/vlp-frame.md)
## Other languages
Official clients in non-Rust languages live under
[`clients/`](../../clients/). Today: Python
([`clients/python/`](../../clients/python/), `pip install varta`).
Each port preserves the same wire-level contract and is verified
against the same `tools/vlp-test-vectors.json` conformance suite as
this crate.