Token-bucket rate budget sitting between crate::HttpClient and
the network.
§Why a CLI-side bucket exists at all
The engine has its own rate limiter; it will return 429 if the operator hammers an endpoint. A CLI-side bucket is not a redundancy — it is a courtesy layer that:
- Refuses visibly, not silently. An operator who types /status twenty times in five seconds must read a refusal line naming the budget, not watch the prompt freeze for a retry loop. The freeze is the classic mystery stall that erodes trust in a CLI; a typed rate: exhausted — retry in Ns is fixable.
- Protects the engine’s own budget from blocking other operators. The engine’s bucket is per-operator; our CLI getting rate-limited by a local heuristic preserves headroom for Auto-mode + Telegram paths that operate without typing operators hammering them.
- Is an anchor for the status bar segment. rate:N/M paints the current bucket fill in the always-visible tier; this module is the source of truth the widget reads.
§Determinism under test
The bucket’s time source is a Clock trait — production uses
SystemClock (thin wrapper over std::time::Instant), tests use
ManualClock and advance explicitly. Every budget assertion in
the test suite is wall-clock-free and therefore flake-free.
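The clock split described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the text only confirms that the trait exposes now(), so the advance method, the constructor, and the internal Mutex field are assumptions made for the example.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Time source the bucket reads; only `now()` is on the trait surface.
pub trait Clock: Send + Sync {
    fn now(&self) -> Instant;
}

/// Production clock: a thin wrapper over `Instant::now()`.
pub struct SystemClock;

impl Clock for SystemClock {
    fn now(&self) -> Instant {
        Instant::now()
    }
}

/// Test clock: time moves only when the test says so.
/// The field layout is an assumption for this sketch.
pub struct ManualClock {
    now: Mutex<Instant>,
}

impl ManualClock {
    /// Hypothetical constructor: starts at the current instant.
    pub fn new() -> Self {
        Self { now: Mutex::new(Instant::now()) }
    }

    /// Hypothetical helper: move the clock forward by `by`.
    pub fn advance(&self, by: Duration) {
        let mut now = self.now.lock().unwrap();
        *now += by;
    }
}

impl Clock for ManualClock {
    fn now(&self) -> Instant {
        *self.now.lock().unwrap()
    }
}
```

A test built on this shape never sleeps: it reads now(), calls advance with the interval it wants to simulate, and asserts on the bucket afterwards.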
§Thread safety
RateBudget is Arc<Inner> where Inner holds a parking_lot::Mutex<State>. The critical section is a handful of float math
ops; parking_lot’s uncontended lock is ~25 ns, well under the
network RTT the bucket guards. A sharded / atomic design would
buy nothing and cost legibility.
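The shared-state layout might look something like the sketch below. To keep the example dependency-free it swaps in std::sync::Mutex for parking_lot::Mutex; the field names and the try_take and tokens methods are hypothetical, and refill is left out entirely.

```rust
use std::sync::{Arc, Mutex};

/// Mutable bucket state guarded by the lock (fields are assumed).
struct State {
    tokens: f64,
}

struct Inner {
    // The real module uses parking_lot::Mutex; std's Mutex is a stand-in here.
    state: Mutex<State>,
}

/// Cloning bumps the `Arc`; every clone shares the same `State`.
#[derive(Clone)]
pub struct RateBudget {
    inner: Arc<Inner>,
}

impl RateBudget {
    pub fn new(capacity: f64) -> Self {
        let state = Mutex::new(State { tokens: capacity });
        Self { inner: Arc::new(Inner { state }) }
    }

    /// Hypothetical helper: take `cost` tokens if available (no refill).
    pub fn try_take(&self, cost: f64) -> bool {
        // The critical section is one comparison and one subtraction.
        let mut state = self.inner.state.lock().unwrap();
        if state.tokens >= cost {
            state.tokens -= cost;
            true
        } else {
            false
        }
    }

    /// Hypothetical accessor: current token count.
    pub fn tokens(&self) -> f64 {
        self.inner.state.lock().unwrap().tokens
    }
}
```

Because the lock is held only for a few arithmetic operations, clones handed to several threads contend briefly at most, which is the trade-off the paragraph above argues for.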
§What the bucket is not
- Not a global limiter. Multiple operators, multiple CLIs, even multiple HttpClients in the same process each hold their own bucket. Cross-process coordination would require IPC that buys less than it costs at M2 scale.
- Not persistent. A CLI restart gets a fresh full bucket. An operator who exhausted their budget and then restarted the CLI could bypass the local refusal, but they would still be subject to the engine’s own limiter, so the net effect is a 5-10 second delay + the engine’s 429 path running in lieu of ours. The complexity of persisting to ~/.zero/state/rate.json is not worth that narrow hole.
Structs§
- BudgetSnapshot
- Read-only view of the bucket state. Returned by RateBudget::snapshot for the doctor row and the status-bar widget; neither should hold the internal mutex.
- Exhausted
- What try_consume returns when the bucket cannot satisfy the requested cost. retry_after is how long (rounded up) until enough tokens will have accrued to complete the call.
- ManualClock
- Manual clock for tests. Wrap it in an Arc<ManualClock>; a test that wants to advance the clock holds a handle next to the RateBudget it feeds into.
- RateBudget
- A cloneable, thread-safe token bucket. Cheap to clone (bumps an Arc); the inner state is shared, which is the whole point.
- SystemClock
- Wall-clock implementation, the one production always uses.
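The relationship between try_consume and Exhausted::retry_after can be illustrated with a single-threaded sketch. It is an assumption-heavy simplification: the Bucket type, its fields, and passing now as a parameter (instead of reading a Clock) are all made up for the example, and it assumes a positive refill rate and a cost no larger than the capacity.

```rust
use std::time::{Duration, Instant};

/// Refusal value; `retry_after` is rounded up to whole seconds.
#[derive(Debug, PartialEq)]
pub struct Exhausted {
    pub retry_after: Duration,
}

/// Hypothetical single-threaded bucket used only for this sketch.
pub struct Bucket {
    capacity: f64,
    refill_per_second: f64,
    tokens: f64,
    last_refill: Instant,
}

impl Bucket {
    /// Starts full, matching the "fresh full bucket" behaviour.
    pub fn new(capacity: f64, refill_per_second: f64, now: Instant) -> Self {
        Self { capacity, refill_per_second, tokens: capacity, last_refill: now }
    }

    /// Credit tokens for the time elapsed since the last refill, capped at capacity.
    fn refill(&mut self, now: Instant) {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_second).min(self.capacity);
        self.last_refill = now;
    }

    /// Spend `cost` tokens, or report how long until they will be available.
    pub fn try_consume(&mut self, cost: f64, now: Instant) -> Result<(), Exhausted> {
        self.refill(now);
        if self.tokens >= cost {
            self.tokens -= cost;
            return Ok(());
        }
        // Time for the missing tokens to accrue, rounded up.
        let deficit = cost - self.tokens;
        let wait_secs = (deficit / self.refill_per_second).ceil();
        Err(Exhausted { retry_after: Duration::from_secs(wait_secs as u64) })
    }
}
```

Rounding up means a caller that waits exactly retry_after should find enough tokens on the next attempt, at the price of occasionally waiting slightly longer than strictly necessary.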
Constants§
- DEFAULT_CAPACITY
- Default bucket capacity, i.e. the burst size. 60 tokens lets a busy operator run ~20 /v2/status renders (cost 3 each) in a tight burst without hitting the floor, which covers the observed peak-typing pattern (rapid /status + /risk + /positions walk at session open).
- DEFAULT_REFILL_PER_SECOND
- Default refill rate: 1 token per second. 60 per minute sustained matches the engine’s per-operator 429 floor observed in the existing Python surface (see engine/zero/auth.py’s _RATE_LIMIT constant, ~60 reqs/min). Staying under it means the CLI-side bucket trips before the engine’s ever does, guaranteeing the operator sees a typed refusal rather than a blanket 429.
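The arithmetic behind these two defaults can be checked with a few lines. The numeric values come from the descriptions above; the f64 types, the STATUS_COST name, and the helper functions are assumptions introduced only for this worked example.

```rust
/// Values as documented above; the `f64` type is an assumption.
pub const DEFAULT_CAPACITY: f64 = 60.0;
pub const DEFAULT_REFILL_PER_SECOND: f64 = 1.0;

/// Cost of one /v2/status render as stated above (hypothetical constant name).
pub const STATUS_COST: f64 = 3.0;

/// How many calls of a given cost fit in one full bucket.
pub fn burst_calls(capacity: f64, cost: f64) -> u32 {
    (capacity / cost).floor() as u32
}

/// Calls per minute the refill rate can sustain once the burst is spent.
pub fn sustained_calls_per_minute(refill_per_second: f64, cost: f64) -> f64 {
    refill_per_second * 60.0 / cost
}

/// Seconds for an empty bucket to return to full.
pub fn seconds_to_full(capacity: f64, refill_per_second: f64) -> f64 {
    capacity / refill_per_second
}
```

With the documented values, a full bucket covers 20 status renders, an empty one needs 60 seconds to refill, and the sustained rate for cost-3 calls works out to 20 per minute.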
Traits§
- Clock
- Wall-clock abstraction so the bucket can be exercised in tests without sleeping. Only now() is on the trait surface; every internal math op is pure and does not need further mocking.
Functions§
- cost_of
- Per-endpoint cost table. The costs themselves live here as const so tests, docs, the status bar widget, and the budget itself all read from a single source; otherwise a change to the cost of /v2/status in the client and a stale cost in the doctor row is a silent drift waiting to happen.
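A single-source cost table of this kind could be shaped like the sketch below. Only the cost of 3 for /v2/status is taken from the text; the signature, the constant names, and the fallback cost of 1 are assumptions for illustration.

```rust
/// Cost of /v2/status, as stated in the DEFAULT_CAPACITY description.
pub const COST_STATUS: u32 = 3;

/// Assumed fallback cost for endpoints not listed in the table.
pub const COST_DEFAULT: u32 = 1;

/// Hypothetical signature: look up the token cost of an endpoint path.
pub fn cost_of(path: &str) -> u32 {
    match path {
        "/v2/status" => COST_STATUS,
        // Any endpoint without an explicit entry uses the assumed default.
        _ => COST_DEFAULT,
    }
}
```

Keeping the numbers as constants next to the lookup means the client, the doctor row, and the tests can all reference the same values instead of repeating literals.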