Module rate_budget

Token-bucket rate budget sitting between crate::HttpClient and the network.

§Why a CLI-side bucket exists at all

The engine has its own rate limiter; it will return 429 if the operator hammers an endpoint. A CLI-side bucket is not a redundancy — it is a courtesy layer that:

  1. Refuses visibly, not silently. An operator who types /status twenty times in five seconds should see a refusal line naming the budget, not watch the prompt freeze inside a retry loop. That freeze is the classic mystery stall that erodes trust in a CLI; a typed rate: exhausted — retry in Ns tells the operator exactly what happened and how long to wait.
  2. Keeps the CLI from draining the engine-side budget. The engine's bucket is per-operator, so throttling the CLI with a local heuristic preserves headroom for the Auto-mode and Telegram paths, which run without an operator typing at them.
  3. Is an anchor for the status bar segment. rate:N/M paints the current bucket fill in the always-visible tier; this module is the source of truth the widget reads.
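The two operator-facing strings mentioned in the list above could be rendered roughly as follows. This is a minimal sketch, not the module's actual code: the `Exhausted` struct here is a stand-in with only the documented `retry_after` field, and `refusal_line` and `status_segment` are hypothetical helper names. Rounding the wait up to whole seconds is also an assumption.

```rust
use std::time::Duration;

/// Stand-in for the module's `Exhausted` type; only `retry_after` is documented.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Exhausted {
    pub retry_after: Duration,
}

/// Hypothetical helper: renders the refusal line shown to the operator.
/// Any fractional second is rounded up so the hint never understates the wait.
pub fn refusal_line(e: &Exhausted) -> String {
    let secs = e.retry_after.as_secs() + u64::from(e.retry_after.subsec_nanos() > 0);
    format!("rate: exhausted — retry in {secs}s")
}

/// Hypothetical helper: renders the status-bar segment as rate:N/M,
/// where N is the tokens currently available and M is the capacity.
pub fn status_segment(available: u32, capacity: u32) -> String {
    format!("rate:{available}/{capacity}")
}
```

Keeping both strings as pure functions of the budget state means the widget and the refusal path cannot disagree about what the bucket holds.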

§Determinism under test

The bucket’s time source is a Clock trait — production uses SystemClock (thin wrapper over std::time::Instant), tests use ManualClock and advance explicitly. Every budget assertion in the test suite is wall-clock-free and therefore flake-free.
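One way the clock abstraction described above might look is sketched below. The trait and type names come from this page, but the bodies are assumptions: in particular the `advance` method name, the `Send + Sync` bound, and the use of `std::sync::Mutex` inside `ManualClock` are illustrative choices, not the crate's confirmed implementation.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Only `now()` is on the trait surface, as the docs state.
pub trait Clock: Send + Sync {
    fn now(&self) -> Instant;
}

/// Production clock: a thin wrapper over `std::time::Instant`.
pub struct SystemClock;

impl Clock for SystemClock {
    fn now(&self) -> Instant {
        Instant::now()
    }
}

/// Test clock: time moves only when the test says so.
pub struct ManualClock {
    now: Mutex<Instant>,
}

impl ManualClock {
    /// Returns an `Arc` so the test and the budget can share one handle.
    pub fn new() -> Arc<Self> {
        Arc::new(Self { now: Mutex::new(Instant::now()) })
    }

    /// Hypothetical method name: moves the clock forward by `by`.
    pub fn advance(&self, by: Duration) {
        *self.now.lock().unwrap() += by;
    }
}

impl Clock for ManualClock {
    fn now(&self) -> Instant {
        *self.now.lock().unwrap()
    }
}
```

A test built on this never sleeps: it advances the clock by exactly the interval it wants to assert on, so the result does not depend on scheduler timing.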

§Thread safety

RateBudget is Arc<Inner> where Inner holds a parking_lot::Mutex<State>. The critical section is a handful of float math ops; parking_lot’s uncontended lock is ~25 ns, well under the network RTT the bucket guards. A sharded / atomic design would buy nothing and cost legibility.
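The shape described above can be sketched as follows. To stay dependency-free this sketch substitutes `std::sync::Mutex` for `parking_lot::Mutex`, and it takes the current instant as an argument instead of reading a `Clock`; the field names and the `try_consume_at` method are assumptions made for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::time::Instant;

/// Mutable part of the bucket, guarded by the mutex.
struct State {
    tokens: f64,
    last_refill: Instant,
}

/// Immutable configuration plus the guarded state.
struct Inner {
    capacity: f64,
    refill_per_second: f64,
    state: Mutex<State>,
}

/// Cloning bumps the `Arc`; every clone sees the same bucket.
#[derive(Clone)]
pub struct RateBudget(Arc<Inner>);

impl RateBudget {
    pub fn new(capacity: f64, refill_per_second: f64, now: Instant) -> Self {
        let state = Mutex::new(State { tokens: capacity, last_refill: now });
        RateBudget(Arc::new(Inner { capacity, refill_per_second, state }))
    }

    /// Hypothetical variant of `try_consume` that takes `now` explicitly.
    /// The critical section is only the float math below.
    pub fn try_consume_at(&self, now: Instant, cost: f64) -> bool {
        let inner = &*self.0;
        let mut s = inner.state.lock().unwrap();
        // Refill for the time elapsed since the last call, capped at capacity.
        let elapsed = now.saturating_duration_since(s.last_refill).as_secs_f64();
        s.tokens = (s.tokens + elapsed * inner.refill_per_second).min(inner.capacity);
        s.last_refill = s.last_refill.max(now);
        if s.tokens >= cost {
            s.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

Because the lock is held only for a few arithmetic operations, contention stays negligible next to the network round trip each successful call precedes.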

§What the bucket is not

  • Not a global limiter. Multiple operators, multiple CLIs, even multiple HttpClients in the same process each hold their own bucket. Cross-process coordination would require IPC that buys less than it costs at M2 scale.
  • Not persistent. A CLI restart gets a fresh full bucket. An operator who exhausted their budget and then restarted the CLI could bypass the local refusal, but they would still be subject to the engine’s own limiter, so the net effect is a 5-10 second delay plus the engine’s 429 path running in place of ours. Persisting to ~/.zero/state/rate.json is not worth the complexity just to close that narrow hole.

Structs§

BudgetSnapshot
Read-only view of the bucket state. Returned by RateBudget::snapshot for the doctor row and the status-bar widget — neither should hold the internal mutex.
Exhausted
What try_consume returns when the bucket cannot satisfy the requested cost. retry_after is how long (rounded up) until enough tokens will have accrued to complete the call.
ManualClock
Manual clock for tests. Wrap it in an Arc (Arc<ManualClock>); a test that wants to advance the clock holds one handle alongside the RateBudget it feeds.
RateBudget
A cloneable, thread-safe token bucket. Cheap to clone (bumps an Arc); the inner state is shared, which is the whole point.
SystemClock
Wall-clock implementation — the one production always uses.
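As an illustration of the retry_after value carried by Exhausted, the wait can be derived from the token deficit and the refill rate. The function below is a sketch under assumptions: its name and signature are hypothetical, it assumes a positive refill rate, and rounding up to whole seconds is a guess at the granularity (the docs only say the value is rounded up).

```rust
use std::time::Duration;

/// Hypothetical helper: time until `cost` tokens are available,
/// given `tokens` currently in the bucket. Assumes `refill_per_second > 0`.
pub fn retry_after(tokens: f64, cost: f64, refill_per_second: f64) -> Duration {
    // No wait is needed when the bucket already covers the cost.
    let deficit = (cost - tokens).max(0.0);
    // Round up so a caller that waits exactly this long will succeed.
    let secs = (deficit / refill_per_second).ceil();
    Duration::from_secs(secs as u64)
}
```

Rounding up rather than to nearest matters: a hint that understates the wait would send the operator straight into a second refusal.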

Constants§

DEFAULT_CAPACITY
Default bucket capacity, i.e. the burst size. 60 tokens lets a busy operator run up to 20 /v2/status renders (cost 3 each) in a tight burst before the bucket runs dry, which covers the observed peak-typing pattern (rapid /status + /risk + /positions walk at session open).
DEFAULT_REFILL_PER_SECOND
Default refill rate: 1 token per second, i.e. 60 tokens per minute sustained. That matches the engine’s per-operator 429 floor observed in the existing Python surface (see engine/zero/auth.py’s _RATE_LIMIT constant, ~60 reqs/min). Staying at or under that rate means the CLI-side bucket should trip before the engine’s does, so the operator sees a typed refusal rather than a blanket 429.
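The arithmetic behind these two defaults can be checked with a few lines. The constant values (60 tokens, 1 token per second, cost 3 for /v2/status) come from the descriptions above; the constant types, the `STATUS_COST` name, and the helper functions are assumptions for illustration.

```rust
pub const DEFAULT_CAPACITY: u32 = 60;
pub const DEFAULT_REFILL_PER_SECOND: f64 = 1.0;
/// Cost of one /v2/status render, as quoted in the capacity description.
pub const STATUS_COST: u32 = 3;

/// How many calls of a given cost fit in one full bucket.
pub fn burst_size(capacity: u32, cost: u32) -> u32 {
    capacity / cost
}

/// Sustained calls per minute once the initial burst is spent.
pub fn sustained_per_minute(refill_per_second: f64, cost: u32) -> f64 {
    refill_per_second * 60.0 / f64::from(cost)
}

/// Seconds for an empty bucket to refill completely.
pub fn seconds_to_full(capacity: u32, refill_per_second: f64) -> f64 {
    f64::from(capacity) / refill_per_second
}
```

With the defaults, a full bucket covers 20 status renders, an emptied bucket needs 60 seconds to refill, and the sustained rate for status renders alone is 20 per minute.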

Traits§

Clock
Wall-clock abstraction so the bucket can be exercised in tests without sleeping. Only now() is on the trait surface; every internal math op is pure and does not need further mocking.

Functions§

cost_of
Per-endpoint cost table. The costs themselves live here as const so that tests, docs, the status bar widget, and the budget itself all read from a single source. Without that, changing the cost of /v2/status in the client while the doctor row keeps a stale copy is a silent drift waiting to happen.
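A single-source cost table of this kind might be shaped as below. Only the cost of 3 for /v2/status is taken from this page; the signature, the constant names, and the fallback cost of 1 for unlisted paths are hypothetical.

```rust
/// Cost of one /v2/status call (the value quoted under DEFAULT_CAPACITY).
pub const COST_STATUS: u32 = 3;
/// Hypothetical fallback for endpoints without an explicit entry.
pub const COST_DEFAULT: u32 = 1;

/// Sketch of a per-endpoint lookup. Every consumer (budget, widget,
/// doctor row, tests) calls this instead of hard-coding a number.
pub fn cost_of(path: &str) -> u32 {
    match path {
        "/v2/status" => COST_STATUS,
        _ => COST_DEFAULT,
    }
}
```

Because every reader goes through the same function, changing a cost is a one-line edit that the tests and the status bar pick up automatically.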