Available now (v0.4):
- Token-bucket throttling — smooth refill with burst headroom; lock-free accounting (one atomic compare-and-swap per acquire)
- Exact sliding-window-log — when you need no boundary burst at all, an exact alternative that composes everywhere the bucket does
- Wait, don't reject — the outbound default is
acquire().await, which paces the caller;try_acquire()is there when you need the non-blocking answer - Cost-aware acquisition —
acquire_with_cost(n)— not every request weighs one unit - Multi-dimensional limits — enforce req/min AND input-tokens/min AND output-tokens/min at once; the killer feature for LLM APIs
- Composition — hybrid (must pass all), per-key (independent state per tenant), and layered (global / per-key / per-endpoint) limiters, combined without the call site changing
- Bounded memory — per-key state is sharded and evicted (idle TTL + hard cap), so a flood of unique keys hits a ceiling instead of growing without limit
- Retry + backoff — constant / linear / exponential backoff with full, equal, or decorrelated jitter; a retry policy with per-error classification;
Retry-Afterparsed and honored - Circuit breaker — closed / open / half-open recovery; wraps any limiter and fails fast when open, without consuming it
- Queueing — a bounded, deadline-aware, priority queue with fair-across-keys scheduling and reject / drop-oldest / drop-lowest-priority overflow
On the roadmap:
- Adaptive throttling (v0.5) — AIMD and latency-based controllers that slow down when a downstream struggles, with no explicit signal
- Provider-aware (v0.6) — parse
x-ratelimit-*/retry-afterheaders and sync internal state - Runtime-agnostic (v0.8) — tokio today, with async-std and smol planned
Installation
[]
= "0.4"
# Optional features:
= { = "0.4", = ["circuit-breaker"] }
Quick start
Pace your outbound calls so you never overwhelm a downstream:
use Throttle;
async
Budget an LLM provider across several limits at once — requests, input tokens, and output tokens:
use Duration;
use ;
async
Throttle independently per tenant, with bounded memory:
use PerKey;
async
Stack scopes — an overall ceiling, a per-tenant share, and a per-endpoint cap:
use ;
async
Retry a flaky call with jittered backoff, honoring a server Retry-After:
use Duration;
use ;
async
Wrap a flaky downstream in a circuit breaker (needs the circuit-breaker feature):
use Duration;
use ;
async
Full runnable examples live in examples/:
Performance
Local criterion means (cargo bench --bench throttle_bench, Windows x86_64, Rust stable):
- Single-throttle
try_acquire(uncontended): ~27 ns — one atomic compare-and-swap - Per-key lookup, 10 000 live keys: ~70 ns — hash, shard read lock, map get, acquire
Where It Fits
throttle-net is the outbound resilience layer. It is used by:
rate-net— the inbound counterpart; throttle-net is outboundpack-io/network-protocol— clients that call rate-limited downstreams- AVA / agent-provider — LLM API budgeting with multi-dimensional token limits
- Hive DB — cluster RPC backpressure and downstream protection
It stays foreign-compatible: the obvious default for "I need to call an external API in Rust and not get banned."
Contributing
Before opening a PR, cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean. Hot-path changes require a criterion benchmark; correctness-critical paths require property and/or loom tests.