Available now (v0.9):
- Token-bucket throttling: smooth refill with burst headroom; lock-free accounting (one atomic compare-and-swap per acquire)
- Exact sliding-window-log: an exact alternative for when you cannot tolerate any boundary burst; it composes everywhere the bucket does
- Wait, don't reject: the outbound default is acquire().await, which paces the caller; try_acquire() is there when you need the non-blocking answer
- Cost-aware acquisition: acquire_with_cost(n), because not every request weighs one unit
- Multi-dimensional limits: enforce req/min AND input-tokens/min AND output-tokens/min at once; the killer feature for LLM APIs
- Composition: hybrid (must pass all), per-key (independent state per tenant), and layered (global / per-key / per-endpoint) limiters, combined without the call site changing
- Bounded memory: per-key state is sharded and evicted (idle TTL + hard cap), so a flood of unique keys hits a ceiling instead of growing without limit
- Retry + backoff: constant / linear / exponential backoff with full, equal, or decorrelated jitter; a retry policy with per-error classification; Retry-After parsed and honored
- Circuit breaker: closed / open / half-open recovery; wraps any limiter and fails fast when open, without consuming it
- Queueing: a bounded, deadline-aware priority queue with fair-across-keys scheduling and reject / drop-oldest / drop-lowest-priority overflow
- Adaptive concurrency: AIMD and Vegas-style controllers that discover the right in-flight limit from outcome feedback, slowing down when a downstream struggles even with no explicit signal, bounded by a floor and a hard ceiling
- Provider-aware: parse x-ratelimit-* / retry-after headers from OpenAI, Anthropic, GitHub, Stripe, AWS, or the RFC draft; reconcile your limiter with the server's view; start from LLM tier presets
- Observability: metrics (metrics crate) and tracing events around every acquire and state transition, feature-gated and zero-cost when off
- Runtime-agnostic: the waiting surface runs on either tokio or smol; the async code is the same, you pick the timer backend by feature (async-std is unsupported because it is discontinued, RUSTSEC-2025-0052)
- no_std core: with std off, the pure algorithm types (Backoff, Jitter, Decision) compile without the standard library
- Hardened: fuzzed parsers, a loom model check of the lock-free slot accounting, property tests for every limiter invariant, and comparative benchmarks against governor
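To make the "one atomic compare-and-swap per acquire" idea concrete, here is a minimal sketch of a lock-free, cost-aware limiter. It is not this crate's implementation: it uses a GCRA-style formulation (equivalent in effect to a token bucket) because the whole state then fits in one u64. The type AtomicBucket, its method names, and the caller-supplied nanosecond clock are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical lock-free limiter (not a type from this crate).
/// The only mutable state is the "theoretical arrival time" (TAT) in nanoseconds.
pub struct AtomicBucket {
    tat_nanos: AtomicU64,
    /// Nanoseconds needed to refill one unit.
    interval_nanos: u64,
    /// How far the TAT may run ahead of `now`: burst * interval.
    tolerance_nanos: u64,
}

impl AtomicBucket {
    /// `rate_per_sec` units refill per second; `burst` units of headroom.
    pub fn new(rate_per_sec: u64, burst: u64) -> Self {
        assert!(rate_per_sec > 0 && rate_per_sec <= 1_000_000_000 && burst > 0);
        let interval_nanos = 1_000_000_000 / rate_per_sec;
        Self {
            tat_nanos: AtomicU64::new(0),
            interval_nanos,
            tolerance_nanos: interval_nanos.saturating_mul(burst),
        }
    }

    /// Try to take `cost` units at time `now_nanos`.
    /// Ok(()) on success; Err(nanos) is how long until this cost would fit.
    /// A cost larger than the burst can never succeed.
    pub fn try_acquire_with_cost(&self, now_nanos: u64, cost: u64) -> Result<(), u64> {
        let increment = self.interval_nanos.saturating_mul(cost);
        let limit = now_nanos.saturating_add(self.tolerance_nanos);
        let mut tat = self.tat_nanos.load(Ordering::Acquire);
        loop {
            // Idle time is forgiven: the TAT never lags behind `now`.
            let new_tat = tat.max(now_nanos).saturating_add(increment);
            if new_tat > limit {
                return Err(new_tat - limit);
            }
            // Uncontended, this succeeds first time: one CAS per acquire.
            match self.tat_nanos.compare_exchange_weak(
                tat,
                new_tat,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Ok(()),
                Err(observed) => tat = observed, // lost a race or spurious failure; retry
            }
        }
    }
}
```

A waiting acquire can be layered on top by sleeping for the returned duration and retrying, which is the "wait, don't reject" behaviour in sketch form.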
On the roadmap:
- 1.0 — first-consumer integration and final benchmarks, then the stable release. The public API is frozen as of v0.8.
Installation
[dependencies]
throttle-net = "0.9"
# Optional features:
throttle-net = { version = "0.9", features = ["circuit-breaker", "adaptive", "provider-llm", "metrics", "tracing"] }
# Run the waiting surface on smol instead of tokio:
throttle-net = { version = "0.9", default-features = false, features = ["smol"] }
# no_std algorithm core only (Backoff, Jitter, Decision):
throttle-net = { version = "0.9", default-features = false }
Quick start
Pace your outbound calls so you never overwhelm a downstream:
use Throttle;
async
Budget an LLM provider across several limits at once — requests, input tokens, and output tokens:
use Duration;
use ;
async
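The key property of a multi-dimensional limit is that a request is charged against every dimension or against none. The following is a minimal sketch of that all-or-nothing check, not this crate's API: MultiBudget, Dimension, and the fixed-window reset are hypothetical simplifications, and the sketch takes &mut self rather than being lock-free.

```rust
/// One limited dimension, e.g. requests/min or input-tokens/min (hypothetical type).
#[derive(Debug)]
pub struct Dimension {
    pub name: &'static str,
    pub limit: u64,
    pub used: u64,
}

/// A set of dimensions that must all admit a request (hypothetical type).
pub struct MultiBudget {
    dims: Vec<Dimension>,
}

impl MultiBudget {
    pub fn new(limits: &[(&'static str, u64)]) -> Self {
        let dims = limits
            .iter()
            .map(|&(name, limit)| Dimension { name, limit, used: 0 })
            .collect();
        Self { dims }
    }

    /// `costs[i]` is charged against dimension `i`.
    /// On failure, returns the first exhausted dimension and charges nothing.
    pub fn try_acquire(&mut self, costs: &[u64]) -> Result<(), &'static str> {
        assert_eq!(costs.len(), self.dims.len(), "one cost per dimension");
        // Pass 1: check every dimension before touching any state.
        for (dim, &cost) in self.dims.iter().zip(costs) {
            if dim.used.saturating_add(cost) > dim.limit {
                return Err(dim.name);
            }
        }
        // Pass 2: all dimensions fit, so charge them together.
        for (dim, &cost) in self.dims.iter_mut().zip(costs) {
            dim.used += cost;
        }
        Ok(())
    }

    /// Start a new window (a real limiter would refill continuously instead).
    pub fn reset_window(&mut self) {
        for dim in &mut self.dims {
            dim.used = 0;
        }
    }

    pub fn remaining(&self, index: usize) -> u64 {
        self.dims[index].limit - self.dims[index].used
    }
}
```

Checking before charging matters: if the request count were charged first and the token budget then refused, the caller would lose a request slot for a call that was never sent.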
Throttle independently per tenant, with bounded memory:
use PerKey;
async
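Bounded per-key state comes down to two eviction rules: drop keys that have been idle past a TTL, and refuse to grow past a hard cap. The sketch below illustrates both under stated assumptions; it is not this crate's PerKey type. PerKeyLimiter is hypothetical, uses a single unsharded HashMap, a simple per-key counter, and a caller-supplied clock in arbitrary time units.

```rust
use std::collections::HashMap;

struct Entry {
    used: u32,
    last_seen: u64,
}

/// Hypothetical per-key limiter with an idle TTL and a hard cap on live keys.
pub struct PerKeyLimiter {
    per_key_limit: u32,
    idle_ttl: u64,
    max_keys: usize,
    entries: HashMap<String, Entry>,
}

impl PerKeyLimiter {
    pub fn new(per_key_limit: u32, idle_ttl: u64, max_keys: usize) -> Self {
        assert!(max_keys > 0);
        Self { per_key_limit, idle_ttl, max_keys, entries: HashMap::new() }
    }

    /// Each key has its own independent budget of `per_key_limit` acquires.
    pub fn try_acquire(&mut self, key: &str, now: u64) -> bool {
        if !self.entries.contains_key(key) {
            self.make_room(now);
            self.entries.insert(key.to_string(), Entry { used: 0, last_seen: now });
        }
        let entry = self.entries.get_mut(key).expect("inserted above");
        entry.last_seen = now;
        if entry.used < self.per_key_limit {
            entry.used += 1;
            true
        } else {
            false
        }
    }

    /// Evict idle keys first; if still at the cap, evict the least recently seen key.
    fn make_room(&mut self, now: u64) {
        let ttl = self.idle_ttl;
        self.entries.retain(|_, e| now.saturating_sub(e.last_seen) < ttl);
        if self.entries.len() >= self.max_keys {
            // O(n) scan: acceptable for a sketch, too slow for a hot path.
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, e)| e.last_seen)
                .map(|(k, _)| k.clone());
            if let Some(oldest) = oldest {
                self.entries.remove(&oldest);
            }
        }
    }

    pub fn live_keys(&self) -> usize {
        self.entries.len()
    }
}
```

The trade-off is that an evicted key comes back with a fresh budget, so the cap should be sized well above the number of genuinely active tenants.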
Stack scopes — an overall ceiling, a per-tenant share, and a per-endpoint cap:
use ;
async
Retry a flaky call with jittered backoff, honoring a server Retry-After:
use Duration;
use ;
async
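For readers unfamiliar with the terms, exponential backoff with full jitter picks a uniformly random delay between zero and a capped exponential ceiling, and a server-supplied Retry-After overrides it. The sketch below shows that arithmetic only; it does not use this crate's Backoff or Jitter types. All function names are hypothetical, the small xorshift generator stands in for a real random source, and only the delta-seconds form of Retry-After is parsed.

```rust
use std::time::Duration;

/// Tiny xorshift64 generator so the sketch needs no external crate.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        Self(seed.max(1)) // state must be non-zero
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Exponential ceiling: base * 2^attempt, capped at `cap`.
pub fn exponential_ceiling(base: Duration, cap: Duration, attempt: u32) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}

/// Full jitter: a delay drawn from [0, ceiling].
/// The modulo introduces a slight bias, which is fine for a sketch.
pub fn full_jitter(ceiling: Duration, rng: &mut XorShift64) -> Duration {
    let max_nanos = ceiling.as_nanos().min(u64::MAX as u128) as u64;
    if max_nanos == 0 {
        return Duration::ZERO;
    }
    Duration::from_nanos(rng.next_u64() % max_nanos.saturating_add(1))
}

/// Parse the delta-seconds form of Retry-After (e.g. "120").
/// The HTTP-date form is not handled here and yields None.
pub fn parse_retry_after_secs(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// The server's hint wins when present; otherwise use jittered backoff.
pub fn next_delay(
    base: Duration,
    cap: Duration,
    attempt: u32,
    retry_after: Option<&str>,
    rng: &mut XorShift64,
) -> Duration {
    match retry_after.and_then(parse_retry_after_secs) {
        Some(hint) => hint,
        None => full_jitter(exponential_ceiling(base, cap, attempt), rng),
    }
}
```

Full jitter spreads retries across the whole interval, which avoids many clients retrying in lockstep after a shared outage.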
Wrap a flaky downstream in a circuit breaker (needs the circuit-breaker feature):
use Duration;
use ;
async
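The closed / open / half-open cycle is easiest to follow as a small state machine. This is a minimal sketch of that cycle, not this crate's circuit breaker: Breaker, its consecutive-failure threshold, the single-probe half-open policy, and the caller-supplied clock are all assumptions chosen to keep the example short.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical breaker: opens after N consecutive failures,
/// then admits one probe once the cool-down has elapsed.
pub struct Breaker {
    state: State,
    consecutive_failures: u32,
    failure_threshold: u32,
    open_for: u64,
    opened_at: u64,
}

impl Breaker {
    pub fn new(failure_threshold: u32, open_for: u64) -> Self {
        Self {
            state: State::Closed,
            consecutive_failures: 0,
            failure_threshold,
            open_for,
            opened_at: 0,
        }
    }

    /// Should this call be attempted at time `now`?
    pub fn allow(&mut self, now: u64) -> bool {
        match self.state {
            State::Closed => true,
            // A probe is already in flight; hold everyone else back.
            State::HalfOpen => false,
            State::Open => {
                if now.saturating_sub(self.opened_at) >= self.open_for {
                    self.state = State::HalfOpen;
                    true // this caller is the probe
                } else {
                    false // fail fast
                }
            }
        }
    }

    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = State::Closed;
    }

    pub fn record_failure(&mut self, now: u64) {
        self.consecutive_failures += 1;
        // A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state == State::HalfOpen || self.consecutive_failures >= self.failure_threshold {
            self.state = State::Open;
            self.opened_at = now;
        }
    }

    pub fn state(&self) -> State {
        self.state
    }
}
```

Calling allow before the limiter's acquire is what lets a breaker fail fast without consuming any of the limiter's budget while it is open.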
Stay in sync with a provider's own rate-limit headers, and start from a tier preset (needs the provider-llm feature):
use presets;
use HeaderProfile;
async
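Reconciling with a provider means reading what the server says is left and never assuming more than that locally. The sketch below shows the idea with GitHub-style header names (x-ratelimit-limit, x-ratelimit-remaining) plus retry-after in seconds; other providers use different names, for example per-resource suffixes. ServerView, parse_headers, and reconcile are hypothetical and are not this crate's HeaderProfile API.

```rust
/// What the server reported about our budget (hypothetical type).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ServerView {
    pub limit: Option<u64>,
    pub remaining: Option<u64>,
    pub retry_after_secs: Option<u64>,
}

/// Extract rate-limit fields from (name, value) header pairs.
/// Header names are matched case-insensitively; unparsable values become None.
pub fn parse_headers<'a, I>(headers: I) -> ServerView
where
    I: IntoIterator<Item = (&'a str, &'a str)>,
{
    let mut view = ServerView::default();
    for (name, value) in headers {
        let parsed = value.trim().parse::<u64>().ok();
        match name.to_ascii_lowercase().as_str() {
            "x-ratelimit-limit" => view.limit = parsed,
            "x-ratelimit-remaining" => view.remaining = parsed,
            "retry-after" => view.retry_after_secs = parsed,
            _ => {} // unrelated header
        }
    }
    view
}

/// Take the more conservative of the local and server-side remaining counts.
pub fn reconcile(local_remaining: u64, view: &ServerView) -> u64 {
    match view.remaining {
        Some(server_remaining) => local_remaining.min(server_remaining),
        None => local_remaining,
    }
}
```

Taking the minimum means the local limiter can only become stricter from a response, which is the safe direction when other clients share the same API key.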
Full runnable examples live in examples/.
Documentation
- API reference: every public item, with parameters and multiple examples each
- Cookbook: task-oriented recipes for common problems
- Migrating from governor: API mapping and before/after
- CHANGELOG
Performance
Local criterion means (cargo bench --bench throttle_bench, Windows x86_64, Rust stable):
- Single-throttle try_acquire (uncontended): ~27 ns (one atomic compare-and-swap)
- Per-key lookup, 10 000 live keys: ~70 ns (hash, shard read lock, map get, acquire)
Where It Fits
throttle-net is the outbound resilience layer. It is used by:
- rate-net: the inbound counterpart; throttle-net is outbound
- pack-io / network-protocol: clients that call rate-limited downstreams
- AVA / agent-provider: LLM API budgeting with multi-dimensional token limits
- Hive DB: cluster RPC backpressure and downstream protection
It stays foreign-compatible: the obvious default for "I need to call an external API in Rust and not get banned."
Contributing
Before opening a PR, cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean. The runtime matrix must also build and test on smol (cargo test --no-default-features --features smol) and the no_std core must build (cargo build --no-default-features). Hot-path changes require a criterion benchmark; correctness-critical paths require property and/or loom tests.