throttle-net 0.4.0

Available now (v0.4):

Token-bucket throttling: smooth refill with burst headroom; lock-free accounting (one atomic compare-and-swap per acquire)
Exact sliding-window log: an exact alternative for when no boundary burst is acceptable; composes everywhere the bucket does
Wait, don't reject: the outbound default is acquire().await, which paces the caller; try_acquire() gives the non-blocking answer when you need it
Cost-aware acquisition: acquire_with_cost(n), because not every request weighs one unit
Multi-dimensional limits: enforce req/min AND input-tokens/min AND output-tokens/min at once; the killer feature for LLM APIs
Composition: hybrid (must pass all), per-key (independent state per tenant), and layered (global / per-key / per-endpoint) limiters, combined without the call site changing
Bounded memory: per-key state is sharded and evicted (idle TTL + hard cap), so a flood of unique keys hits a ceiling instead of growing without limit
Retry + backoff: constant / linear / exponential backoff with full, equal, or decorrelated jitter; a retry policy with per-error classification; Retry-After parsed and honored
Circuit breaker: closed / open / half-open recovery; wraps any limiter and fails fast when open, without consuming the wrapped limiter's capacity
Queueing: a bounded, deadline-aware priority queue with fair-across-keys scheduling and reject / drop-oldest / drop-lowest-priority overflow policies
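
To make "no boundary burst" concrete, here is a minimal single-threaded sketch of a sliding-window log using only the standard library. It is not throttle-net's implementation: the SlidingLog type, its try_acquire_at method, and the explicit millisecond timestamps are assumptions made to keep the example deterministic.

```rust
use std::collections::VecDeque;

/// Hypothetical exact limiter: at most `limit` admissions in any trailing window.
pub struct SlidingLog {
    window_ms: u64,
    limit: usize,
    /// Timestamps of admitted calls, oldest first.
    log: VecDeque<u64>,
}

impl SlidingLog {
    pub fn new(limit: usize, window_ms: u64) -> Self {
        Self { window_ms, limit, log: VecDeque::new() }
    }

    /// Try to admit one call at `now_ms` (milliseconds on any monotonic clock).
    pub fn try_acquire_at(&mut self, now_ms: u64) -> bool {
        // Drop entries that have aged out of the trailing window.
        while let Some(&oldest) = self.log.front() {
            if now_ms.saturating_sub(oldest) >= self.window_ms {
                self.log.pop_front();
            } else {
                break;
            }
        }
        // Admit only if the window still has room, and record the admission.
        if self.log.len() < self.limit {
            self.log.push_back(now_ms);
            true
        } else {
            false
        }
    }
}
```

With a limit of 2 per 1000 ms and calls admitted at t=900 and t=950, a call at t=1000 is refused because both earlier calls are still inside the trailing window; a fixed window that reset at t=1000 would have admitted it. The cost is memory proportional to the limit, since every admitted timestamp is kept until it ages out.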

On the roadmap:

Adaptive throttling (v0.5): AIMD and latency-based controllers that slow down when a downstream struggles, without needing an explicit signal from it
Provider-aware (v0.6): parse x-ratelimit-* / retry-after headers and sync internal state to them
Runtime-agnostic (v0.8): tokio today, with async-std and smol planned
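
For readers unfamiliar with AIMD (additive increase, multiplicative decrease), the idea behind the planned adaptive controller can be sketched in a few lines. This is only an illustration of the general technique, not a preview of the v0.5 API: the Aimd type, its parameters, and the on_success / on_overload methods are all hypothetical, and what counts as "overload" (a timeout, an error, latency above a threshold) is left to the caller.

```rust
/// Hypothetical AIMD controller for a target request rate.
pub struct Aimd {
    rate: f64,
    min: f64,
    max: f64,
    /// Added to the rate after each success (additive increase).
    step: f64,
    /// Multiplied into the rate after each overload (multiplicative decrease), e.g. 0.5.
    factor: f64,
}

impl Aimd {
    pub fn new(initial: f64, min: f64, max: f64, step: f64, factor: f64) -> Self {
        Self { rate: initial, min, max, step, factor }
    }

    /// Probe upward slowly while the downstream looks healthy.
    pub fn on_success(&mut self) -> f64 {
        self.rate = (self.rate + self.step).min(self.max);
        self.rate
    }

    /// Back off sharply when the downstream looks overloaded.
    pub fn on_overload(&mut self) -> f64 {
        self.rate = (self.rate * self.factor).max(self.min);
        self.rate
    }

    pub fn rate(&self) -> f64 {
        self.rate
    }
}
```

The asymmetry is the point: recovery is linear and cautious, while retreat is geometric, so a struggling downstream sees load drop quickly.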

Installation

[dependencies]
throttle-net = "0.4"

# Or, with optional features (use this instead of the line above):
throttle-net = { version = "0.4", features = ["circuit-breaker"] }

Quick start

Pace your outbound calls so you never overwhelm a downstream:

use throttle_net::Throttle;

#[tokio::main]
async fn main() -> Result<(), throttle_net::ThrottleError> {
    // 100 requests per second, bursting up to 100.
    let throttle = Throttle::per_second(100);

    throttle.acquire().await?; // returns as soon as a token is free
    // ... call the downstream ...
    Ok(())
}
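
For intuition about "100 per second, bursting up to 100" and about accounting with a single compare-and-swap, here is a standalone sketch of a lock-free limiter. It uses the GCRA ("virtual scheduling") formulation, which is equivalent to a token bucket; whether throttle-net uses this formulation internally is not stated here, so treat AtomicBucket, try_acquire_at, and wait_hint as hypothetical names. Time is passed in explicitly as nanoseconds, and the rate is assumed to be greater than zero.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical lock-free limiter in GCRA form (equivalent to a token bucket).
pub struct AtomicBucket {
    /// "Theoretical arrival time": when the bucket would be full again, in ns.
    tat: AtomicU64,
    /// Nanoseconds needed to refill one token.
    interval: u64,
    /// How far `tat` may run ahead of now: burst * interval.
    tolerance: u64,
}

impl AtomicBucket {
    pub fn new(rate_per_sec: u64, burst: u64) -> Self {
        let interval = 1_000_000_000 / rate_per_sec;
        Self { tat: AtomicU64::new(0), interval, tolerance: interval * burst }
    }

    /// Try to take `cost` tokens at `now` (ns on any monotonic clock).
    pub fn try_acquire_at(&self, now: u64, cost: u64) -> bool {
        let mut tat = self.tat.load(Ordering::Acquire);
        loop {
            let new_tat = tat.max(now) + self.interval * cost;
            if new_tat - now > self.tolerance {
                return false; // would exceed the burst allowance
            }
            // One CAS when uncontended; retry only if another thread raced us.
            match self.tat.compare_exchange_weak(tat, new_tat, Ordering::AcqRel, Ordering::Acquire) {
                Ok(_) => return true,
                Err(actual) => tat = actual,
            }
        }
    }

    /// Nanoseconds a caller at `now` would have to wait before `cost` tokens fit.
    pub fn wait_hint(&self, now: u64, cost: u64) -> u64 {
        let tat = self.tat.load(Ordering::Acquire);
        (tat.max(now) + self.interval * cost).saturating_sub(now + self.tolerance)
    }
}
```

A pacing acquire, as opposed to a rejecting one, could be built on top of this by sleeping for the hinted duration and trying again.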

Budget an LLM provider across several limits at once — requests, input tokens, and output tokens:

use std::time::Duration;
use throttle_net::{MultiLimiter, Throttle};

#[tokio::main]
async fn main() -> Result<(), throttle_net::ThrottleError> {
    let minute = Duration::from_secs(60);
    let limiter = MultiLimiter::builder()
        .dimension("requests", Throttle::per_duration(60, minute))
        .dimension("input_tokens", Throttle::per_duration(100_000, minute))
        .dimension("output_tokens", Throttle::per_duration(20_000, minute))
        .build();

    // Admitted only when every budget can afford this call.
    limiter
        .acquire_costs(&[("requests", 1), ("input_tokens", 1500), ("output_tokens", 200)])
        .await?;
    Ok(())
}
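
The "admitted only when every budget can afford this call" rule is an all-or-nothing check: verify every dimension first, then debit. The sketch below shows that two-pass shape with plain counters and no refill. It is a simplification under stated assumptions, not the MultiLimiter implementation: Budgets, try_acquire_costs, and left are hypothetical names, and each dimension is assumed to appear at most once per call.

```rust
use std::collections::HashMap;

/// Hypothetical set of named budgets (no refill, to keep the sketch short).
pub struct Budgets {
    remaining: HashMap<String, u64>,
}

impl Budgets {
    pub fn new(limits: &[(&str, u64)]) -> Self {
        Self { remaining: limits.iter().map(|(k, v)| (k.to_string(), *v)).collect() }
    }

    /// Debit every listed dimension, or none. On refusal, returns the name of
    /// the first dimension that could not afford its cost (or is unknown).
    pub fn try_acquire_costs(&mut self, costs: &[(&str, u64)]) -> Result<(), String> {
        // Pass 1: check every dimension without changing anything.
        for (name, cost) in costs {
            match self.remaining.get(*name) {
                Some(left) if left >= cost => {}
                _ => return Err(name.to_string()),
            }
        }
        // Pass 2: all checks passed, so debit them all.
        for (name, cost) in costs {
            *self.remaining.get_mut(*name).expect("checked in pass 1") -= cost;
        }
        Ok(())
    }

    pub fn left(&self, name: &str) -> Option<u64> {
        self.remaining.get(name).copied()
    }
}
```

Checking before debiting matters: a call that is short on input tokens must not burn a request slot on its way to being refused.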

Throttle independently per tenant, with bounded memory:

use throttle_net::PerKey;

#[tokio::main]
async fn main() -> Result<(), throttle_net::ThrottleError> {
    // 100 requests per second, per tenant.
    let limiter: PerKey<String> = PerKey::per_second(100);
    limiter.acquire(&"tenant:42".to_string()).await?;
    Ok(())
}
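
"Bounded memory" here means per-key state is evicted by idle TTL and by a hard cap. The sketch below shows one way those two rules can interact. The eviction policy is an assumption for illustration (expired keys are swept lazily when the map is full, then the least recently seen key is dropped if there is still no room); the README does not say which policy PerKey actually uses. BoundedKeys and touch are hypothetical names, and the cap is assumed to be at least 1.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical tracker of live keys with an idle TTL and a hard cap.
pub struct BoundedKeys<K> {
    ttl_ms: u64,
    cap: usize,
    last_seen: HashMap<K, u64>,
}

impl<K: Hash + Eq + Clone> BoundedKeys<K> {
    pub fn new(ttl_ms: u64, cap: usize) -> Self {
        Self { ttl_ms, cap, last_seen: HashMap::new() }
    }

    /// Record that `key` was used at `now_ms`, evicting others if needed.
    pub fn touch(&mut self, key: &K, now_ms: u64) {
        if !self.last_seen.contains_key(key) && self.last_seen.len() >= self.cap {
            // First, drop every key that has been idle for at least the TTL.
            let ttl = self.ttl_ms;
            self.last_seen.retain(|_, seen| now_ms.saturating_sub(*seen) < ttl);
            // Still full: enforce the hard cap by dropping the stalest key.
            if self.last_seen.len() >= self.cap {
                let oldest = self
                    .last_seen
                    .iter()
                    .min_by_key(|(_, seen)| **seen)
                    .map(|(k, _)| k.clone());
                if let Some(k) = oldest {
                    self.last_seen.remove(&k);
                }
            }
        }
        self.last_seen.insert(key.clone(), now_ms);
    }

    pub fn len(&self) -> usize {
        self.last_seen.len()
    }

    pub fn contains(&self, key: &K) -> bool {
        self.last_seen.contains_key(key)
    }
}
```

However many unique keys arrive, the map never holds more than the cap, which is the ceiling the feature list describes.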

Stack scopes — an overall ceiling, a per-tenant share, and a per-endpoint cap:

use throttle_net::{Layered, PerKey, Throttle};

#[tokio::main]
async fn main() -> Result<(), throttle_net::ThrottleError> {
    let layered = Layered::<String>::builder()
        .global(Throttle::per_second(1000))
        .per_key(PerKey::per_second(100))
        .per_endpoint(PerKey::per_second(50))
        .build();

    layered
        .acquire(&"tenant:42".to_string(), &"/v1/chat".to_string())
        .await?;
    Ok(())
}

Retry a flaky call with jittered backoff, honoring a server Retry-After:

use std::time::Duration;
use throttle_net::{Backoff, Retry, RetryAction, parse_retry_after};

struct Rejected { retry_after: Option<String> }

#[tokio::main]
async fn main() {
    // Exponential from 100ms, doubling, capped at 5s, decorrelated jitter (the default).
    let retry = Retry::new(Backoff::default().with_max(Duration::from_secs(5))).max_attempts(5);

    let result: Result<&str, Rejected> = retry
        .run(
            || async { Err(Rejected { retry_after: None }) }, // your fallible call
            |err: &Rejected| match err.retry_after.as_deref().and_then(parse_retry_after) {
                Some(after) => RetryAction::RetryAfter(after), // honor the server's hint
                None => RetryAction::Retry,                    // else use the backoff
            },
        )
        .await;
    let _ = result;
}
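
The comment above mentions decorrelated jitter. As a rough illustration of that strategy, each delay is drawn between the base and three times the previous delay, then capped. The sketch below is not the crate's Backoff type: Decorrelated, XorShift, and their methods are hypothetical, and the tiny xorshift generator exists only so the example needs no external crate (modulo bias is ignored, and the cap is assumed to be at least the base).

```rust
use std::time::Duration;

/// Minimal xorshift64 generator, used only to keep the sketch dependency-free.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        Self(seed.max(1)) // the state must never be zero
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Roughly uniform value in lo..=hi.
    pub fn range(&mut self, lo: u64, hi: u64) -> u64 {
        lo + self.next_u64() % (hi - lo + 1)
    }
}

/// Hypothetical decorrelated-jitter schedule, in milliseconds.
pub struct Decorrelated {
    base_ms: u64,
    cap_ms: u64,
    prev_ms: u64,
}

impl Decorrelated {
    pub fn new(base: Duration, cap: Duration) -> Self {
        let base_ms = base.as_millis() as u64;
        Self { base_ms, cap_ms: cap.as_millis() as u64, prev_ms: base_ms }
    }

    /// Next delay: random in [base, 3 * previous], never above the cap.
    pub fn next_delay(&mut self, rng: &mut XorShift) -> Duration {
        let upper = self.prev_ms.saturating_mul(3).max(self.base_ms);
        let ms = rng.range(self.base_ms, upper).min(self.cap_ms);
        self.prev_ms = ms;
        Duration::from_millis(ms)
    }
}
```

Because each delay depends on the previous random draw rather than on the attempt number, clients that failed at the same moment drift apart instead of retrying in lockstep.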

Wrap a flaky downstream in a circuit breaker (needs the circuit-breaker feature):

use std::time::Duration;
use throttle_net::{CircuitBreaker, Throttle, Trip};

#[tokio::main]
async fn main() {
    let breaker = CircuitBreaker::builder()
        .trip(Trip::Consecutive(5))           // open after 5 failures in a row
        .cooldown(Duration::from_secs(10))
        .build(Throttle::per_second(100));

    match breaker.acquire().await {
        Ok(permit) => {
            // ... call the downstream ...
            let ok = true;
            if ok { permit.success() } else { permit.failure() }
        }
        Err(_shed) => { /* breaker open: fail fast */ }
    }
}
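
The closed / open / half-open cycle is easiest to follow as a small state machine. The sketch below models a breaker that trips after a number of consecutive failures and allows a single probe once the cooldown has passed. It is an illustration of the general pattern only: Breaker, State, allow, on_success, and on_failure are hypothetical names rather than the CircuitBreaker API, and timestamps are explicit milliseconds.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical breaker tripping on consecutive failures.
pub struct Breaker {
    threshold: u32,
    cooldown_ms: u64,
    failures: u32,
    opened_at: u64,
    state: State,
}

impl Breaker {
    pub fn new(threshold: u32, cooldown_ms: u64) -> Self {
        Self { threshold, cooldown_ms, failures: 0, opened_at: 0, state: State::Closed }
    }

    /// May a call proceed at `now_ms`?
    pub fn allow(&mut self, now_ms: u64) -> bool {
        match self.state {
            State::Closed => true,
            // A probe is already in flight; shed everything else.
            State::HalfOpen => false,
            State::Open => {
                if now_ms.saturating_sub(self.opened_at) >= self.cooldown_ms {
                    self.state = State::HalfOpen; // let exactly one probe through
                    true
                } else {
                    false
                }
            }
        }
    }

    /// A success closes the breaker and resets the failure streak.
    pub fn on_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    /// A failed probe, or too many failures in a row, opens the breaker.
    pub fn on_failure(&mut self, now_ms: u64) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.threshold {
            self.state = State::Open;
            self.opened_at = now_ms;
        }
    }

    pub fn state(&self) -> State {
        self.state
    }
}
```

Note that allow returns before any limiter would be consulted, which is how a breaker can fail fast without spending the wrapped limiter's capacity.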

Full runnable examples live in examples/:

cargo run --example llm_budget                              # multi-dimensional LLM budgets
cargo run --example retry_backoff                           # retry with backoff + Retry-After
cargo run --example circuit_breaker --features circuit-breaker  # trip, shed, recover

Performance

Mean times from local Criterion runs (cargo bench --bench throttle_bench, Windows x86_64, Rust stable):

Single-throttle try_acquire (uncontended): ~27 ns — one atomic compare-and-swap
Per-key lookup, 10 000 live keys: ~70 ns — hash, shard read lock, map get, acquire
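
The per-key figure lists four steps: hash, shard read lock, map get, acquire. The sketch below walks through the same sequence with standard-library types so the cost model is easier to picture. It is an assumption-laden stand-in, not the crate's data structure: Sharded and hit are hypothetical names, an atomic counter takes the place of a real limiter, and the shard count is assumed to be greater than zero.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

/// Hypothetical sharded map of per-key counters.
pub struct Sharded<K> {
    shards: Vec<RwLock<HashMap<K, AtomicU64>>>,
}

impl<K: Hash + Eq + Clone> Sharded<K> {
    pub fn new(shard_count: usize) -> Self {
        Self { shards: (0..shard_count).map(|_| RwLock::new(HashMap::new())).collect() }
    }

    /// Step 1: hash the key to pick a shard.
    fn shard_for(&self, key: &K) -> &RwLock<HashMap<K, AtomicU64>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    /// Count one hit for `key` and return its new total.
    pub fn hit(&self, key: &K) -> u64 {
        let shard = self.shard_for(key);
        {
            // Steps 2-4 for a known key: read lock, map get, atomic update.
            let map = shard.read().unwrap();
            if let Some(counter) = map.get(key) {
                return counter.fetch_add(1, Ordering::Relaxed) + 1;
            }
        }
        // First sighting of this key: take the write lock to insert it.
        let mut map = shard.write().unwrap();
        map.entry(key.clone())
            .or_insert_with(|| AtomicU64::new(0))
            .fetch_add(1, Ordering::Relaxed)
            + 1
    }
}
```

Known keys only ever take a shard's read lock, so lookups for different tenants rarely contend; the write lock is needed only the first time a key appears.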

Where It Fits

throttle-net is the outbound resilience layer. It is used by:

rate-net — the inbound counterpart; throttle-net is outbound
pack-io / network-protocol — clients that call rate-limited downstreams
AVA / agent-provider — LLM API budgeting with multi-dimensional token limits
Hive DB — cluster RPC backpressure and downstream protection

It stays foreign-compatible: the obvious default for "I need to call an external API in Rust and not get banned."

Contributing

Before opening a PR, cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean. Hot-path changes require a criterion benchmark; correctness-critical paths require property and/or loom tests.