Module rate_limit


§Rate Limiter — Token Bucket for Outbound Call Throttling

A RateLimiter caps how fast tasks make outbound calls, smoothing bursty load into a steady rate a downstream dependency can absorb. Where a CircuitBreaker stops calls to a dependency that is down, a rate limiter paces calls to a dependency that is up but rate-sensitive (a third-party API with a quota, a database, an LLM endpoint billed per request).

§The algorithm: a token bucket

The limiter holds a bucket of fractional tokens. Each acquisition consumes one token; tokens replenish at a fixed rate of tokens_per_period / refill_period, up to a maximum capacity. Refill is lazy: there is no background task. Every try_acquire reads a single Instant, adds elapsed × refill_per_sec tokens (capped at capacity), then decides whether to admit the call. A workflow that never constructs a limiter pays nothing.

The bucket starts full, so a burst of up to capacity calls is admitted instantly; sustained traffic then settles to the refill rate. Capacity is max_tokens + burst — max_tokens is the steady-state ceiling and burst is extra headroom for short spikes.
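The lazy-refill step described above can be sketched with the standard library alone. This is an illustration, not cano's implementation: the Bucket type, its fields, and the explicit now parameter (passed in so the behavior is deterministic) are assumptions made for the example.

```rust
use std::time::Instant;

/// Hypothetical stand-in for the limiter's internal state (not cano's actual type).
struct Bucket {
    tokens: f64,
    capacity: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl Bucket {
    /// The bucket starts full, so `capacity` calls are admitted immediately.
    fn new(capacity: f64, refill_per_sec: f64, now: Instant) -> Self {
        Bucket { tokens: capacity, capacity, refill_per_sec, last_refill: now }
    }

    /// Lazy refill: credit the tokens earned since the last call, capped at capacity.
    fn refill(&mut self, now: Instant) {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
    }

    /// Refill first, then spend one token if a whole one is available.
    fn try_acquire(&mut self, now: Instant) -> bool {
        self.refill(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With a capacity of 2 and a refill rate of 1 token per second, two calls succeed immediately, a third is rejected, and the next call is admitted once a full second of refill has accumulated.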

§Acquiring

try_acquire: non-blocking; returns Some(Permit) if a token was available, None if the bucket is empty. Use it to shed load.
acquire: async; if the bucket is empty, it computes exactly how long until the next token refills, sleeps that long with tokio::time::sleep, and retries. Use it to pace work.
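The wait that acquire computes follows directly from the refill rate. A minimal sketch of that arithmetic, assuming a hypothetical helper named wait_for_next_token (it is not part of cano's API) and a strictly positive refill rate:

```rust
use std::time::Duration;

/// Time until a bucket holding `tokens` fractional tokens reaches one whole token.
/// Hypothetical helper; assumes `refill_per_sec` is greater than zero.
fn wait_for_next_token(tokens: f64, refill_per_sec: f64) -> Duration {
    // How much of a token is still missing; zero if one is already available.
    let deficit = (1.0 - tokens).max(0.0);
    Duration::from_secs_f64(deficit / refill_per_sec)
}
```

An async caller would sleep for the returned duration and then retry the non-blocking path. For example, a bucket holding 0.5 tokens that refills at 2 tokens per second needs 250 ms.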

A Permit is a lightweight RAII marker for the rate-limited call’s scope. Unlike a CircuitBreaker permit (which records a success/failure outcome) or a semaphore permit (which returns capacity on drop), a token-bucket permit’s drop is a no-op — the token was already spent at acquisition time and the bucket refills on the clock, not on release.

A limiter is cheap to clone (Arc inside). Share one Arc<RateLimiter> across every task that hits the same quota so the budget is enforced globally, including across tasks running in parallel inside a split/join state. Internally, the bucket state sits behind a synchronous parking_lot::Mutex, and the lock is never held across an await.
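To illustrate why sharing one limiter enforces the budget globally, here is a sketch using only the standard library. SharedBudget and admitted_across_threads are hypothetical names, std::sync::Mutex stands in for parking_lot::Mutex, OS threads stand in for tasks, and refill is omitted so the count is deterministic.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared budget: a token count behind a synchronous mutex.
struct SharedBudget {
    tokens: Mutex<u32>,
}

impl SharedBudget {
    /// Short critical section: check and debit under the lock, then release it.
    fn try_acquire(&self) -> bool {
        let mut tokens = self.tokens.lock().unwrap();
        if *tokens > 0 {
            *tokens -= 1;
            true
        } else {
            false
        }
    }
}

/// Spawn `workers` threads that each make `attempts` acquisitions against ONE
/// shared budget, and return how many acquisitions were admitted in total.
fn admitted_across_threads(capacity: u32, workers: usize, attempts: usize) -> usize {
    let budget = Arc::new(SharedBudget { tokens: Mutex::new(capacity) });
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let budget = Arc::clone(&budget);
            thread::spawn(move || (0..attempts).filter(|_| budget.try_acquire()).count())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

However the attempts interleave, the total admitted never exceeds the shared capacity. Giving each worker its own budget would instead multiply the effective quota by the number of workers.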

RateLimiter also implements Resource (no-op lifecycle), so it can be registered in Resources and looked up by key inside a task body.

use std::sync::Arc;
use cano::prelude::*;

// 5 tokens/sec; the bucket starts full.
let limiter = Arc::new(RateLimiter::new(RateLimiterPolicy::per_second(5)));
let permit = limiter.try_acquire().expect("a fresh bucket has tokens");
drop(permit); // dropping a token-bucket permit is a no-op

Structs§

MeterStatus: A point-in-time view of a Meter’s capacity, for observability.
MultiPermit: A successful MultiRateLimiter acquisition, marking the scope of one multi-limited call.
MultiRateLimiter: Enforces several rate limits at once: an acquisition succeeds only if every applicable tier has capacity, and either all tiers are debited or none are.
Permit: A token consumed from a RateLimiter, marking the scope of one rate-limited call.
RateLimiter: A reusable token-bucket rate limiter.
RateLimiterPolicy: Policy parameters controlling a RateLimiter’s token bucket.
Reservation: A refundable debit against a Meter.
Tier: One tier in a MultiRateLimiter: a named Meter and the per-acquire cost charged to it.
WindowPermit: A unit consumed from a WindowedRateLimiter, marking the scope of one rate-limited call. Dropping it is a no-op — the window holds the count until it resets.
WindowPolicy: Policy for a WindowedRateLimiter: at most limit units per fixed window.
WindowedRateLimiter: A fixed-window rate limiter: admits up to limit units per window, with a hard reset.
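
For contrast with the token bucket, a fixed-window limiter can be sketched as a counter that resets once the window has elapsed. The FixedWindow type below is hypothetical and is not WindowedRateLimiter's actual implementation; in particular, restarting the window at the time of the first call after expiry is an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical fixed-window counter (illustrative only).
struct FixedWindow {
    limit: u32,
    window: Duration,
    window_start: Instant,
    used: u32,
}

impl FixedWindow {
    fn new(limit: u32, window: Duration, now: Instant) -> Self {
        FixedWindow { limit, window, window_start: now, used: 0 }
    }

    /// Admit up to `limit` units per window; the count resets all at once.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Hard reset: once the window has elapsed, start a fresh one at `now`.
        if now.saturating_duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.used = 0;
        }
        if self.used < self.limit {
            self.used += 1;
            true
        } else {
            false
        }
    }
}
```

Unlike the token bucket, nothing is restored gradually: a caller that exhausts the limit early in the window is rejected until the reset.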

Traits§

Meter: A throttle that admits or rejects weighted units, exposing enough state to compose several into a MultiRateLimiter and to answer “which limit blocked me, and for how long”.
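
The all-or-nothing behavior described for MultiRateLimiter can be sketched as a check pass followed by a debit pass. TierState and try_acquire_all are hypothetical and do not reflect the Meter trait's real signature; they only show the shape of the technique.

```rust
/// Hypothetical tier state: remaining units and the cost charged per acquisition.
struct TierState {
    name: &'static str,
    available: u32,
    cost: u32,
}

/// Debit every tier or none. On rejection, report the first tier that blocked.
fn try_acquire_all(tiers: &mut [TierState]) -> Result<(), &'static str> {
    // Pass 1: check every tier without mutating anything.
    if let Some(blocked) = tiers.iter().find(|t| t.available < t.cost) {
        return Err(blocked.name);
    }
    // Pass 2: every tier has capacity, so debit them all.
    for tier in tiers.iter_mut() {
        tier.available -= tier.cost;
    }
    Ok(())
}
```

Because no tier is debited until all have been checked, a rejection leaves every tier untouched, and the returned name answers which limit blocked the call. In a concurrent setting both passes would need to run under a single lock.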