quota 0.3.2

Fastest Lane-parallel Rate-limiter for Rust
# Quota


A high-performance in-memory rate limiter for Rust, using a mix of Leaky Token Bucket & GCRA.


## Quick Comparison


> Benchmark numbers are Criterion-reported ns/op from the local harness in `./benches/limiters.rs`, which runs 256 Tokio tasks on 16 Tokio worker threads against pre-warmed allow-path state. Each figure is wall-clock elapsed time divided by the total number of operations across all concurrent tasks, not single-core instruction latency. Lower is better.
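
To make the metric in the note above concrete, here is a small sketch of the arithmetic, assuming the harness divides one wall-clock measurement by the total number of operations completed by all tasks. The helper name `aggregate_ns_per_op` and the sample figures are illustrative and are not taken from `./benches/limiters.rs`.

```rust
use std::time::Duration;

/// Aggregate cost per operation: wall-clock time divided by the total
/// number of operations completed by all concurrent tasks.
/// (Hypothetical helper, not part of the benchmark harness.)
fn aggregate_ns_per_op(elapsed: Duration, tasks: u64, ops_per_task: u64) -> f64 {
    let total_ops = tasks * ops_per_task;
    elapsed.as_nanos() as f64 / total_ops as f64
}
```

For example, 256 tasks that each complete 1,000 operations within 512 µs of wall-clock time yield `aggregate_ns_per_op(Duration::from_micros(512), 256, 1_000) == 2.0`, even though each individual call takes longer than 2 ns because many calls run in parallel.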


| Axis                              | `quota` | `governor` | `flux_limiter` | `tokio_rate_limit` | `ratelimit` | `leaky_bucket` |
|-----------------------------------|---:|---:|---:|---:|---:|---:|
| **Same-Key Throughput (ns/op)**        | **46.94 ns** | 107.49 ns | 201.96 ns | 112.23 ns | Not keyed | Not keyed |
| **Distributed-Key Throughput (ns/op)** | **1.95 ns** | 7.73 ns | 11.11 ns | 9.04 ns | Not keyed | Not keyed |
| **Single Limiter Throughput (ns/op)**  | **47.34 ns** | 74.82 ns | Uses keyed path: 201.96 ns hot | Uses keyed path: 112.23 ns hot | 82.65 ns | 159.45 ns |
| **Refill Interval**               | Yes: `set_refill_interval` | No: rate period is GCRA cell spacing, not batch refill ticks | No: only `rate_nanos` + burst tolerance | No: rate/sec + burst; elapsed refill | No: scaled continuous refill | Yes: `refill(...)` + `interval(...)` |
| **Algorithm**                     | GCRA pool; legacy direct token counter still present | GCRA | GCRA | Token bucket default; custom algorithms | Token bucket | Token/leaky bucket |
| **Keyed Limiting**                | Yes | Yes | Yes | Yes | No | No |
| **Direct / Global Limiting**      | Yes, via `QuotaPool` single key; legacy `Quota` counter | Yes | No direct type; use constant key | No direct type; use constant key | Yes | Yes |
| **Weighted Costs**                | Yes | Yes: `check_n` | No: one request per call | Yes: `check_with_cost` | Yes: `try_wait_n` | Yes: `acquire(n)` / `try_acquire(n)` |
| **Async Wait / Backpressure API** | No | Yes: `until_ready` style APIs | No | Yes: `acquire` and `acquire_timeout` | No; returns retry duration | Yes: `acquire` futures |
| **Nonblocking Try API**           | Yes: `check` / `consume` | Yes: `check` / `check_n` | Yes: `check_request` | Yes: `check`, `try_acquire`, `try_acquire_n` | Yes: `try_wait_n` | Yes: `try_acquire` |
| **Built-In Web Middleware**       | No | No framework middleware; has stream/sink helpers and middleware hooks | No | Yes: Axum/Tower/Tonic | No | No |
| **Denial Metadata**               | Available tokens only | Retry time + optional state middleware | Retry-after, remaining, reset | Retry-after, remaining, limit, reset | Retry duration | Boolean or wait future |
| **Key Cleanup / Eviction**        | Manual `remove` | `retain_recent` + `shrink_to_fit` | `cleanup_stale_clients` | TTL on algorithm types | Not keyed | Not keyed |
| **Custom Clock**                  | No public clock injection | Yes | Yes | No | Yes | No |
| **`no_std` Support**              | No | Yes | No | No | Yes | Crate is `no_std`, but depends on Tokio timing for operation |
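
Since GCRA appears in several columns of the table above, a minimal std-only sketch of the textbook algorithm may help when reading the comparison. It is not the implementation used by `quota` or by any other crate listed; the `Gcra` type, its fields, and the explicit nanosecond timestamps are assumptions made for illustration.

```rust
/// Textbook GCRA state: a single "theoretical arrival time" (TAT).
/// Hypothetical type for illustration only.
struct Gcra {
    emission_interval_ns: u64, // spacing between requests at the steady rate
    burst_tolerance_ns: u64,   // how far the TAT may run ahead of `now`
    tat_ns: u64,               // theoretical arrival time of the next request
}

impl Gcra {
    fn new(rate_per_sec: u64, burst: u64) -> Self {
        assert!(rate_per_sec > 0, "rate must be non-zero");
        let emission_interval_ns = 1_000_000_000 / rate_per_sec;
        Self {
            emission_interval_ns,
            burst_tolerance_ns: emission_interval_ns * burst.saturating_sub(1),
            tat_ns: 0,
        }
    }

    /// Returns `Ok(())` if a request at `now_ns` conforms, otherwise
    /// `Err(wait_ns)` with the time until the next request would conform.
    fn check(&mut self, now_ns: u64) -> Result<(), u64> {
        let tat = self.tat_ns.max(now_ns);
        let earliest = tat.saturating_sub(self.burst_tolerance_ns);
        if now_ns < earliest {
            return Err(earliest - now_ns);
        }
        // Conforming request: push the TAT forward by one emission interval.
        self.tat_ns = tat + self.emission_interval_ns;
        Ok(())
    }
}
```

With `Gcra::new(1, 10)`, ten calls to `check(0)` conform, the eleventh returns `Err(1_000_000_000)`, and one more call conforms once `now_ns` reaches one second. Denied calls leave the state untouched, which is why the limiter needs only one integer per key.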


### Quick Axum Example using `quota`


```rust
use axum::{Router, extract::{Path, State}, http::StatusCode, routing::get};
use quota::{QuotaPolicy, QuotaPool, RefillRate};
use std::{net::SocketAddr, sync::Arc};

type Limiter = Arc<QuotaPool<String>>;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Capacity of 10 tokens, refilled at 1 token per second.
    let policy = QuotaPolicy::new()
        .set_capacity(10.0)
        .set_refill_rate(RefillRate::per_sec(1));

    let limiter = Arc::new(QuotaPool::new(policy, 10));

    let app = Router::new()
        .route("/{key}", get(limit))
        .with_state(limiter);

    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    axum::serve(tokio::net::TcpListener::bind(addr).await?, app).await?;
    Ok(())
}

async fn limit(State(limiter): State<Limiter>, Path(key): Path<String>) -> StatusCode {
    match limiter.consume(key.as_str(), 1) {
        Ok(_) => StatusCode::OK,
        Err(_) => StatusCode::TOO_MANY_REQUESTS,
    }
}
```

### API


The crate provides three core primitives: a standalone `Quota`, a `QuotaPolicy`, and a GCRA-based `QuotaPool`.

`QuotaPool` defaults to the `QuotaKey` key type, an owned, heap-allocated `String`.
If your quota identity is already a compact ID, an interned symbol, or some other key shape,
use `QuotaPool<K>` and construct it with `QuotaPool::<K>::with_key_type(...)`.
When the key set is known ahead of traffic, use `QuotaPool::with_capacity(...)` and `pool.insert_keys(...)`; this keeps the hot request path on borrowed-key lookups instead of insertions.
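
The benefit of pre-inserting keys can be sketched with a plain `HashMap`. This is a conceptual stand-in, not `quota`'s internals: `KeyedTokens`, `preloaded`, and the integer token counter are hypothetical names chosen for the illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a keyed limiter with owned `String` keys.
struct KeyedTokens {
    tokens: HashMap<String, u32>,
}

impl KeyedTokens {
    /// Reserve capacity and insert every known key before traffic arrives,
    /// so the request path never has to allocate or rehash.
    fn preloaded(keys: &[&str], initial: u32) -> Self {
        let mut tokens = HashMap::with_capacity(keys.len());
        for key in keys {
            tokens.insert((*key).to_owned(), initial);
        }
        Self { tokens }
    }

    /// Hot path: the lookup borrows the caller's `&str`, so no `String`
    /// is allocated per request. Unknown keys are denied, not inserted.
    fn consume(&mut self, key: &str, cost: u32) -> bool {
        match self.tokens.get_mut(key) {
            Some(left) if *left >= cost => {
                *left -= cost;
                true
            }
            _ => false,
        }
    }
}
```

In this sketch, `KeyedTokens::preloaded(&["alice", "bob"], 2)` allows two single-token requests for `"alice"` and denies the third, and a request for an unknown key such as `"carol"` is denied without touching the map.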

Example using the standalone `Quota` (a simple 8-byte number in memory):
```rust
use quota::Quota;

fn main() {
    let quota = Quota::with_initial_tokens(10);

    let mut results = vec![];
    for _ in 0..100 {
        results.push(quota.consume(1)); // 10..9..8..7..6..5..4..3..2..1..Err
    }

    assert_eq!(results.iter().filter(|r| r.is_ok()).count(), 10); // 10 Ok: 10..=1
    assert_eq!(results.iter().filter(|r| r.is_err()).count(), 90); // 90 Err: Rate-limited
}
```

Example applying a `QuotaPolicy` with a maximum capacity and a `RefillRate`:
```rust
use quota::{Quota, QuotaPolicy, RefillRate};

fn main() {
    let policy = QuotaPolicy::new()
        .set_capacity(10.0) // maximum capacity enforced on a Quota at each tick
        .set_refill_rate(RefillRate::per_micro(100.0)); // 100 tokens/µs = 0.1 tokens/ns

    // `tick` takes `&mut quota`, so the binding must be mutable.
    let mut quota = Quota::with_initial_tokens(10);

    let mut results = vec![];
    for _ in 0..100 {
        policy.tick(1, &mut quota); // dt = 1 ns => 1 ns * 0.1 tokens/ns = 0.1 tokens per tick() call
        results.push(quota.consume(1));
    }

    assert_eq!(results.iter().filter(|r| r.is_ok()).count(), 19);
    assert_eq!(results.iter().filter(|r| r.is_err()).count(), 81);
}
```
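The 19/81 split asserted above can be checked with a std-only model of the same arithmetic. This is a sketch under the assumption that each tick adds `dt * rate` tokens clamped to the capacity and that `consume(1)` succeeds only when a whole token is available; the `simulate` function is hypothetical and does not call into `quota`.

```rust
/// Model of the example above: capacity 10, 10 initial tokens,
/// 0.1 tokens refilled per 1 ns tick. Returns (allowed, denied).
/// Hypothetical helper, not the crate's implementation.
fn simulate(calls: usize) -> (usize, usize) {
    let capacity = 10.0_f64;
    let refill_per_ns = 0.1_f64; // 100 tokens per microsecond
    let mut tokens = 10.0_f64;
    let mut allowed = 0;

    for _ in 0..calls {
        // tick(1): add dt * rate, clamped to the capacity.
        tokens = (tokens + refill_per_ns).min(capacity);
        // consume(1): succeeds only if a whole token is available.
        if tokens >= 1.0 {
            tokens -= 1.0;
            allowed += 1;
        }
    }
    (allowed, calls - allowed)
}
```

Under these assumptions `simulate(100)` returns `(19, 81)`: the first tick is discarded because the bucket is already full, so the budget is 10 initial tokens plus 99 × 0.1 = 9.9 refilled tokens, which covers 19 whole requests.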

And now the main `QuotaPool`:
```rust
use quota::{RefillRate, QuotaPolicy, QuotaPool};
use std::sync::Arc;
use std::time::Duration;

fn main() {
    let policy = QuotaPolicy::new()
        .set_capacity(10.0)
        .set_refill_rate(RefillRate::per_sec(3))
        .set_refill_interval(Duration::from_secs(1)); // refill at most once per interval

    // QuotaPool uses the system clock and ticks each quota with the time elapsed since its previous tick.
    // `QuotaPolicy::set_refill_interval` skips a tick while less than `refill_interval` has elapsed since the last one.
    let pool = Arc::new(QuotaPool::with_capacity(policy, 10, 1));

    let mut results = vec![];
    for _ in 0..100 {
        results.push(pool.consume("testing", 1));
    }
}
```
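
The effect of a refill interval can be illustrated with a deterministic token-bucket model that takes explicit timestamps instead of reading the system clock. This is a conceptual sketch only: `IntervalBucket` and its fields are hypothetical, and the real `QuotaPool` is GCRA-based, so its internals and exact results may differ.

```rust
/// Conceptual token bucket whose refill is gated by an interval.
/// Hypothetical model for illustration, not part of `quota`.
struct IntervalBucket {
    capacity: f64,
    tokens: f64,
    rate_per_sec: f64,
    interval_ns: u64,
    last_tick_ns: u64,
}

impl IntervalBucket {
    /// Starts full, with the last tick at time zero.
    fn new(capacity: f64, rate_per_sec: f64, interval_ns: u64) -> Self {
        Self { capacity, tokens: capacity, rate_per_sec, interval_ns, last_tick_ns: 0 }
    }

    fn consume(&mut self, now_ns: u64, cost: f64) -> bool {
        let elapsed_ns = now_ns.saturating_sub(self.last_tick_ns);
        // Refill only once a full interval has elapsed since the last tick.
        if elapsed_ns >= self.interval_ns {
            let refill = elapsed_ns as f64 / 1e9 * self.rate_per_sec;
            self.tokens = (self.tokens + refill).min(self.capacity);
            self.last_tick_ns = now_ns;
        }
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

With `IntervalBucket::new(10.0, 3.0, 1_000_000_000)`, this model allows 10 of 100 back-to-back requests at time zero, denies a request at 0.5 s because no interval has elapsed yet, and allows exactly 3 more once the timestamp reaches 1 s.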