quota 0.3.3

Fastest Lane-parallel Rate-limiter for Rust
# Quota


A high-performance in-memory rate limiter for Rust, using a mix of Leaky Token Bucket & GCRA.


## Quick Comparison


>Benchmark numbers are Criterion-reported ns/op from the local harness (`./benches/limiters.rs`), which runs 256 Tokio tasks on 16 Tokio worker threads against pre-warmed allow-path state. Each figure is wall-clock elapsed time divided by the total number of operations across all concurrent tasks, not single-core instruction latency. To make the numbers easier to read, the cells also list the equivalent throughput in Mop/s (millions of operations per second).
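
The conversion between the two units in the table is plain arithmetic and can be checked with a short sketch. The helper name `mops_per_sec` is made up for this illustration and is not part of `quota` or the benchmark harness.

```rust
/// Converts a mean cost in ns/op into millions of operations per second.
/// Hypothetical helper for illustration only.
fn mops_per_sec(ns_per_op: f64) -> f64 {
    // 1 s = 1e9 ns, so ops/s = 1e9 / ns_per_op; dividing by 1e6 leaves 1e3 / ns_per_op.
    1_000.0 / ns_per_op
}

fn main() {
    // Two `quota` cells taken from the table below.
    for (label, ns) in [("same-key", 46.94), ("distributed-key", 1.95)] {
        println!("{label}: {ns} ns/op ~ {:.0} Mop/s", mops_per_sec(ns));
    }
}
```

Because the ns/op figures are aggregated across 16 worker threads, the resulting Mop/s values describe whole-harness throughput rather than what a single core sustains.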


| Axis                              |                                                 `quota` |                                                            `governor` |                          `flux_limiter` |                           `tokio_rate_limit` | `ratelimit` | `leaky_bucket` |
|-----------------------------------|--------------------------------------------------------:|----------------------------------------------------------------------:|----------------------------------------:|---------------------------------------------:|---:|---:|
| **Same-Key Throughput**           |                                 **46.94 ns (21 Mop/s)** |                                                   107.49 ns (9 Mop/s) |                     201.96 ns (5 Mop/s) |                          112.23 ns (9 Mop/s) | Not keyed | Not keyed |
| **Distributed-Key Throughput**    |                                 **1.95 ns (513 Mop/s)** |                                                   7.73 ns (129 Mop/s) |                     11.11 ns (90 Mop/s) |                          9.04 ns (110 Mop/s) | Not keyed | Not keyed |
| **Single Limiter Throughput**     |                                 **47.34 ns (21 Mop/s)** |                                                   74.82 ns (13 Mop/s) |          Uses keyed path: 201.96 ns hot |               Uses keyed path: 112.23 ns hot | 82.65 ns (12 Mop/s) | 159.45 ns (6 Mop/s) |
| **Refill Interval**               |                              Yes: `set_refill_interval` |          No: rate period is GCRA cell spacing, not batch refill ticks | No: only `rate_nanos` + burst tolerance |         No: rate/sec + burst; elapsed refill | No: scaled continuous refill | Yes: `refill(...)` + `interval(...)` |
| **Algorithm**                     |    GCRA pool; legacy direct token counter still present |                                                                  GCRA |                                    GCRA |      Token bucket default; custom algorithms | Token bucket | Token/leaky bucket |
| **Keyed Limiting**                |                                                     Yes |                                                                   Yes |                                     Yes |                                          Yes | No | No |
| **Direct / Global Limiting**      | Yes, via `QuotaPool` single key; legacy `Quota` counter |                                                                   Yes |        No direct type; use constant key |             No direct type; use constant key | Yes | Yes |
| **Weighted Costs**                |                                                     Yes |                                                        Yes: `check_n` |                No: one request per call |                       Yes: `check_with_cost` | Yes: `try_wait_n` | Yes: `acquire(n)` / `try_acquire(n)` |
| **Async Wait / Backpressure API** |                                                      No |                                         Yes: `until_ready` style APIs |                                      No |         Yes: `acquire` and `acquire_timeout` | No; returns retry duration | Yes: `acquire` futures |
| **Nonblocking Try API**           |                                Yes: `check` / `consume` |                                              Yes: `check` / `check_n` |                    Yes: `check_request` | Yes: `check`, `try_acquire`, `try_acquire_n` | Yes: `try_wait_n` | Yes: `try_acquire` |
| **Built-In Web Middleware**       |                                                      No | No framework middleware; has stream/sink helpers and middleware hooks |                                      No |                        Yes: Axum/Tower/Tonic | No | No |
| **Denial Metadata**               |                                   Available tokens only |                                Retry time + optional state middleware |           Retry-after, remaining, reset |         Retry-after, remaining, limit, reset | Retry duration | Boolean or wait future |
| **Key Cleanup / Eviction**        |                                         Manual `remove` |                                     `retain_recent` + `shrink_to_fit` |                 `cleanup_stale_clients` |                       TTL on algorithm types | Not keyed | Not keyed |
| **Custom Clock**                  |                               No public clock injection |                                                                   Yes |                                     Yes |                                           No | Yes | No |
| **`no_std` Support**              |                                                      No |                                                                   Yes |                                      No |                                           No | Yes | Crate is `no_std`, but depends on Tokio timing for operation |


### Quick Axum Example using `quota`


```rust
use axum::{Router, extract::{Path, State}, http::StatusCode, routing::get};
use quota::{QuotaPolicy, QuotaPool, RefillRate};
use std::{net::SocketAddr, sync::Arc};

type Limiter = Arc<QuotaPool<String>>;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let policy = QuotaPolicy::new()
        .set_capacity(10.0)
        .set_refill_rate(RefillRate::per_sec(1));
    
    let limiter = Arc::new(QuotaPool::new(policy, 10));

    let app = Router::new()
        .route("/{key}", get(limit))
        .with_state(limiter);

    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    axum::serve(tokio::net::TcpListener::bind(addr).await?, app).await?;
    Ok(())
}

async fn limit(State(limiter): State<Limiter>, Path(key): Path<String>) -> StatusCode {
    match limiter.consume(key.as_str(), 1) {
        Ok(_) => StatusCode::OK,
        Err(_) => StatusCode::TOO_MANY_REQUESTS,
    }
}
```

### API


The crate provides three core primitives: a standalone `Quota`, a `QuotaPolicy`, and a GCRA-based `QuotaPool`.
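
For readers unfamiliar with GCRA, the following is a minimal single-key sketch of the textbook algorithm using only the standard library. It is not `quota`'s implementation: the `Gcra` type, its fields, and the caller-supplied nanosecond clock are all assumptions made for illustration.

```rust
/// Textbook GCRA state for one key (hypothetical type, not part of `quota`).
/// All times are nanoseconds on a clock supplied by the caller.
#[derive(Debug)]
struct Gcra {
    emission_interval: u64, // T: spacing between requests at the sustained rate
    tolerance: u64,         // tau: how far ahead of schedule a burst may run
    tat: u64,               // theoretical arrival time of the next request
}

impl Gcra {
    fn new(emission_interval: u64, burst: u64) -> Self {
        // Allowing `burst` back-to-back requests needs tau = (burst - 1) * T.
        let tolerance = emission_interval * burst.saturating_sub(1);
        Self { emission_interval, tolerance, tat: 0 }
    }

    /// Ok(()) if a request arriving at `now` conforms, Err(wait_ns) otherwise.
    fn check(&mut self, now: u64) -> Result<(), u64> {
        let tat = self.tat.max(now);
        let allow_at = tat.saturating_sub(self.tolerance);
        if now < allow_at {
            return Err(allow_at - now); // denied: state is left untouched
        }
        self.tat = tat + self.emission_interval;
        Ok(())
    }
}

fn main() {
    // 10 requests per second sustained (T = 100 ms), bursts of up to 3.
    let mut limiter = Gcra::new(100_000_000, 3);
    let allowed = (0..5).filter(|_| limiter.check(0).is_ok()).count();
    println!("allowed {allowed} of 5 simultaneous requests");
}
```

The appeal of this scheme is that each key needs only one timestamp of state, which is one reason it suits keyed pools with many entries.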

`QuotaPool` defaults to `QuotaKey` keys, an owned, heap-allocated `String`.
If your quota identity is already a compact ID, interned symbol, or another key shape,
use `QuotaPool<K>` and construct it with `QuotaPool::<K>::with_key_type(...)`.
Use `QuotaPool::with_capacity(...)` and `pool.insert_keys(...)` when the key set is known ahead of traffic; that keeps the hot request path on borrowed-key lookup instead of insertion.
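
The difference between borrowed-key lookup and insertion can be illustrated with a plain `HashMap`. This is a sketch of the general idea only: the `Pool` type below is hypothetical, it says nothing about how `QuotaPool` stores keys internally, and unlike a real pool it simply rejects unknown keys to keep the example short.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a keyed pool: owned `String` keys mapped to remaining tokens.
struct Pool {
    tokens: HashMap<String, u32>,
}

impl Pool {
    /// Registers keys before traffic starts; this is the only place that allocates `String`s.
    fn with_keys(keys: &[&str], initial: u32) -> Self {
        let mut tokens = HashMap::with_capacity(keys.len());
        for key in keys {
            tokens.insert((*key).to_owned(), initial);
        }
        Self { tokens }
    }

    /// Hot path: looks the key up by `&str`, so no owned key is built per request.
    /// Simplification: unknown keys are rejected instead of being inserted.
    fn consume(&mut self, key: &str) -> bool {
        match self.tokens.get_mut(key) {
            Some(left) if *left > 0 => {
                *left -= 1;
                true
            }
            _ => false,
        }
    }
}

fn main() {
    let mut pool = Pool::with_keys(&["alice", "bob"], 2);
    println!("alice: {}", pool.consume("alice"));
    println!("carol: {}", pool.consume("carol"));
}
```

`HashMap<String, _>` accepts a `&str` in `get_mut` because `String` implements `Borrow<str>`, so the request path never has to allocate just to perform a lookup.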

Example use of the standalone `Quota` (a single 8-byte counter in memory):
```rust
use quota::Quota;

fn main() {
    let quota = Quota::with_initial_tokens(10);

    let mut results = vec![];
    for _ in 0..100 {
        results.push(quota.consume(1)); // 10..9..8..7..6..5..4..3..2..1..Err
    }

    assert_eq!(results.iter().filter(|r| r.is_ok()).count(), 10); // 10 Ok: 10..=1
    assert_eq!(results.iter().filter(|r| r.is_err()).count(), 90); // 90 Err: Rate-limited
}
```

Example of applying a `QuotaPolicy` with a maximum capacity and a `RefillRate`:
```rust
use quota::{Quota, QuotaPolicy, RefillRate};

fn main() {
    let policy = QuotaPolicy::new()
        .set_capacity(10.0) // Maximum number of tokens a Quota may hold after a tick
        .set_refill_rate(RefillRate::per_micro(100.0)); // 100 tokens per microsecond = 0.1 tokens/ns

    // `mut` is needed because `tick` below borrows the quota mutably.
    let mut quota = Quota::with_initial_tokens(10);

    let mut results = vec![];
    for _ in 0..100 {
        policy.tick(1, &mut quota); // dt = 1 ns => 1 ns * 0.1 tokens/ns = 0.1 tokens per tick() call
        results.push(quota.consume(1));
    }

    // The first tick adds nothing because the quota is already at capacity,
    // so 10 + 99 * 0.1 = 19.9 tokens become available over the loop.
    assert_eq!(results.iter().filter(|r| r.is_ok()).count(), 19);
    assert_eq!(results.iter().filter(|r| r.is_err()).count(), 81);
}
```
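The count of 19 in the example above can be reproduced without the crate. The sketch below assumes that a tick adds `dt * rate` tokens and clamps the balance at the capacity; the `Bucket` type is a hypothetical stand-in, not the crate's `Quota`.

```rust
/// Hypothetical token balance with a capacity-clamped refill (not the crate's `Quota`).
struct Bucket {
    tokens: f64,
    capacity: f64,
    rate_per_ns: f64,
}

impl Bucket {
    /// Adds `dt_ns * rate_per_ns` tokens, never exceeding `capacity`.
    fn tick(&mut self, dt_ns: u64) {
        self.tokens = (self.tokens + dt_ns as f64 * self.rate_per_ns).min(self.capacity);
    }

    /// Takes `n` tokens if the balance covers them.
    fn consume(&mut self, n: f64) -> bool {
        if self.tokens >= n {
            self.tokens -= n;
            true
        } else {
            false
        }
    }
}

/// Runs the same tick-then-consume loop as the README example and counts successes.
fn count_allowed(iterations: usize) -> usize {
    let mut bucket = Bucket { tokens: 10.0, capacity: 10.0, rate_per_ns: 0.1 };
    (0..iterations)
        .filter(|_| {
            bucket.tick(1); // 1 ns elapsed => 0.1 tokens
            bucket.consume(1.0)
        })
        .count()
}

fn main() {
    println!("allowed: {}", count_allowed(100));
}
```

Under these assumptions the very first tick is wasted because the bucket is already full, which leaves 10 + 99 × 0.1 = 19.9 tokens in total and therefore 19 successful calls.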

And now the main `QuotaPool`:
```rust
use quota::{RefillRate, QuotaPolicy, QuotaPool};
use std::sync::Arc;
use std::time::Duration;

fn main() {
    let policy = QuotaPolicy::new()
        .set_capacity(10.0)
        .set_refill_rate(RefillRate::per_sec(3))
        .set_refill_interval(Duration::from_secs(1)); // Refills are applied at most once per interval

    // QuotaPool reads the system clock and ticks each quota with the time elapsed since its previous tick.
    // `QuotaPolicy::set_refill_interval` skips a tick while less than `refill_interval` has elapsed since the last one.
    let pool = Arc::new(QuotaPool::with_capacity(policy, 10, 1));

    let mut results = vec![];
    for _ in 0..100 {
        results.push(pool.consume("testing", 1));
    }
}

```
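To make the effect of the refill interval concrete, here is a standard-library sketch with an explicit clock. It assumes that a skipped tick leaves the balance untouched and that the next successful tick credits all elapsed time; the `IntervalBucket` type and its fields are hypothetical and do not describe `QuotaPool`'s internals.

```rust
/// Hypothetical interval-gated bucket; times are nanoseconds on a caller-supplied clock.
struct IntervalBucket {
    tokens: f64,
    capacity: f64,
    rate_per_sec: f64,
    interval_ns: u64,
    last_tick_ns: u64,
}

impl IntervalBucket {
    /// Refills only once a full interval has elapsed since the last refill.
    fn tick(&mut self, now_ns: u64) {
        let elapsed_ns = now_ns.saturating_sub(self.last_tick_ns);
        if elapsed_ns < self.interval_ns {
            return; // too early: the balance stays unchanged
        }
        let refill = elapsed_ns as f64 / 1e9 * self.rate_per_sec;
        self.tokens = (self.tokens + refill).min(self.capacity);
        self.last_tick_ns = now_ns;
    }

    /// Ticks at `now_ns`, then takes `n` tokens if the balance covers them.
    fn consume(&mut self, now_ns: u64, n: f64) -> bool {
        self.tick(now_ns);
        if self.tokens >= n {
            self.tokens -= n;
            true
        } else {
            false
        }
    }
}

fn main() {
    // Same policy as above: capacity 10, 3 tokens per second, 1 s refill interval.
    let mut bucket = IntervalBucket {
        tokens: 10.0,
        capacity: 10.0,
        rate_per_sec: 3.0,
        interval_ns: 1_000_000_000,
        last_tick_ns: 0,
    };
    let burst = (0..100).filter(|_| bucket.consume(0, 1.0)).count();
    let early = bucket.consume(500_000_000, 1.0);
    let later = (0..100).filter(|_| bucket.consume(1_000_000_000, 1.0)).count();
    println!("burst: {burst}, at 0.5 s: {early}, at 1 s: {later}");
}
```

In this model the initial burst drains the bucket, a request at 0.5 s is still denied because no refill has happened yet, and the three tokens for the first second arrive together once the interval has passed.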