rate-net 1.0.0

Stable (v1.0.0). The public API is frozen until 2.0. Five algorithms (token bucket by default; leaky bucket, fixed window, sliding-window log, and sliding-window counter under the algorithms feature) behind one Limiter trait and the Tier-2 builder; an optional await-until-ready async layer; runnable examples; a benchmark suite with an honest head-to-head vs governor. The concurrent core is a tunable sharded store where unrelated keys never contend, memory bounded by eviction, and an allocation-free steady-state check. The safety invariants are proved by proptest (per algorithm), loom, multi-threaded stress across every algorithm, an allocation audit, an adversarial-traffic suite, and a representative gatekeeper consumer; Send + Sync + 'static is locked in at the type level.

Rate limiting looks trivial until production traffic finds the edges — the over-admit race at a window boundary, the unbounded key map that OOMs under a unique-key flood, the API so generic you write a wrapper just to limit a loop. rate-net is built against all of them:

Lock-free per key. Each key's bucket delegates to better-bucket's atomic CAS core. No lock on the check path.
Sharded state. Per-key state lives in a sharded concurrent map; unrelated keys never serialize on each other. Throughput scales with cores.
Bounded memory. Idle keys are evicted (LRU/TTL); a hostile unique-key flood hits a cap, it does not grow unbounded.
Never over-admits. For any key and window, admitted requests never exceed the configured quota — under any interleaving. Proven per algorithm with loom and proptest.
Multiple algorithms, one trait. Token bucket, leaky bucket, fixed window, sliding-window log, sliding-window counter — all behind a single Limiter trait.
One-line API. RateLimiter::per_second(n) then .check(key). The simple path is also the fast path.
Honest retry-after. Denials carry how long until the caller may retry.

Features

Multi-algorithm — token bucket, leaky bucket, fixed window, sliding-window log, sliding-window counter, all behind one Limiter trait
Per-key limiting — limit per-IP, per-user, per-endpoint, per-anything; sharded so keys don't contend
Lock-free hot path — per-key buckets delegate to better-bucket's atomic core; zero allocation on the steady-state check
Bounded memory — LRU/TTL eviction caps the key space; survives a hostile unique-key flood
Retry-after — every denial reports when the caller may try again
Deterministic tests — inject a mockable clock (via clock-lib); advance windows without sleep
Tier-1 API — RateLimiter::per_second(n) + .check(key); a builder for control; a trait for the 1%
No over-admit guarantee — verified per algorithm with loom and proptest
Sync core — no async runtime dependency; optional additive async layer

Installation

Add to your Cargo.toml:

[dependencies]
rate-net = "1"

# Full algorithm suite + optional async layer:
rate-net = { version = "1", features = ["algorithms", "async"] }

Quick Start

use rate_net::{RateLimiter, Decision};

// 100 requests per second, per key.
let limiter = RateLimiter::per_second(100);

match limiter.check("user:42") {
    Decision::Allow => {
        // allowed — serve the request
    }
    Decision::Deny { retry_after } => {
        // denied — return 429 with Retry-After: {retry_after}
    }
}

One constructor, one method. The key can be anything hashable — an IP, a user id, an API token, an endpoint name.

Configured Limiter (Tier 2)

The builder selects the algorithm, quota, burst, shard count, and eviction policy in one fluent surface (algorithms other than the token bucket need the algorithms feature):

use rate_net::{RateLimiter, Algorithm, Eviction};
use std::time::Duration;

let limiter = RateLimiter::builder()
    .algorithm(Algorithm::SlidingWindowCounter)
    .quota(1000, Duration::from_secs(60)) // 1000 / minute
    .burst(50)                            // allow short bursts
    .shards(64)                           // tune for core count
    .eviction(Eviction::idle(Duration::from_secs(300)))
    .build();

// Take more than one unit at a time.
let decision = limiter.check_n("tenant:acme", 5);

Prefer not to use the builder? The same knobs are chainable adjusters on a constructed limiter — RateLimiter::with_quota(quota).with_shards(64) .with_eviction(policy).with_algorithm(algo) — each applied immediately after construction.

Algorithms

All algorithms share the same Limiter trait and check surface — swap the strategy through the builder without touching call sites. The token bucket is always available; the rest are behind the algorithms feature.

Algorithm	Shape	Good for
Token bucket	Smooth refill, allows bursts up to capacity	General-purpose; the default
Leaky bucket	Spaces units at a steady interval (GCRA)	Shaping bursty traffic to a steady rate
Fixed window	Counter resets each window	Cheapest; tolerates boundary bursts
Sliding-window log	Exact timestamps in a window	Highest accuracy; higher memory
Sliding-window counter	Weighted blend of two windows	Accuracy/cost balance; common production choice

The token bucket strategy is provided by better-bucket — rate-net does not reimplement it.

Deterministic Testing (mockable clock)

use rate_net::RateLimiter;
use clock_lib::ManualClock;
use std::sync::Arc;
use std::time::Duration;

// Share the clock with the limiter via `Arc`, then drive it by hand.
let clock = Arc::new(ManualClock::new());
let limiter = RateLimiter::per_second(5).with_clock(Arc::clone(&clock));

// Drain the key's quota.
for _ in 0..5 {
    assert!(limiter.check("k").is_allow());
}
assert!(limiter.check("k").is_deny());

// Advance one second — no real sleep — and the allowance is back.
clock.advance(Duration::from_secs(1));
assert!(limiter.check("k").is_allow());

Awaiting (async)

The core is sync and runtime-free — check never blocks. When you'd rather wait for a key to become allowed than shed the request, the optional async feature wraps a limiter in an AsyncLimiter whose until_ready awaits until the key is admitted (sleeping for the reported retry_after):

use rate_net::{AsyncLimiter, RateLimiter};

# async fn demo() {
let limiter = AsyncLimiter::new(RateLimiter::per_second(100));

// Non-blocking, allow/deny — same as the sync API.
let _ = limiter.check("user:42");

// Or await until the key is within its limit.
limiter.until_ready("user:42").await;
# }

Only this layer touches a runtime (tokio's timer), and only under the async feature; the core never depends on one.

Examples

Runnable, self-contained demos in examples/:

cargo run --example per_second    # Tier-1: N requests/second per key
cargo run --example per_key       # per-IP and per-user, independent allowances
cargo run --example mock_clock    # deterministic refill with a ManualClock
cargo run --example retry_after   # mapping a denial to HTTP 429 + Retry-After

cargo run --example algorithms --features algorithms  # compare all five algorithms
cargo run --example async_wait   --features async      # await until allowed

Feature Flags

Feature	Default	Description
`std`	✅	Standard library. Required for the sharded store and eviction.
`algorithms`	❌	The full suite beyond the default token bucket (leaky, fixed window, sliding-window log, sliding-window counter).
`async`	❌	`AsyncLimiter` with `until_ready` (await until a key is allowed). Additive — only this layer touches a runtime; the core does not.

# Everything:
rate-net = { version = "1", features = ["algorithms", "async"] }

Testing

# Unit + integration + property tests
cargo test --all-features

# Concurrency model checking (no over-admit under interleaving)
RUSTFLAGS="--cfg loom" cargo test --test loom_check

# Benchmarks
cargo bench --bench rate_bench

# Format + lints (must be clean)
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings

Performance

Criterion medians at the v1.0.0 tag (Windows x86_64, Rust stable, opt-level = 3):

Path	Median
Single-key check	~54 ns
Many-key check (64 shards)	~54 ns
Contended single key (4 threads)	~54 ns/op
Eviction sweep (cold insert at cap)	~217 ns

cargo bench --bench rate_bench

A head-to-head against governor is recorded honestly: governor is currently faster (~17 ns vs ~44 ns single-key), almost entirely because it reads a TSC-based quanta clock (~5 ns) while rate-net reads Instant::now() (~20 ns on Windows) through better-bucket / clock-lib. rate-net's own per-key overhead is competitive; closing the gap needs a faster monotonic clock in clock-lib. Full method, numbers, and analysis in docs/BENCHMARKS.md.

Design

Sharded, lock-free, bounded

Per-key state lives in a sharded concurrent map. A check is: hash the key → locate the shard → run the key's lock-free bucket. Unrelated keys land in different shards and never contend, so throughput scales with core count. The steady-state check allocates nothing; allocation happens only the first time a key is seen.

Memory is bounded by eviction. Idle keys are swept (LRU/TTL) on an amortized, incremental schedule that never stops the world on the hot path. A flood of unique keys reaches the eviction cap and stays there — it cannot grow without limit or corrupt live keys. This is a deliberate defense: an unbounded key map is a self-inflicted denial-of-service.

The no-over-admit invariant

The defining correctness property: for any key and window, the number of admitted requests never exceeds the configured quota — under any concurrent interleaving. Verified per algorithm:

loom explores the shard + per-key acquire interleavings and asserts no key is ever over-admitted.
proptest throws arbitrary interleavings of checks and time advances at each algorithm and asserts the per-window quota holds.

Clean boundary

rate-net exposes a decision API — Allow / Deny { retry_after } — and nothing else of its internals. Consumers (e.g. a gatekeeper like bouncer-io) call check and never reach into rate-net's per-key state. Library reuse, not state sharing.

The `-net` family

rate-net is the anchor of the -net domain group — independent, foreign-compatible crates that interoperate within a shared, hyper-optimized pipeline but never force each other as dependencies. rate-net consumes better-bucket for token-bucket math and clock-lib for time, and is itself consumed by gatekeepers via the clean check API. It works standalone, with no obligation to adopt the rest of the family.

Cross-Platform Support

Tier 1 Support:

✅ Linux (x86_64, aarch64)
✅ macOS (x86_64, Apple Silicon)
✅ Windows (x86_64)

Atomic and sharding behavior is identical across all three; the CI matrix runs every target on stable and MSRV.

Contributing

Contributions are welcome. Before opening a PR, make sure cargo fmt, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features are all clean. Any new algorithm must arrive with its own proptest over-admit proof and retry-after tests; any change to the shard or eviction path needs a loom test and a benchmark.