trypema 0.1.0-dev.5

High-performance rate limiting primitives in Rust, designed for concurrency safety, low overhead, and predictable latency.
Documentation

Trypema Rate Limiter

Status: in development (pre-release).

Name and Biblical Inspiration

The name is inspired by the Koine Greek word "τρυπήματος" (trypematos, "hole/opening") from the phrase "διὰ τρυπήματος ῥαφίδος" ("through the eye of a needle") in the Bible: Matthew 19:24, Mark 10:25, Luke 18:25

Overview

Trypema provides rate limiting primitives designed for multi-threaded, in-process use with low overhead and predictable latency characteristics.

What you get today:

  • A RateLimiter facade that exposes a local provider.
  • A deterministic sliding-window strategy (absolute) and a suppression-capable strategy (suppressed).

What this crate is not (currently):

  • A distributed/shared rate limiter (Redis, memcached, etc.).
  • A strict/linearizable admission controller under high concurrency.

Status

  • local provider: implemented
  • additional providers: planned/experimental

Quick Start

use trypema::{
    HardLimitFactor, LocalRateLimiterOptions, RateGroupSizeMs, RateLimit, RateLimitDecision,
    RateLimiter, RateLimiterOptions, WindowSizeSeconds,
};

let rl = RateLimiter::new(RateLimiterOptions {
    local: LocalRateLimiterOptions {
        window_size_seconds: WindowSizeSeconds::try_from(60).unwrap(),
        rate_group_size_ms: RateGroupSizeMs::try_from(10).unwrap(),
        hard_limit_factor: HardLimitFactor::default(),
    },
});

let key = "user:123";
let rate_limit = RateLimit::try_from(5.0).unwrap();

// check + record work (count is usually 1)
match rl.local().absolute().inc(key, &rate_limit, 1) {
    RateLimitDecision::Allowed => {
        // proceed
    }
    RateLimitDecision::Rejected {
        window_size_seconds,
        retry_after_ms,
        remaining_after_waiting,
    } => {
        let _ = (window_size_seconds, retry_after_ms, remaining_after_waiting);
        // reject / retry later
    }
    RateLimitDecision::Suppressed { is_allowed, .. } => {
        // only returned by strategies that support suppression
        if is_allowed {
            // proceed
        } else {
            // suppressed / retry later
        }
    }
}

Core Concepts

  • Keyed limiting: each key has independent state.
  • RateLimit: per-second limit for a key (positive f64, so non-integer limits are allowed).
  • Sliding window: admission is based on the last window_size_seconds of history.
  • Bucket coalescing: increments close together can be merged into time buckets to reduce overhead.

Configuration

LocalRateLimiterOptions:

  • window_size_seconds: sliding window length used for admission.
  • rate_group_size_ms: coalescing interval for increments close in time.
  • hard_limit_factor: used by the suppressed strategy as a hard cutoff multiplier.

Decisions

All strategies return RateLimitDecision:

  • Allowed: proceed; the increment was applied.
  • Rejected { window_size_seconds, retry_after_ms, remaining_after_waiting }: do not proceed; includes best-effort backoff hints.
  • Suppressed { suppression_factor, is_allowed }: returned by suppression-based strategies; treat is_allowed as the admission decision.

Notes on metadata:

  • retry_after_ms is computed from the oldest in-window bucket, so it is best-effort (especially with coalescing and concurrency).
  • remaining_after_waiting is also best-effort; if usage is heavily coalesced into one bucket it can be 0.

Local Strategies

Absolute (rl.local().absolute())

Deterministic sliding-window limiter with per-key state stored in-process.

Behavior:

  • Window capacity is approximately W * R (window seconds W times per-second limit R).
  • Per-key limit is sticky: the first call for a key stores the RateLimit; later calls for that key do not update it.

Good for:

  • simple per-key rate caps
  • low overhead checks in a single process

Suppressed (rl.local().suppressed())

Strategy that can probabilistically deny work while tracking both:

  • observed usage (all calls)
  • accepted usage (only admitted calls)

This strategy can return RateLimitDecision::Suppressed to expose suppression metadata. It also enforces a hard cutoff:

  • hard cutoff: rate_limit * hard_limit_factor
  • hitting the hard cutoff returns Rejected (a hard rejection, not suppressible)

Suppression activation:

  • Suppression is only considered once accepted usage meets/exceeds the base window capacity (window_size_seconds * rate_limit).
  • Below that capacity, suppression is bypassed (calls return Allowed, subject to the hard cutoff).

Inspiration:

Semantics (Important)

  • Best-effort under concurrency: inc does an admission check and then applies the increment. Under high contention, several threads can observe Allowed and increment concurrently, so temporary overshoot is possible.
  • Eviction granularity: eviction uses Instant::elapsed().as_secs() (whole-second truncation). This is conservative; e.g. a 1s window can effectively require ~2s before a bucket is considered expired.
  • Key cardinality: keys are not automatically removed from the internal map; unbounded/attacker-controlled keys can grow memory usage.

Practical Tuning

  • window_size_seconds: larger windows smooth bursts but increase the amount of history affecting admission/unblocking.
  • rate_group_size_ms: larger values reduce overhead by coalescing increments into fewer buckets, but make rejection metadata coarser.

Crate Layout

  • src/rate_limiter.rs: RateLimiter facade and options
  • src/local/absolute_local_rate_limiter.rs: absolute local implementation
  • src/local/suppressed_local_rate_limiter.rs: suppression-capable local implementation
  • src/common.rs: shared types (RateLimitDecision, newtypes, internal series)

Roadmap

Planned directions (subject to change):

  • additional providers (shared/distributed state)
  • additional strategies and tighter semantics where needed