
Crate grate_limiter


§grate-limiter

Anticipatory rate-limit orchestration engine for multi-provider systems.

Predict limits before providers enforce them.

grate-limiter is not a retry library, proxy, or gateway. It is a predictive provider orchestration engine that continuously learns each provider's health, quota usage, and reliability, and routes traffic across providers so that rate-limit errors are prevented before they happen.

§Quick Start

use grate_limiter::{GrateLimiter, EngineConfig, ProviderConfig, CapabilityConfig};
use grate_limiter::{QuotaConfig, Dimension, Window, CapabilityProvider};
use grate_limiter::{Observation, Usage, Outcome, StatusClass};

// Create the engine
let engine = GrateLimiter::new(EngineConfig::default());

// Register providers with their quotas
engine.upsert_provider(ProviderConfig {
    name: "openai".into(),
    quotas: vec![
        QuotaConfig { dimension: Dimension::Requests, limit: 5000, window: Some(Window::Minute) },
    ],
    priority: 10,
    weight: 1.0,
    cooldown_seconds: 30,
});

engine.upsert_provider(ProviderConfig {
    name: "anthropic".into(),
    quotas: vec![
        QuotaConfig { dimension: Dimension::Requests, limit: 3000, window: Some(Window::Minute) },
    ],
    priority: 8,
    weight: 1.0,
    cooldown_seconds: 30,
});

// Register a capability with its providers
engine.upsert_capability(CapabilityConfig {
    name: "chat-completion".into(),
    providers: vec![
        CapabilityProvider { provider: "openai".into(), priority: 10 },
        CapabilityProvider { provider: "anthropic".into(), priority: 8 },
    ],
});

// Select the best provider
let decision = engine.select("chat-completion").unwrap();
println!("Use: {} (score: {:.2})", decision.provider, decision.score);

// Report what happened
engine.observe(Observation {
    provider: "openai".into(),
    capability: Some("chat-completion".into()),
    usage: Usage { requests: 1, tokens: Some(1200), ..Default::default() },
    outcome: Outcome { status: StatusClass::Success, latency_ms: 830 },
}).unwrap();

§Architecture

The engine has four major subsystems:

  • Quota Tracking: Token bucket / sliding window / fixed window / concurrency strategies
  • Health Engine: EWMA-based scoring with exponential decay and automatic cooldowns
  • Scoring Engine: Weighted composite scoring with anticipatory exhaustion prediction
  • Decision Engine: Deterministic, explainable provider selection
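To make the health and scoring subsystems more concrete, here is a minimal, self-contained sketch of an EWMA health score, a linear exhaustion estimate, and a weighted composite score. It does not use the grate_limiter API and is not the crate's implementation: the names EwmaHealth, seconds_to_exhaustion, and composite_score, the 0.6/0.4 weights, and the 0.5 penalty are all assumptions chosen for illustration.

```rust
/// Hypothetical EWMA health tracker: success counts as 1.0, failure as 0.0.
#[derive(Debug, Clone, Copy)]
pub struct EwmaHealth {
    alpha: f64, // smoothing factor in (0, 1]; higher values react faster
    score: f64, // current health estimate in [0, 1]
}

impl EwmaHealth {
    pub fn new(alpha: f64) -> Self {
        // Assumption: a provider starts out fully healthy.
        Self { alpha, score: 1.0 }
    }

    /// Fold one observation into the moving average.
    pub fn record(&mut self, success: bool) {
        let sample = if success { 1.0 } else { 0.0 };
        self.score = self.alpha * sample + (1.0 - self.alpha) * self.score;
    }

    pub fn score(&self) -> f64 {
        self.score
    }
}

/// Linear extrapolation: seconds until the quota is exhausted at the current
/// consumption rate. Returns None when nothing has been used yet or when the
/// window would reset before the quota runs out.
pub fn seconds_to_exhaustion(used: u64, limit: u64, elapsed_s: f64, window_s: f64) -> Option<f64> {
    if used == 0 || elapsed_s <= 0.0 {
        return None;
    }
    let rate = used as f64 / elapsed_s; // units consumed per second
    let remaining = limit.saturating_sub(used) as f64;
    let eta = remaining / rate;
    if eta < window_s - elapsed_s {
        Some(eta)
    } else {
        None
    }
}

/// Weighted composite of health and quota headroom, halved when exhaustion is
/// predicted inside the current window. Weights and penalty are arbitrary.
pub fn composite_score(health: f64, used: u64, limit: u64, exhaustion_eta_s: Option<f64>) -> f64 {
    let headroom = if limit == 0 {
        0.0
    } else {
        1.0 - used.min(limit) as f64 / limit as f64
    };
    let penalty = if exhaustion_eta_s.is_some() { 0.5 } else { 1.0 };
    (0.6 * health + 0.4 * headroom) * penalty
}
```

Under these assumptions, a provider that has used 3000 of 5000 requests in the first 20 seconds of a 60-second window is predicted to run out before the window resets, so its score is halved and a slower-burning provider would be preferred even though no rate-limit error has occurred yet.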

§Deterministic Testing

Use MockClock for fully deterministic, reproducible tests:

use grate_limiter::{GrateLimiter, EngineConfig, MockClock};
use std::sync::Arc;

let clock = Arc::new(MockClock::new());
let config = EngineConfig::default().with_clock(clock.clone());
let engine = GrateLimiter::new(config);

// Advance time deterministically
clock.advance_ms(1000);
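The sketch below illustrates why an injectable clock makes time-dependent quota logic reproducible. It is a standalone example, not the crate's MockClock or Clock trait: TimeSource, ManualClock, and FixedWindow are hypothetical names, and the fixed-window counter is a simplified stand-in for the engine's quota tracking.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical clock abstraction: milliseconds since an arbitrary origin.
pub trait TimeSource {
    fn now_ms(&self) -> u64;
}

/// Hypothetical manual clock: time moves only when the test advances it.
pub struct ManualClock {
    now: AtomicU64,
}

impl ManualClock {
    pub fn new() -> Self {
        Self { now: AtomicU64::new(0) }
    }

    pub fn advance_ms(&self, ms: u64) {
        self.now.fetch_add(ms, Ordering::SeqCst);
    }
}

impl TimeSource for ManualClock {
    fn now_ms(&self) -> u64 {
        self.now.load(Ordering::SeqCst)
    }
}

/// Simplified fixed-window counter that reads time only through the clock.
pub struct FixedWindow<'a, C: TimeSource> {
    clock: &'a C,
    limit: u64,
    window_ms: u64,
    window_start: u64,
    used: u64,
}

impl<'a, C: TimeSource> FixedWindow<'a, C> {
    pub fn new(clock: &'a C, limit: u64, window_ms: u64) -> Self {
        assert!(window_ms > 0, "window length must be positive");
        Self { clock, limit, window_ms, window_start: clock.now_ms(), used: 0 }
    }

    /// Returns true and consumes one unit if the current window has capacity.
    pub fn try_acquire(&mut self) -> bool {
        let now = self.clock.now_ms();
        let elapsed = now - self.window_start;
        if elapsed >= self.window_ms {
            // Start a new window aligned to the most recent window boundary.
            self.window_start = now - elapsed % self.window_ms;
            self.used = 0;
        }
        if self.used < self.limit {
            self.used += 1;
            true
        } else {
            false
        }
    }
}
```

Because the counter never calls the system clock directly, a test can exhaust the quota, advance the manual clock by exactly one window, and assert that capacity is restored, with no sleeps and no timing flakiness.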

Structs§

CapabilityConfig
Configuration for a capability (e.g., “chat-completion”, “image-generation”).
CapabilityProvider
A provider registered under a capability with its priority for that capability.
Decision
The result of a provider selection decision.
EngineConfig
Top-level engine configuration.
GrateLimiter
The main grate-limiter engine.
HealthConfig
Health engine configuration.
MockClock
Mock clock for deterministic testing.
Observation
An observation reported by the caller after a provider interaction.
Outcome
Outcome of a provider interaction.
ProviderConfig
Configuration for a provider.
QuotaConfig
Configuration for a single quota dimension on a provider.
RealClock
Real monotonic clock backed by std::time::Instant.
ScoreBreakdown
Detailed breakdown of how a provider was scored.
ScoringWeights
Weights for the composite scoring algorithm.
Timestamp
Monotonic timestamp in nanoseconds since engine creation.
Usage
Resource usage for a single interaction.

Enums§

Dimension
Quota dimension — what resource is being tracked.
Error
All errors produced by grate-limiter.
StatusClass
Classified response status for health tracking.
Window
Time window for quota reset.

Traits§

Clock
Clock abstraction for monotonic time.
ScoringStrategy
Trait for pluggable scoring strategies.

Type Aliases§

Result
Result type alias for grate-limiter operations.