§grate-limiter
Anticipatory rate-limit orchestration engine for multi-provider systems.
Predict limits before providers enforce them.
grate-limiter is not a retry library, proxy, or gateway. It is a predictive provider
orchestration engine that routes traffic intelligently across providers by continuously
learning their health, quota usage, and reliability — preventing rate-limit errors
before they happen.
§Quick Start
use grate_limiter::{GrateLimiter, EngineConfig, ProviderConfig, CapabilityConfig};
use grate_limiter::{QuotaConfig, Dimension, Window, CapabilityProvider};
use grate_limiter::{Observation, Usage, Outcome, StatusClass};

// Create the engine
let engine = GrateLimiter::new(EngineConfig::default());

// Register providers with their quotas
engine.upsert_provider(ProviderConfig {
    name: "openai".into(),
    quotas: vec![
        QuotaConfig { dimension: Dimension::Requests, limit: 5000, window: Some(Window::Minute) },
    ],
    priority: 10,
    weight: 1.0,
    cooldown_seconds: 30,
});
engine.upsert_provider(ProviderConfig {
    name: "anthropic".into(),
    quotas: vec![
        QuotaConfig { dimension: Dimension::Requests, limit: 3000, window: Some(Window::Minute) },
    ],
    priority: 8,
    weight: 1.0,
    cooldown_seconds: 30,
});

// Register a capability with its providers
engine.upsert_capability(CapabilityConfig {
    name: "chat-completion".into(),
    providers: vec![
        CapabilityProvider { provider: "openai".into(), priority: 10 },
        CapabilityProvider { provider: "anthropic".into(), priority: 8 },
    ],
});

// Select the best provider
let decision = engine.select("chat-completion").unwrap();
println!("Use: {} (score: {:.2})", decision.provider, decision.score);

// Report what happened
engine.observe(Observation {
    provider: "openai".into(),
    capability: Some("chat-completion".into()),
    usage: Usage { requests: 1, tokens: Some(1200), ..Default::default() },
    outcome: Outcome { status: StatusClass::Success, latency_ms: 830 },
}).unwrap();
§Architecture
The engine has four major subsystems:
- Quota Tracking: Token bucket / sliding window / fixed window / concurrency strategies
- Health Engine: EWMA-based scoring with exponential decay and automatic cooldowns
- Scoring Engine: Weighted composite scoring with anticipatory exhaustion prediction
- Decision Engine: Deterministic, explainable provider selection
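To make the interplay of these subsystems concrete, here is a minimal, std-only sketch of how an EWMA health score and a projected-quota term could be combined into a composite score and a deterministic pick. It is an illustration under stated assumptions, not grate-limiter's actual implementation: `ProviderState`, `update_health`, `projected_headroom`, `score`, `select`, and the fixed 50/50 weights are all hypothetical names and values chosen for this example.

```rust
/// Hypothetical per-provider state for this sketch (not a grate-limiter type).
#[derive(Debug, Clone)]
struct ProviderState {
    name: &'static str,
    health: f64, // EWMA of outcomes: 1.0 = success, 0.0 = failure
    used: u64,   // requests consumed so far in the current window
    limit: u64,  // request quota for the window
}

/// Fold one outcome into the EWMA: new = alpha * sample + (1 - alpha) * old.
/// A larger alpha reacts faster to recent failures.
fn update_health(health: f64, success: bool, alpha: f64) -> f64 {
    let sample = if success { 1.0 } else { 0.0 };
    alpha * sample + (1.0 - alpha) * health
}

/// Anticipatory term: extrapolate current usage to the end of the window and
/// return the fraction of quota expected to remain, clamped to [0, 1].
/// `elapsed_frac` is the fraction of the window that has already passed.
fn projected_headroom(used: u64, limit: u64, elapsed_frac: f64) -> f64 {
    if limit == 0 {
        return 0.0;
    }
    // Assumption: floor the elapsed fraction so very early samples do not
    // produce an unbounded projection.
    let elapsed = elapsed_frac.clamp(0.01, 1.0);
    let projected = used as f64 / elapsed;
    (1.0 - projected / limit as f64).clamp(0.0, 1.0)
}

/// Weighted composite score; the equal weights are an assumption.
fn score(p: &ProviderState, elapsed_frac: f64) -> f64 {
    const W_HEALTH: f64 = 0.5;
    const W_QUOTA: f64 = 0.5;
    W_HEALTH * p.health + W_QUOTA * projected_headroom(p.used, p.limit, elapsed_frac)
}

/// Deterministic selection: highest score wins, and ties are broken by the
/// lexicographically smallest name so repeated runs agree.
fn select(providers: &[ProviderState], elapsed_frac: f64) -> Option<&ProviderState> {
    providers.iter().max_by(|a, b| {
        score(a, elapsed_frac)
            .total_cmp(&score(b, elapsed_frac))
            .then_with(|| b.name.cmp(a.name))
    })
}
```

In this sketch, a provider that has used 4500 of 5000 requests halfway through its window projects to overshoot its limit, so its quota term drops to zero and an equally healthy provider with spare headroom is chosen first. That is the sense in which selection can steer away from a limit before the provider enforces it.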
§Deterministic Testing
Use MockClock for fully deterministic, reproducible tests:
use grate_limiter::{GrateLimiter, EngineConfig, MockClock};
use std::sync::Arc;
let clock = Arc::new(MockClock::new());
let config = EngineConfig::default().with_clock(clock.clone());
let engine = GrateLimiter::new(config);
// Advance time deterministically
clock.advance_ms(1000);
Structs§
- CapabilityConfig: Configuration for a capability (e.g., “chat-completion”, “image-generation”).
- CapabilityProvider: A provider registered under a capability with its priority for that capability.
- Decision: The result of a provider selection decision.
- EngineConfig: Top-level engine configuration.
- GrateLimiter: The main grate-limiter engine.
- HealthConfig: Health engine configuration.
- MockClock: Mock clock for deterministic testing.
- Observation: An observation reported by the caller after a provider interaction.
- Outcome: Outcome of a provider interaction.
- ProviderConfig: Configuration for a provider.
- QuotaConfig: Configuration for a single quota dimension on a provider.
- RealClock: Real monotonic clock backed by std::time::Instant.
- ScoreBreakdown: Detailed breakdown of how a provider was scored.
- ScoringWeights: Weights for the composite scoring algorithm.
- Timestamp: Monotonic timestamp in nanoseconds since engine creation.
- Usage: Resource usage for a single interaction.
Enums§
- Dimension: Quota dimension, i.e. which resource is being tracked.
- Error: All errors produced by grate-limiter.
- StatusClass: Classified response status for health tracking.
- Window: Time window for quota reset.
Traits§
- Clock: Clock abstraction for monotonic time.
- ScoringStrategy: Trait for pluggable scoring strategies.
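As a rough illustration of why a clock abstraction makes time-dependent logic testable, the sketch below defines a hypothetical clock trait, a manually advanced implementation, and a fixed-window counter that only sees time through the trait. The names `SketchClock`, `ManualClock`, and `FixedWindow` are assumptions made for this example; the real `Clock` trait and `MockClock` may have different methods and signatures.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical clock abstraction: monotonic nanoseconds since some origin.
trait SketchClock: Send + Sync {
    fn now_nanos(&self) -> u64;
}

/// Manually driven clock: time moves only when a test advances it.
struct ManualClock {
    nanos: AtomicU64,
}

impl ManualClock {
    fn new() -> Self {
        Self { nanos: AtomicU64::new(0) }
    }

    /// Move the clock forward by `ms` milliseconds.
    fn advance_ms(&self, ms: u64) {
        self.nanos.fetch_add(ms * 1_000_000, Ordering::SeqCst);
    }
}

impl SketchClock for ManualClock {
    fn now_nanos(&self) -> u64 {
        self.nanos.load(Ordering::SeqCst)
    }
}

/// Fixed-window request counter that depends only on the trait, so tests can
/// swap in the manual clock and production code a real one.
struct FixedWindow {
    clock: Arc<dyn SketchClock>,
    window_nanos: u64,
    limit: u64,
    window_start: u64,
    used: u64,
}

impl FixedWindow {
    fn new(clock: Arc<dyn SketchClock>, window_ms: u64, limit: u64) -> Self {
        let window_start = clock.now_nanos();
        Self { clock, window_nanos: window_ms * 1_000_000, limit, window_start, used: 0 }
    }

    /// Returns true if a request fits in the current window.
    fn try_acquire(&mut self) -> bool {
        let now = self.clock.now_nanos();
        // Start a fresh window once the previous one has fully elapsed.
        if now - self.window_start >= self.window_nanos {
            self.window_start = now;
            self.used = 0;
        }
        if self.used < self.limit {
            self.used += 1;
            true
        } else {
            false
        }
    }
}
```

Because the counter never reads the system time directly, a test can exhaust the window, advance the manual clock by exactly one window length, and assert that capacity has returned, with no sleeps and no flakiness.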
Type Aliases§
- Result: Result type alias for grate-limiter operations.