
Module rate_limiting


Rate limiting for API call throttling and concurrency control.

This module provides RateLimiter, which controls both how many API calls run concurrently and how frequently they are issued, helping to avoid provider-side throttling and to respect service rate limits.

§Main Types

  • RateLimiter: Dual-level rate limiter with semaphore-based concurrency control and time-based request throttling

§Features

  • Separate rate limiting for LLM and embedding API calls
  • Semaphore-based concurrency control (max N simultaneous calls)
  • Time-based rate limiting (max N calls per second)
  • Automatic waiting when limits are reached
  • RAII-style permit handling with automatic release
  • Health checking for congestion detection
  • Per-second rate window with automatic reset
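The RAII-style permit handling listed above can be sketched with std primitives alone. This is a minimal illustration, not the crate's actual implementation: the `Permits`, `Permit`, and `acquire` names are hypothetical, and a blocking `Condvar` stands in for the async semaphore. The key idea is that returning a permit happens in `Drop`, so release is automatic even on early return or panic.

```rust
use std::sync::{Arc, Condvar, Mutex};

// Hypothetical permit pool (illustrative names, not the crate's API).
struct Permits {
    count: Mutex<usize>,
    cv: Condvar,
}

// RAII guard: holding a Permit means holding one unit of capacity.
struct Permit {
    pool: Arc<Permits>,
}

impl Drop for Permit {
    // Dropping the guard returns the permit and wakes one waiter.
    fn drop(&mut self) {
        *self.pool.count.lock().unwrap() += 1;
        self.pool.cv.notify_one();
    }
}

// Block until a permit is free, then take it.
fn acquire(pool: &Arc<Permits>) -> Permit {
    let mut n = pool.count.lock().unwrap();
    while *n == 0 {
        n = pool.cv.wait(n).unwrap(); // re-check after wakeup
    }
    *n -= 1;
    Permit { pool: Arc::clone(pool) }
}

fn main() {
    let pool = Arc::new(Permits { count: Mutex::new(1), cv: Condvar::new() });
    {
        let _permit = acquire(&pool);
        assert_eq!(*pool.count.lock().unwrap(), 0); // capacity in use
    } // _permit dropped here: released automatically
    assert_eq!(*pool.count.lock().unwrap(), 1);
}
```

Because release lives in `Drop`, callers never need to pair an explicit `release()` with every `acquire()`, which is the same guarantee the real permit type provides when it goes out of scope.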

§Rate Limiting Strategy

The rate limiter implements a two-tier approach:

  1. Concurrency Control: Uses semaphores to limit how many API calls can run simultaneously. This prevents overwhelming the system with too many parallel requests.

  2. Time-Based Rate Limiting: Tracks requests per second and automatically waits when the limit is reached. The counter resets every second.
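The two tiers above can be sketched as a single blocking type. This is an illustrative std-only model, not the crate's RateLimiter: `SimpleRateLimiter`, `acquire`, and `release` are assumed names, a `Mutex`/`Condvar` pair plays the role of the semaphore, and threads sleep instead of awaiting. It shows both tiers in order: wait for a concurrency permit, then wait for the per-second window to reset if the rate budget is spent.

```rust
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

// Illustrative two-tier limiter (hypothetical names, blocking instead of async).
struct SimpleRateLimiter {
    // Tier 1: counting semaphore for concurrency control.
    permits: Mutex<usize>,
    cv: Condvar,
    // Tier 2: per-second window (window start, calls made in this window).
    window: Mutex<(Instant, u32)>,
    max_per_second: u32,
}

impl SimpleRateLimiter {
    fn new(max_concurrent: usize, max_per_second: u32) -> Self {
        Self {
            permits: Mutex::new(max_concurrent),
            cv: Condvar::new(),
            window: Mutex::new((Instant::now(), 0)),
            max_per_second,
        }
    }

    /// Block until both a concurrency permit and rate budget are available.
    fn acquire(&self) {
        // Tier 1: wait for a free permit.
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.cv.wait(permits).unwrap();
        }
        *permits -= 1;
        drop(permits);

        // Tier 2: the counter resets every second; if the budget is spent,
        // sleep out the remainder of the current window.
        let mut w = self.window.lock().unwrap();
        if w.0.elapsed() >= Duration::from_secs(1) {
            *w = (Instant::now(), 0);
        }
        if w.1 >= self.max_per_second {
            let wait = Duration::from_secs(1).saturating_sub(w.0.elapsed());
            drop(w);
            std::thread::sleep(wait);
            w = self.window.lock().unwrap();
            *w = (Instant::now(), 0);
        }
        w.1 += 1;
    }

    /// Return a permit (the real API does this via RAII drop).
    fn release(&self) {
        let mut permits = self.permits.lock().unwrap();
        *permits += 1;
        self.cv.notify_one();
    }
}

fn main() {
    let start = Instant::now();
    let limiter = SimpleRateLimiter::new(2, 3);
    for _ in 0..4 {
        limiter.acquire();
        limiter.release();
    }
    // The fourth call exceeds the 3-per-second budget, so it waits
    // for the window to reset before proceeding.
    assert!(start.elapsed() >= Duration::from_secs(1));
}
```

The real limiter exposes the same behavior asynchronously: acquiring a permit awaits rather than blocks, so neither tier ties up an executor thread while waiting.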

§Basic Usage

use graphrag_core::async_processing::{AsyncConfig, RateLimiter};

async fn example() -> Result<(), Box<dyn std::error::Error>> {
    let config = AsyncConfig {
        max_concurrent_llm_calls: 3,
        llm_rate_limit_per_second: 2.0,
        max_concurrent_embeddings: 5,
        embedding_rate_limit_per_second: 10.0,
        ..Default::default()
    };

    let rate_limiter = RateLimiter::new(&config);

    // Acquire a permit for an LLM call (waits if limits are reached)
    let permit = rate_limiter.acquire_llm_permit().await?;
    // ... make LLM API call ...
    drop(permit); // permit is released automatically when dropped

    // Check available capacity
    let available = rate_limiter.get_available_llm_permits();
    println!("Available LLM permits: {}", available);

    // Health check for congestion detection
    let _status = rate_limiter.health_check();

    Ok(())
}

Structs§

RateLimiter
Rate limiter for controlling API call frequency and concurrency