Rate limiting for API call throttling and concurrency control.
This module provides the RateLimiter for controlling both the concurrency and
frequency of API calls to prevent throttling and respect service limits.
§Main Types
RateLimiter: Dual-level rate limiter with semaphore-based concurrency control and time-based request throttling
§Features
- Separate rate limiting for LLM and embedding API calls
- Semaphore-based concurrency control (max N simultaneous calls)
- Time-based rate limiting (max N calls per second)
- Automatic waiting when limits are reached
- RAII-style permit handling with automatic release
- Health checking for congestion detection
- Per-second rate window with automatic reset
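The combination of semaphore-based concurrency control and RAII-style permits can be illustrated with a minimal, std-only sketch. This is not the crate's implementation: `SimpleSemaphore` and `SimplePermit` are hypothetical names introduced here, and the sketch blocks the calling thread, whereas the real `RateLimiter` is awaited from async code.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical counting semaphore (illustration only, not the crate's type).
pub struct SimpleSemaphore {
    available: Mutex<usize>,
    released: Condvar,
}

/// RAII guard: holding it occupies one slot; dropping it frees the slot.
pub struct SimplePermit {
    semaphore: Arc<SimpleSemaphore>,
}

impl SimpleSemaphore {
    /// Creates a semaphore that allows `max_concurrent` simultaneous holders.
    pub fn new(max_concurrent: usize) -> Arc<Self> {
        Arc::new(Self {
            available: Mutex::new(max_concurrent),
            released: Condvar::new(),
        })
    }

    /// Blocks the calling thread until a slot is free, then takes it.
    pub fn acquire(self: &Arc<Self>) -> SimplePermit {
        let mut available = self.available.lock().unwrap();
        while *available == 0 {
            // Releases the lock while waiting and re-acquires it on wake-up.
            available = self.released.wait(available).unwrap();
        }
        *available -= 1;
        SimplePermit {
            semaphore: Arc::clone(self),
        }
    }

    /// Number of slots that are currently free.
    pub fn available_permits(&self) -> usize {
        *self.available.lock().unwrap()
    }
}

impl Drop for SimplePermit {
    fn drop(&mut self) {
        // Return the slot, then wake one waiting thread (if any).
        *self.semaphore.available.lock().unwrap() += 1;
        self.semaphore.released.notify_one();
    }
}
```

Because the slot is returned in `Drop`, a permit cannot be leaked by an early return or a `?` in the calling code, which is the main reason for handing out a guard instead of exposing separate acquire and release calls.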
§Rate Limiting Strategy
The rate limiter implements a two-tier approach:
- Concurrency Control: Uses semaphores to limit how many API calls can run simultaneously. This prevents overwhelming the system with too many parallel requests.
- Time-Based Rate Limiting: Tracks requests per second and automatically waits when the limit is reached. The counter resets every second.
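The time-based tier can be sketched as a fixed-window counter. The code below is an assumption about how such a counter might work, not the crate's code: `WindowCounter` and its methods are hypothetical, the limit is an integer rather than the `f64` rate used in `AsyncConfig`, and the current time is passed in explicitly so the behaviour is easy to check.

```rust
use std::time::{Duration, Instant};

/// Hypothetical fixed-window request counter (illustration only).
pub struct WindowCounter {
    max_per_window: u32,
    window: Duration,
    window_start: Instant,
    used: u32,
}

impl WindowCounter {
    /// `max_per_window` should be greater than zero, otherwise no request
    /// is ever admitted.
    pub fn new(max_per_window: u32, window: Duration, now: Instant) -> Self {
        Self {
            max_per_window,
            window,
            window_start: now,
            used: 0,
        }
    }

    /// Returns `Ok(())` if a request may proceed at `now`, or `Err(wait)`
    /// with the time remaining until the current window resets.
    pub fn try_acquire(&mut self, now: Instant) -> Result<(), Duration> {
        if now.saturating_duration_since(self.window_start) >= self.window {
            // The window has expired: start a new one and reset the counter.
            self.window_start = now;
            self.used = 0;
        }
        if self.used < self.max_per_window {
            self.used += 1;
            Ok(())
        } else {
            let elapsed = now.saturating_duration_since(self.window_start);
            Err(self.window - elapsed)
        }
    }

    /// Blocking variant: sleeps until the request fits into a window.
    pub fn acquire_blocking(&mut self) {
        while let Err(wait) = self.try_acquire(Instant::now()) {
            std::thread::sleep(wait);
        }
    }
}
```

With a limit of 2 per second, two calls inside the same second succeed, a third is told how long to wait, and the first call after the second has elapsed resets the counter. A fixed window is simple but allows short bursts around the window boundary; a sliding window or token bucket would smooth those out at the cost of more bookkeeping.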
§Basic Usage
use graphrag_core::async_processing::{RateLimiter, AsyncConfig};
let config = AsyncConfig {
max_concurrent_llm_calls: 3,
llm_rate_limit_per_second: 2.0,
max_concurrent_embeddings: 5,
embedding_rate_limit_per_second: 10.0,
..Default::default()
};
let rate_limiter = RateLimiter::new(&config);
// Acquire a permit for an LLM call (waits asynchronously if a limit is reached)
let permit = rate_limiter.acquire_llm_permit().await?;
// ... make LLM API call ...
// Permit is automatically released when dropped
// Check available capacity
let available = rate_limiter.get_available_llm_permits();
println!("Available LLM permits: {}", available);
// Health check
let status = rate_limiter.health_check();
§Structs
- RateLimiter: Rate limiter for controlling API call frequency and concurrency