Module throttle

Concurrency control for LLM API calls.

This module combines rate limiting with a cap on in-flight requests so that callers do not overwhelm LLM API endpoints. It provides:

  • Rate Limiter — Token bucket algorithm to limit requests per time period
  • Concurrency Controller — Combined semaphore + rate limiter
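
The token bucket idea behind the rate limiter can be sketched with the standard library alone. `TokenBucket`, its fields, and `try_acquire` are hypothetical names used only for illustration; the module's own `RateLimiter` is described as built on the `governor` crate, and its internals may differ from this sketch.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket for illustration; not the module's `RateLimiter`.
struct TokenBucket {
    capacity: f64,       // maximum burst size
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // steady-state request rate
    last_refill: Instant,
}

impl TokenBucket {
    fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        Self {
            capacity: f64::from(capacity),
            tokens: f64::from(capacity), // start full so an initial burst is allowed
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Refill according to elapsed time, then try to take one token.
    fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let start = Instant::now();
    // 60 requests per minute is one token per second; allow bursts of 2.
    let mut bucket = TokenBucket::new(2, 1.0, start);
    assert!(bucket.try_acquire(start));
    assert!(bucket.try_acquire(start));
    assert!(!bucket.try_acquire(start)); // bucket is empty
    assert!(bucket.try_acquire(start + Duration::from_secs(1))); // one token refilled
}
```

Passing `now` in explicitly keeps the sketch deterministic; a production limiter would typically read the clock itself and wait rather than return `false`.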

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        LlmClient                                 │
│                                                                  │
│   complete() ──▶ [Rate Limiter] ──▶ [Semaphore] ──▶ API Call   │
│                         │                │                       │
│                   token bucket   concurrency limit               │
│                                                                  │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │              ConcurrencyController                       │  │
│   │                                                          │  │
│   │  ┌─────────────┐  ┌─────────────┐                        │  │
│   │  │RateLimiter  │  │ Semaphore   │                        │  │
│   │  │(governor)   │  │(tokio)      │                        │  │
│   │  └─────────────┘  └─────────────┘                        │  │
│   └─────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
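
The two-stage ordering in the diagram (rate limit first, then concurrency limit) can be sketched as follows. This is a non-blocking, standard-library stand-in, not the module's implementation: `Controller`, `Permit`, `Rejected`, and `try_acquire` are hypothetical names, and a simple minimum-interval gate is swapped in for the token bucket to keep the example short.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Why a request was turned away (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Rejected {
    RateLimited,
    AtCapacity,
}

/// Holds one concurrency slot; the slot is returned when this is dropped.
struct Permit {
    in_flight: Arc<AtomicUsize>,
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}

/// Stand-in for both stages: a minimum-interval gate, then a slot counter.
struct Controller {
    min_interval: Duration,
    next_allowed: Instant,
    max_concurrent: usize,
    in_flight: Arc<AtomicUsize>,
}

impl Controller {
    /// `requests_per_minute` must be non-zero (division by zero panics).
    fn new(requests_per_minute: u32, max_concurrent: usize, now: Instant) -> Self {
        Self {
            min_interval: Duration::from_secs(60) / requests_per_minute,
            next_allowed: now,
            max_concurrent,
            in_flight: Arc::new(AtomicUsize::new(0)),
        }
    }

    fn try_acquire(&mut self, now: Instant) -> Result<Permit, Rejected> {
        // Stage 1: rate limit.
        if now < self.next_allowed {
            return Err(Rejected::RateLimited);
        }
        // Stage 2: concurrency limit.
        if self.in_flight.load(Ordering::Acquire) >= self.max_concurrent {
            return Err(Rejected::AtCapacity);
        }
        self.in_flight.fetch_add(1, Ordering::AcqRel);
        self.next_allowed = now + self.min_interval;
        Ok(Permit { in_flight: Arc::clone(&self.in_flight) })
    }
}

fn main() {
    let t0 = Instant::now();
    // 60 requests per minute is one request per second; at most 2 in flight.
    let mut c = Controller::new(60, 2, t0);
    let p1 = c.try_acquire(t0).unwrap();
    // Too soon after the previous request.
    assert_eq!(c.try_acquire(t0).err(), Some(Rejected::RateLimited));
    let _p2 = c.try_acquire(t0 + Duration::from_secs(1)).unwrap();
    // The rate is fine, but both slots are taken.
    assert_eq!(c.try_acquire(t0 + Duration::from_secs(2)).err(), Some(Rejected::AtCapacity));
    drop(p1); // frees a slot
    assert!(c.try_acquire(t0 + Duration::from_secs(2)).is_ok());
}
```

The sketch reports a rejection where an async controller would presumably wait; the point is only that a request must clear both limits before the API call is made.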

Example

use vectorless::throttle::{ConcurrencyConfig, ConcurrencyController};

// `.await` needs an async context; a tokio runtime is assumed here.
#[tokio::main]
async fn main() {
    // Create with default configuration
    let controller = ConcurrencyController::with_defaults();

    // Or customize
    let config = ConcurrencyConfig::new()
        .with_max_concurrent_requests(20)
        .with_requests_per_minute(1000);
    let controller = ConcurrencyController::new(config);

    // Before making an API call
    let _permit = controller.acquire().await;

    // Make the API call...
    // The permit is released automatically when `_permit` is dropped
}
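
How a waiting caller is unblocked when a permit is dropped can be illustrated with a blocking, thread-based sketch. This is an assumption-laden analogue of `acquire().await`, using `Mutex` and `Condvar` from the standard library instead of the tokio semaphore shown in the diagram; `Slots`, `Permit`, and the free function `acquire` are hypothetical names.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical blocking slot pool (stand-in for an async semaphore).
struct Slots {
    free: Mutex<usize>,
    released: Condvar,
}

/// Guard for one slot; dropping it gives the slot back.
struct Permit {
    slots: Arc<Slots>,
}

/// Block the current thread until a slot is free, then take it.
fn acquire(slots: &Arc<Slots>) -> Permit {
    let mut free = slots.free.lock().unwrap();
    while *free == 0 {
        free = slots.released.wait(free).unwrap();
    }
    *free -= 1;
    Permit { slots: Arc::clone(slots) }
}

impl Drop for Permit {
    fn drop(&mut self) {
        *self.slots.free.lock().unwrap() += 1;
        self.slots.released.notify_one(); // wake one waiting caller
    }
}

fn main() {
    let slots = Arc::new(Slots { free: Mutex::new(1), released: Condvar::new() });
    let permit = acquire(&slots);
    assert_eq!(*slots.free.lock().unwrap(), 0);

    let waiter = {
        let slots = Arc::clone(&slots);
        thread::spawn(move || {
            // Blocks until the main thread drops its permit.
            let _permit = acquire(&slots);
            // ... the API call would go here ...
        })
    };

    drop(permit); // releasing the slot lets the waiter proceed
    waiter.join().unwrap();
    assert_eq!(*slots.free.lock().unwrap(), 1);
}
```

Tying the release to `Drop` means the slot is returned on every exit path, including early returns and panics that unwind.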

Structs

ConcurrencyConfig
Concurrency control configuration.
ConcurrencyController
Concurrency controller for LLM API calls.
RateLimiter
Rate limiter for API calls.