Response caching infrastructure for LLM API calls.
This module provides a flexible caching system that reduces API costs and improves response times by caching responses to identical requests.
§Example
use llmkit::{CacheConfig, CachingProvider, InMemoryCache, OpenAIProvider};

// Inside an async function returning a Result, with `request` a
// completion request constructed elsewhere:

// Create a caching provider
let inner = OpenAIProvider::from_env()?;
let cache = InMemoryCache::new(CacheConfig::default());
let provider = CachingProvider::new(inner, cache);

// First request hits the API
let response1 = provider.complete(request.clone()).await?;

// Second identical request hits the cache
let response2 = provider.complete(request).await?;

§Cache Key Computation
Cache keys are computed from:
- Model name
- Messages content
- Tools (if any)
- System prompt
By default, non-deterministic parameters (temperature, top_p) are excluded from the cache key to allow caching regardless of sampling settings.
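To make the key derivation concrete, here is a minimal sketch of the idea using the standard library's DefaultHasher. It is an illustration only, not llmkit's actual implementation (which is encapsulated by CacheKeyBuilder); the function name and the (role, content) message representation are assumptions made for the example.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Conceptual sketch: hash the fields listed above and deliberately
// ignore sampling parameters such as temperature and top_p.
fn conceptual_cache_key(
    model: &str,
    messages: &[(&str, &str)],   // (role, content) pairs -- assumed shape
    tools: &[&str],              // tool definitions, if any
    system_prompt: Option<&str>,
) -> u64 {
    let mut hasher = DefaultHasher::new();
    model.hash(&mut hasher);
    for (role, content) in messages {
        role.hash(&mut hasher);
        content.hash(&mut hasher);
    }
    for tool in tools {
        tool.hash(&mut hasher);
    }
    system_prompt.hash(&mut hasher);
    // temperature and top_p never enter the hash, so the same prompt
    // produces the same key regardless of sampling settings.
    hasher.finish()
}

fn main() {
    let key = conceptual_cache_key(
        "gpt-4o",
        &[("user", "Hello")],
        &[],
        Some("You are helpful."),
    );
    println!("cache key: {key:016x}");
}

A production key would use a stable content hash rather than DefaultHasher, whose output can vary between Rust versions; the point of the sketch is only which fields participate in the key.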
Structs§
- CacheConfig - Configuration for the caching system.
- CacheKeyBuilder - Cache key builder for custom cache key computation.
- CacheStats - Statistics about cache performance.
- CachedResponse - A cached response with metadata.
- CachingProvider - A provider wrapper that caches responses.
- InMemoryCache - In-memory cache backend using DashMap.
Traits§
- CacheBackend - Trait for cache backends.