Tower-style middleware for llmkit-rs.
Each layer wraps an llmkit_core::LlmProvider and is itself a provider, so
they compose into a single Arc<dyn LlmProvider> without Service<Request>
generics or sprawling where clauses. Layers:
- RetryLayer: exponential backoff over retryable errors
- RateLimitLayer: token-bucket throttling per provider
- CostTrackingLayer: per-request + cumulative cost, optional budget cap
- TracingLayer: structured spans with latency and token counts
FallbackProvider chains providers primary → secondary on failure.
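
The wrapping pattern described above can be sketched roughly as follows. This is a minimal, synchronous illustration, not the crate's API: the `Provider` and `Layer` traits, the `LlmError` enum, the `Fallback` and `Retry` fields, and the `AlwaysDown` and `Echo` test doubles are all hypothetical stand-ins (the real `llmkit_core::LlmProvider` and `LlmLayer` signatures are not shown on this page and are presumably async). The retry loop also omits the backoff delay to keep the sketch short.

```rust
use std::sync::Arc;

// Hypothetical error type: only `Retryable` errors trigger retry or fallback.
#[derive(Debug, Clone, PartialEq)]
enum LlmError {
    Retryable(String),
    Fatal(String),
}

// Hypothetical, synchronous stand-in for a provider trait.
trait Provider: Send + Sync {
    fn complete(&self, prompt: &str) -> Result<String, LlmError>;
}

// A layer turns one provider into another provider.
trait Layer {
    fn layer(&self, inner: Arc<dyn Provider>) -> Arc<dyn Provider>;
}

struct RetryLayer {
    max_attempts: u32,
}

// The provider produced by `RetryLayer`.
struct Retry {
    inner: Arc<dyn Provider>,
    max_attempts: u32,
}

impl Layer for RetryLayer {
    fn layer(&self, inner: Arc<dyn Provider>) -> Arc<dyn Provider> {
        Arc::new(Retry { inner, max_attempts: self.max_attempts })
    }
}

impl Provider for Retry {
    fn complete(&self, prompt: &str) -> Result<String, LlmError> {
        let mut last = LlmError::Fatal("max_attempts was zero".into());
        for _ in 0..self.max_attempts {
            match self.inner.complete(prompt) {
                // Retryable: remember the error and try again (no backoff in this sketch).
                Err(e @ LlmError::Retryable(_)) => last = e,
                // Success or a fatal error: return immediately.
                other => return other,
            }
        }
        Err(last)
    }
}

// Tries each provider in order, moving on only after a retryable failure.
struct Fallback {
    providers: Vec<Arc<dyn Provider>>,
}

impl Provider for Fallback {
    fn complete(&self, prompt: &str) -> Result<String, LlmError> {
        let mut last = LlmError::Fatal("no providers configured".into());
        for provider in &self.providers {
            match provider.complete(prompt) {
                Err(e @ LlmError::Retryable(_)) => last = e,
                other => return other,
            }
        }
        Err(last)
    }
}

// Test doubles: one provider that always fails, one that always succeeds.
struct AlwaysDown;
impl Provider for AlwaysDown {
    fn complete(&self, _prompt: &str) -> Result<String, LlmError> {
        Err(LlmError::Retryable("503".into()))
    }
}

struct Echo;
impl Provider for Echo {
    fn complete(&self, prompt: &str) -> Result<String, LlmError> {
        Ok(format!("echo: {prompt}"))
    }
}

// The whole stack collapses into one trait object.
fn build_stack() -> Arc<dyn Provider> {
    let primary = RetryLayer { max_attempts: 3 }.layer(Arc::new(AlwaysDown));
    let secondary: Arc<dyn Provider> = Arc::new(Echo);
    Arc::new(Fallback { providers: vec![primary, secondary] })
}
```

Because every wrapper implements the same trait it consumes, callers hold only an `Arc<dyn Provider>` and never name the nested layer types.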
Structs
- CostTracking: Provider produced by CostTrackingLayer.
- CostTrackingLayer: Tracks per-request and cumulative cost; optionally enforces a budget.
- FallbackProvider: Tries providers in order, advancing to the next on a retryable failure.
- RateLimit: Provider produced by RateLimitLayer.
- RateLimitLayer: Token-bucket rate limiter: capacity tokens refilling over window.
- Retry: Provider produced by RetryLayer.
- RetryLayer: Configures exponential-backoff retries for retryable errors.
- SessionCost: Shared, cloneable handle to the running session cost (USD), in micro-dollars.
- Tracing: Provider produced by TracingLayer.
- TracingLayer: Emits a tracing span around each call with latency and usage.
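
To make the "capacity tokens refilling over window" wording of RateLimitLayer concrete, here is a minimal token-bucket sketch. It is an illustration under assumptions, not the crate's implementation: the `TokenBucket` type, its fields, and `try_acquire` are hypothetical names, and the current time is passed in explicitly so the behaviour is deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket; names are illustrative only.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Starts full, refilling `capacity` tokens evenly over `window`.
    fn new(capacity: u32, window: Duration, now: Instant) -> Self {
        let capacity = f64::from(capacity);
        Self {
            capacity,
            tokens: capacity,
            refill_per_sec: capacity / window.as_secs_f64(),
            last_refill: now,
        }
    }

    /// Tries to take one token at time `now`; returns false when the bucket is empty.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Credit tokens for the time elapsed since the last call, capped at capacity.
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;

        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With `capacity = 2` and `window = 1s`, two calls succeed immediately, a third is rejected, and a call 600 ms later succeeds again because about 1.2 tokens have been refilled. A real layer would typically wait for the next token rather than reject the call; which of the two RateLimitLayer does is not stated here.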
Traits
- LlmLayer: Wraps an inner LlmProvider in a new one, adding cross-cutting behaviour.