Tower-style middleware for llmkit-rs.
Each layer wraps an [llmkit_core::LlmProvider] and is itself a provider, so
they compose into a single Arc<dyn LlmProvider> without Service<Request>
generics or sprawling where clauses. Layers:
- [
RetryLayer] — exponential backoff over retryable errors - [
RateLimitLayer] — token-bucket throttling per provider - [
CostTrackingLayer] — per-request + cumulative cost, optional budget cap - [
TracingLayer] — structured spans with latency and token counts
[FallbackProvider] chains providers primary → secondary on failure.