llm-edge-routing
Intelligent routing engine for LLM Edge Agent, providing smart request distribution across multiple LLM providers with built-in resilience and failover capabilities.
Features
- Multiple Routing Strategies: Cost-based, latency-based, hybrid, and round-robin routing
- Circuit Breakers: Automatic failure detection and recovery to prevent cascading failures
- Failover Support: Seamless fallback to healthy providers when issues occur
- Performance Optimization: Intelligent load balancing based on real-time metrics
- Provider Agnostic: Works with any LLM provider through the llm-edge-providers interface
- Async First: Built on Tokio for high-performance concurrent operations
- Observability: Integrated tracing and metrics for monitoring routing decisions
Installation
Add this to your Cargo.toml:
[dependencies]
llm-edge-routing = "0.1.0"
Routing Strategies
Cost-Based Routing
Routes requests to the provider with the lowest cost per token:
use llm_edge_routing::RoutingStrategy;
let strategy = RoutingStrategy::CostBased;
Use Case: Budget-conscious applications where cost optimization is the primary concern.
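As a rough illustration of what cost-based selection involves, the sketch below picks the cheapest entry from a list. The `ProviderCost` type, its fields, and `select_cheapest` are hypothetical names chosen for this example; they are not part of the crate's API.

```rust
/// Hypothetical provider record; the field names are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderCost {
    pub name: String,
    /// Price in USD per 1,000 tokens.
    pub cost_per_1k_tokens: f64,
}

/// Returns the provider with the lowest cost per token, or `None` for an empty list.
pub fn select_cheapest(providers: &[ProviderCost]) -> Option<&ProviderCost> {
    providers
        .iter()
        // `total_cmp` gives a total order over f64, so NaN cannot cause a panic.
        .min_by(|a, b| a.cost_per_1k_tokens.total_cmp(&b.cost_per_1k_tokens))
}
```

The scan is linear in the number of providers, which matches the O(n) routing cost described under Performance Characteristics.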
Latency-Based Routing
Routes requests to the fastest available provider based on historical latency:
use llm_edge_routing::RoutingStrategy;
let strategy = RoutingStrategy::LatencyBased;
Use Case: Real-time applications requiring the fastest possible response times.
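One common way to turn "historical latency" into a single number per provider is an exponentially weighted moving average. The sketch below assumes that approach; `LatencyTracker`, its methods, and the smoothing factor are hypothetical and may differ from how the crate actually tracks latency.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical helper that keeps a smoothed latency per provider.
#[derive(Debug, Default)]
pub struct LatencyTracker {
    /// Exponentially weighted moving average of latency, in milliseconds.
    ewma_ms: HashMap<String, f64>,
}

impl LatencyTracker {
    /// Weight given to the newest sample (assumed value).
    const ALPHA: f64 = 0.2;

    /// Folds a new latency sample into the provider's moving average.
    pub fn record(&mut self, provider: &str, latency: Duration) {
        let sample = latency.as_secs_f64() * 1000.0;
        self.ewma_ms
            .entry(provider.to_string())
            .and_modify(|avg| *avg = Self::ALPHA * sample + (1.0 - Self::ALPHA) * *avg)
            .or_insert(sample);
    }

    /// Returns the provider with the lowest smoothed latency, if any was recorded.
    pub fn fastest(&self) -> Option<&str> {
        self.ewma_ms
            .iter()
            .min_by(|a, b| a.1.total_cmp(b.1))
            .map(|(name, _)| name.as_str())
    }
}
```

A moving average lets a provider that has recently slowed down lose its "fastest" status after a handful of requests, without storing every sample.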
Hybrid Routing
Routes based on multiple weighted factors (cost, latency, reliability):
use llm_edge_routing::RoutingStrategy;
// Create hybrid strategy with custom weights
let strategy = Hybrid ;
// Or use default balanced weights
let strategy = RoutingStrategy::default_hybrid();
Use Case: Production applications requiring balanced performance across multiple criteria.
Weight Guidelines:
- cost_weight: 0.0-1.0 (higher = prioritize lower costs)
- latency_weight: 0.0-1.0 (higher = prioritize lower latency)
- reliability_weight: 0.0-1.0 (higher = prioritize higher uptime)
- Weights should sum to approximately 1.0 for best results
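To make the effect of the weights concrete, here is a minimal scoring sketch. It assumes cost and latency are normalized against the largest value in the candidate set and that reliability is already a fraction between 0 and 1. The types, the `select_hybrid` function, and the scoring formula are illustrative assumptions, not the crate's actual implementation.

```rust
/// Hypothetical weight set mirroring the three weights described above.
#[derive(Debug, Clone, Copy)]
pub struct HybridWeights {
    pub cost_weight: f64,
    pub latency_weight: f64,
    pub reliability_weight: f64,
}

/// Hypothetical per-provider metrics.
#[derive(Debug, Clone)]
pub struct ProviderStats {
    pub name: String,
    pub cost_per_1k_tokens: f64,
    pub avg_latency_ms: f64,
    /// Fraction of successful requests, in 0.0..=1.0.
    pub reliability: f64,
}

/// Scales `value` relative to `max`; returns 0.0 when `max` is not positive.
fn normalize(value: f64, max: f64) -> f64 {
    if max > 0.0 { value / max } else { 0.0 }
}

/// Returns the provider with the highest weighted score (higher is better).
pub fn select_hybrid<'a>(
    providers: &'a [ProviderStats],
    weights: HybridWeights,
) -> Option<&'a ProviderStats> {
    let max_cost = providers.iter().map(|p| p.cost_per_1k_tokens).fold(0.0, f64::max);
    let max_latency = providers.iter().map(|p| p.avg_latency_ms).fold(0.0, f64::max);
    // Lower cost and lower latency are better, so both are inverted before weighting.
    let score = |p: &ProviderStats| {
        weights.cost_weight * (1.0 - normalize(p.cost_per_1k_tokens, max_cost))
            + weights.latency_weight * (1.0 - normalize(p.avg_latency_ms, max_latency))
            + weights.reliability_weight * p.reliability
    };
    providers.iter().max_by(|a, b| score(*a).total_cmp(&score(*b)))
}
```

With every term kept in the 0 to 1 range, weights that sum to roughly 1.0 produce scores in the same range, which makes them easier to compare and log.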
Round-Robin Routing
Distributes requests evenly across all available providers:
use llm_edge_routing::RoutingStrategy;
let strategy = RoutingStrategy::RoundRobin;
Use Case: Testing, development, or uniform load distribution scenarios.
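Round-robin selection can be sketched with a single atomic counter, which keeps the rotation safe to share across concurrent tasks. The `RoundRobin` struct and `pick` method below are hypothetical names for illustration and are separate from the crate's `RoutingStrategy` type.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical rotating selector backed by one atomic counter.
pub struct RoundRobin {
    next: AtomicUsize,
}

impl RoundRobin {
    pub fn new() -> Self {
        Self { next: AtomicUsize::new(0) }
    }

    /// Returns the next provider in rotation, or `None` if the list is empty.
    pub fn pick<'a, T>(&self, providers: &'a [T]) -> Option<&'a T> {
        if providers.is_empty() {
            return None;
        }
        // `fetch_add` wraps on overflow, so the index stays valid after the modulo.
        let i = self.next.fetch_add(1, Ordering::Relaxed);
        providers.get(i % providers.len())
    }
}
```

Because `pick` takes `&self`, the selector can sit behind an `Arc` and be shared without a mutex.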
Usage Examples
Basic Routing Decision
use ;
async
Circuit Breaker Configuration
Prevent cascading failures by automatically opening the circuit after repeated failures:
use ;
use std::time::Duration;
async
Circuit Breaker States:
- Closed: Normal operation, requests flow through
- Open: Too many failures detected, failing fast to prevent cascading failures
- Half-Open: Timeout elapsed, testing if service has recovered
Configuration Parameters:
- threshold: Number of consecutive failures before opening circuit (recommended: 3-5)
- timeout: Duration to wait before testing recovery (recommended: 30-60 seconds)
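The three states and two parameters above can be sketched as a small state machine. This is a simplified, single-threaded illustration: the `Breaker` type and its methods are hypothetical, and the current time is passed in explicitly so the transitions are easy to follow and test. The crate's own circuit breaker may be structured differently.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical circuit breaker driven by `threshold` and `timeout`.
pub struct Breaker {
    threshold: u32,
    timeout: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
    state: State,
}

impl Breaker {
    pub fn new(threshold: u32, timeout: Duration) -> Self {
        Self { threshold, timeout, consecutive_failures: 0, opened_at: None, state: State::Closed }
    }

    /// Current state, moving Open -> HalfOpen once `timeout` has elapsed.
    pub fn state(&mut self, now: Instant) -> State {
        if self.state == State::Open {
            if let Some(opened) = self.opened_at {
                if now.duration_since(opened) >= self.timeout {
                    self.state = State::HalfOpen;
                }
            }
        }
        self.state
    }

    /// Requests are rejected only while the circuit is Open.
    pub fn allows_request(&mut self, now: Instant) -> bool {
        self.state(now) != State::Open
    }

    /// Any success closes the circuit and clears the failure count.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
        self.state = State::Closed;
    }

    /// Opens the circuit at the threshold, or immediately on a failed Half-Open probe.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.state == State::HalfOpen || self.consecutive_failures >= self.threshold {
            self.state = State::Open;
            self.opened_at = Some(now);
        }
    }
}
```

With `threshold` set to 3 and `timeout` to 30 seconds, three consecutive failures open the circuit, and the first state check at least 30 seconds later moves it to Half-Open.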
Error Handling
use ;
async
async
Complete Example with Failover
use ;
use std::time::Duration;
use std::collections::HashMap;
async
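The core of a failover loop is small: try providers in priority order and return the first success, collecting errors along the way. The sketch below is synchronous for brevity and uses a hypothetical `call_with_failover` helper. An async version on Tokio would follow the same shape with an awaited call inside the loop.

```rust
/// Tries each provider in order and returns the first success together with
/// the name of the provider that served it. If every provider fails, returns
/// all collected errors. Hypothetical helper, not part of the crate's API.
pub fn call_with_failover<T, E>(
    providers: &[&str],
    mut call: impl FnMut(&str) -> Result<T, E>,
) -> Result<(String, T), Vec<(String, E)>> {
    let mut errors = Vec::new();
    for &name in providers {
        match call(name) {
            Ok(value) => return Ok((name.to_string(), value)),
            // Remember the failure and fall through to the next provider.
            Err(e) => errors.push((name.to_string(), e)),
        }
    }
    Err(errors)
}
```

In a fuller setup, the closure would typically consult a circuit breaker before calling a provider and record the outcome afterwards, so that repeatedly failing providers are skipped without a network round trip.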
Integration with LLM Edge Agent
This crate is designed to work seamlessly with the LLM Edge Agent ecosystem:
use llm_edge_routing::RoutingStrategy;
use llm_edge_providers::ProviderConfig;
// Configure providers
let provider_configs = vec!;
// Set up intelligent routing
let routing_strategy = RoutingStrategy::default_hybrid();
// The routing engine will automatically select the best provider
// based on current costs, latency, and reliability metrics
Performance Characteristics
- Routing Decision: O(n) where n is the number of providers (typically < 10)
- Circuit Breaker State Check: O(1) atomic operations
- Memory Footprint: Minimal - approximately 100 bytes per circuit breaker
- Concurrency: Lock-free operations for circuit breaker state checks
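A lock-free, O(1) state check of the kind listed above can be sketched by storing the breaker state in a single atomic byte. The `AtomicBreakerState` type and its encoding are assumptions made for illustration and are not taken from the crate's source.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Assumed encoding of the three circuit breaker states.
const CLOSED: u8 = 0;
const OPEN: u8 = 1;
const HALF_OPEN: u8 = 2;

/// Hypothetical breaker state held in one atomic byte.
pub struct AtomicBreakerState(AtomicU8);

impl AtomicBreakerState {
    pub fn new() -> Self {
        Self(AtomicU8::new(CLOSED))
    }

    /// O(1) and lock-free: a single atomic load.
    pub fn is_open(&self) -> bool {
        self.0.load(Ordering::Acquire) == OPEN
    }

    /// Marks the circuit as Open.
    pub fn trip(&self) {
        self.0.store(OPEN, Ordering::Release);
    }

    /// Moves Open -> Half-Open; returns true only for the caller that wins the race.
    pub fn try_half_open(&self) -> bool {
        self.0
            .compare_exchange(OPEN, HALF_OPEN, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Returns the circuit to Closed.
    pub fn reset(&self) {
        self.0.store(CLOSED, Ordering::Release);
    }
}
```

The compare-and-exchange in `try_half_open` means that only one concurrent caller gets to send the recovery probe, while all others continue to see a non-Closed circuit.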
Observability
The routing engine integrates with standard Rust observability tools:
// Tracing integration
info!;
// Metrics integration (via the metrics crate)
counter!;
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Contributing
Contributions are welcome! Please see the Contributing Guide for details.
Related Crates
- llm-edge-providers - Provider abstraction layer
- llm-edge-core - Core types and utilities
- llm-edge-protocol - Protocol definitions