# EdgeQuake LLM
A unified Rust library providing LLM and embedding provider abstraction with support for multiple backends, intelligent caching, rate limiting, and cost tracking.
## Features
- 🤖 **9 LLM Providers**: OpenAI, Anthropic, Gemini, xAI, OpenRouter, Ollama, LMStudio, HuggingFace, VSCode Copilot
- 📦 **Response Caching**: Reduce costs with intelligent caching (memory + persistent)
- ⚡ **Rate Limiting**: Built-in API rate limit management with exponential backoff
- 💰 **Cost Tracking**: Session-level cost monitoring and metrics
- 🔁 **Retry Logic**: Automatic retry with configurable strategies
- 🎯 **Reranking**: BM25, RRF, and hybrid reranking strategies
- 📊 **Observability**: OpenTelemetry integration for metrics and tracing
- 🧪 **Testing**: Mock provider for unit tests
## Quick Start

Add to your `Cargo.toml`:

```toml
[dependencies]
edgequake-llm = "0.2"
tokio = { version = "1.0", features = ["full"] }
```
### Basic Usage

```rust
use edgequake_llm::OpenAiProvider;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Build a provider from environment variables (e.g. OPENAI_API_KEY)
    let provider = OpenAiProvider::from_env()?;
    let response = provider.complete("Hello, world!").await?;
    println!("{}", response.text);
    Ok(())
}
```
## Supported Providers

| Provider | Models | Streaming | Embeddings | Tool Use |
|---|---|---|---|---|
| OpenAI | GPT-4, GPT-5 | ✅ | ✅ | ✅ |
| Anthropic | Claude 3+, 4 | ✅ | ❌ | ✅ |
| Gemini | Gemini 2.0+, 3.0 | ✅ | ✅ | ✅ |
| xAI | Grok 2, 3, 4 | ✅ | ❌ | ✅ |
| OpenRouter | 616+ models | ✅ | ❌ | ✅ |
| Ollama | Local models | ✅ | ✅ | ✅ |
| LMStudio | Local models | ✅ | ✅ | ✅ |
| HuggingFace | Open-source | ✅ | ✅ | ⚠️ |
| Copilot | GitHub models | ✅ | ❌ | ✅ |
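When selecting a backend at runtime (say, from a CLI flag), a common pattern is to map the provider name to its default endpoint. The sketch below is hypothetical (`default_base_url` is not necessarily part of this crate, which may ship its own factory); the two local-provider URLs are the defaults noted later in this README, and the hosted URLs are the providers' public API endpoints:

```rust
/// Map a provider name to its default base URL, if one is known.
/// Hypothetical helper for runtime provider selection; not the crate's API.
fn default_base_url(provider: &str) -> Option<&'static str> {
    match provider {
        "openai" => Some("https://api.openai.com/v1"),
        "anthropic" => Some("https://api.anthropic.com"),
        "ollama" => Some("http://localhost:11434"),   // local default
        "lmstudio" => Some("http://localhost:1234"),  // local default
        _ => None, // unknown names fall through to explicit configuration
    }
}
```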
## Examples
### Multi-Provider Abstraction

```rust
use edgequake_llm::LlmProvider;

// Works with any backend that implements the provider trait
async fn ask(provider: &dyn LlmProvider, prompt: &str) -> anyhow::Result<String> {
    Ok(provider.complete(prompt).await?.text)
}
```
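The same trait-based abstraction is what makes the mock provider for unit tests possible: anything implementing the provider trait is interchangeable. Below is a minimal, synchronous sketch of the idea (the crate's real trait is async and richer; `Provider`, `MockProvider`, and `summarize` are illustrative names, not the crate's API):

```rust
/// Simplified provider trait: one method, synchronous, for illustration only.
trait Provider {
    fn complete(&self, prompt: &str) -> String;
}

/// A test double that returns a canned response instead of calling an API.
struct MockProvider {
    canned: String,
}

impl Provider for MockProvider {
    fn complete(&self, _prompt: &str) -> String {
        self.canned.clone()
    }
}

/// Application code depends only on the trait, so tests can inject the mock.
fn summarize(p: &dyn Provider, text: &str) -> String {
    p.complete(&format!("Summarize: {text}"))
}
```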
### Response Caching

```rust
use edgequake_llm::{CacheConfig, CachedProvider, OpenAiProvider};

let provider = OpenAiProvider::from_env()?;
let cache_config = CacheConfig::default();
let cached = CachedProvider::new(provider, cache_config);
// Subsequent identical requests served from cache
```
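Caches like this typically key on the request content, so two byte-identical requests map to the same entry. A standalone sketch of such a key using only the standard library (`cache_key` and its inputs are illustrative, not the crate's actual key derivation, which may include more fields):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Content-addressed cache key: same (model, prompt, temperature) -> same key.
/// Temperature is passed in thousandths because f64 does not implement Hash.
fn cache_key(model: &str, prompt: &str, temperature_milli: u32) -> u64 {
    let mut h = DefaultHasher::new();
    model.hash(&mut h);
    prompt.hash(&mut h);
    temperature_milli.hash(&mut h);
    h.finish()
}
```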
### Cost Tracking

```rust
use edgequake_llm::SessionCostTracker;

let tracker = SessionCostTracker::new();

// After each completion
tracker.add_completion(&response);

// Get summary
let summary = tracker.summary();
println!("{summary:?}");
```
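Under the hood, per-completion cost is simple arithmetic over token counts and per-model prices. A sketch of that calculation with illustrative pricing numbers (not the crate's built-in price table):

```rust
/// Per-million-token pricing for one model. The numbers used in any example
/// are illustrative, not this crate's bundled prices.
struct ModelPricing {
    input_per_mtok: f64,  // USD per 1M input tokens
    output_per_mtok: f64, // USD per 1M output tokens
}

/// Cost of a single completion: tokens / 1e6 * price-per-million, summed
/// over input and output.
fn completion_cost(p: &ModelPricing, input_tokens: u64, output_tokens: u64) -> f64 {
    (input_tokens as f64 / 1e6) * p.input_per_mtok
        + (output_tokens as f64 / 1e6) * p.output_per_mtok
}
```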
### Rate Limiting

```rust
use edgequake_llm::{RateLimitedProvider, RateLimiterConfig};

let config = RateLimiterConfig::default();
let limited = RateLimitedProvider::new(provider, config);
// Automatic rate limiting with exponential backoff
```
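The backoff schedule itself is easy to picture: the retry delay doubles on each failed attempt until it hits a cap. A standalone sketch of that schedule (parameter names and defaults are illustrative, not `RateLimiterConfig`'s actual fields):

```rust
use std::time::Duration;

/// Exponential backoff with a cap: delay = base * 2^attempt, clamped to
/// max_delay. Saturates instead of overflowing for large attempt counts.
fn backoff_delay(attempt: u32, base: Duration, max_delay: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max_delay).min(max_delay)
}
```

Production implementations usually add random jitter on top of this schedule so that many clients retrying at once do not synchronize.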
## Provider Setup

### OpenAI

```rust
let provider = OpenAiProvider::new(api_key);
// or
let provider = OpenAiProvider::from_env()?;
```

### Anthropic

```rust
let provider = AnthropicProvider::from_env()?;
```

### Gemini

```rust
let provider = GeminiProvider::from_env()?;
```

### OpenRouter

```rust
let provider = OpenRouterProvider::new(api_key);
```

### Local Providers

```rust
// Ollama (assumes running on localhost:11434)
let provider = OllamaProvider::new();

// LMStudio (assumes running on localhost:1234)
let provider = LmStudioProvider::new();
```
## Advanced Features

### OpenTelemetry Integration

Enable with the `otel` feature:

```toml
edgequake-llm = { version = "0.2", features = ["otel"] }
```

```rust
use edgequake_llm::TracingProvider;

let provider = OpenAiProvider::from_env()?;
let traced = TracingProvider::new(provider);
// Automatic span creation and GenAI semantic conventions
```
### Reranking

```rust
use edgequake_llm::{Bm25Reranker, Reranker};

let reranker = Bm25Reranker::new();
let results = reranker.rerank(&query, &documents).await?;
```
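Of the strategies listed in Features, RRF (Reciprocal Rank Fusion) is the most self-contained: each document's fused score is the sum of 1/(k + rank) over the input rankings, with k commonly set to 60 to damp the influence of top ranks. A standalone sketch of the algorithm, independent of the crate's reranker API:

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: merge several ranked lists into one.
/// A document's score is sum(1 / (k + rank_i)) over every list it appears
/// in, with 1-based ranks; higher scores sort first.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, doc) in ranking.iter().enumerate() {
            // enumerate() is 0-based; the standard formulation uses 1-based ranks
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (rank as f64 + 1.0));
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

Because RRF uses only ranks, not raw scores, it can fuse rankings from incommensurable sources such as BM25 and embedding similarity.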
## Documentation
## Contributing

Contributions are welcome! Please see `CONTRIBUTING.md` for guidelines.
## License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.
## Credits
Extracted from the EdgeCode project, a Rust coding agent with OODA loop decision framework.