# EdgeQuake LLM

[![Crates.io](https://img.shields.io/crates/v/edgequake-llm.svg)](https://crates.io/crates/edgequake-llm)
[![Documentation](https://docs.rs/edgequake-llm/badge.svg)](https://docs.rs/edgequake-llm)
[![License](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](LICENSE-MIT)
[![CI](https://github.com/raphaelmansuy/edgequake-llm/workflows/CI/badge.svg)](https://github.com/raphaelmansuy/edgequake-llm/actions)

A unified Rust library providing LLM and embedding provider abstraction with support for multiple backends, intelligent caching, rate limiting, and cost tracking.

## Features

- 🤖 **9 LLM Providers**: OpenAI, Anthropic, Gemini, xAI, OpenRouter, Ollama, LMStudio, HuggingFace, VSCode Copilot
- 📦 **Response Caching**: Reduce costs with intelligent caching (memory + persistent)
- ⚡ **Rate Limiting**: Built-in API rate limit management with exponential backoff
- 💰 **Cost Tracking**: Session-level cost monitoring and metrics
- 🔄 **Retry Logic**: Automatic retry with configurable strategies
- 🎯 **Reranking**: BM25, RRF, and hybrid reranking strategies
- 📊 **Observability**: OpenTelemetry integration for metrics and tracing
- 🧪 **Testing**: Mock provider for unit tests

## Quick Start

Add to your `Cargo.toml`:

```toml
[dependencies]
edgequake-llm = "0.2"
tokio = { version = "1.0", features = ["full"] }
```

### Basic Usage

```rust
use edgequake_llm::{OpenAIProvider, LLMProvider, ChatMessage, ChatRole};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize provider
    let provider = OpenAIProvider::new("your-api-key", "gpt-4");

    // Create message
    let messages = vec![
        ChatMessage {
            role: ChatRole::User,
            content: "What is Rust?".to_string(),
            ..Default::default()
        }
    ];

    // Get completion
    let response = provider.complete(&messages, None).await?;
    println!("{}", response.content);

    Ok(())
}
```

## Supported Providers

| Provider | Models | Streaming | Embeddings | Tool Use |
|----------|--------|-----------|------------|----------|
| OpenAI | GPT-4, GPT-5 | ✅ | ✅ | ✅ |
| Anthropic | Claude 3+, 4 | ✅ | ❌ | ✅ |
| Gemini | Gemini 2.0+, 3.0 | ✅ | ✅ | ✅ |
| xAI | Grok 2, 3, 4 | ✅ | ❌ | ✅ |
| OpenRouter | 616+ models | ✅ | ❌ | ✅ |
| Ollama | Local models | ✅ | ✅ | ✅ |
| LMStudio | Local models | ✅ | ✅ | ✅ |
| HuggingFace | Open-source | ✅ | ❌ | ⚠️ |
| Copilot | GitHub models | ✅ | ❌ | ✅ |

## Examples

### Multi-Provider Abstraction

```rust
use edgequake_llm::{LLMProvider, OpenAIProvider, AnthropicProvider};

async fn try_providers() -> Result<(), Box<dyn std::error::Error>> {
    let providers: Vec<Box<dyn LLMProvider>> = vec![
        Box::new(OpenAIProvider::from_env()),
        Box::new(AnthropicProvider::from_env()),
    ];

    for provider in providers {
        println!("Testing: {}", provider.name());
        // Use provider...
    }

    Ok(())
}
```

### Response Caching

```rust
use edgequake_llm::{OpenAIProvider, CachedProvider, CacheConfig};

let provider = OpenAIProvider::from_env();
let cache_config = CacheConfig {
    ttl_seconds: 3600,  // 1 hour
    max_entries: 1000,
};

let cached = CachedProvider::new(provider, cache_config);
// Subsequent identical requests served from cache
```
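
A response cache of this kind typically keys on the model name plus the full message list, so that byte-identical requests hit the same entry. A minimal, standard-library-only sketch of that idea (the `cache_key` helper and its `(role, content)` tuple inputs are illustrative, not part of the crate's API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative cache key: hash the model name and every message.
/// Identical (model, messages) pairs always map to the same key.
fn cache_key(model: &str, messages: &[(&str, &str)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    model.hash(&mut hasher);
    for (role, content) in messages {
        role.hash(&mut hasher);
        content.hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    let a = cache_key("gpt-4", &[("user", "What is Rust?")]);
    let b = cache_key("gpt-4", &[("user", "What is Rust?")]);
    let c = cache_key("gpt-4", &[("user", "What is Go?")]);
    assert_eq!(a, b); // identical requests share one cache entry
    assert_ne!(a, c); // different content misses the cache
    println!("ok");
}
```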

### Cost Tracking

```rust
use edgequake_llm::SessionCostTracker;

let tracker = SessionCostTracker::new();

// After each completion
tracker.add_completion(
    "openai",
    "gpt-4",
    prompt_tokens,
    completion_tokens,
);

// Get summary
let summary = tracker.summary();
println!("Total cost: ${:.4}", summary.total_cost);
```
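
The cost figure above reduces to simple arithmetic over token counts and per-token rates. A self-contained sketch of that calculation (the per-million-token prices are made-up placeholders; real rates come from each provider's pricing table, which the tracker looks up for you):

```rust
/// Illustrative cost model: separate rates for prompt and completion
/// tokens, both expressed in dollars per million tokens.
fn completion_cost(
    prompt_tokens: u64,
    completion_tokens: u64,
    prompt_price_per_m: f64,
    completion_price_per_m: f64,
) -> f64 {
    prompt_tokens as f64 / 1_000_000.0 * prompt_price_per_m
        + completion_tokens as f64 / 1_000_000.0 * completion_price_per_m
}

fn main() {
    // Placeholder rates: $30 / $60 per million tokens.
    let cost = completion_cost(1_000, 500, 30.0, 60.0);
    println!("Total cost: ${:.4}", cost); // → Total cost: $0.0600
}
```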

### Rate Limiting

```rust
use edgequake_llm::{RateLimitedProvider, RateLimiterConfig};

let config = RateLimiterConfig {
    max_requests_per_minute: 60,
    max_tokens_per_minute: 100_000,
};

let limited = RateLimitedProvider::new(provider, config);
// Automatic rate limiting with exponential backoff
```
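
The exponential backoff mentioned above generally doubles the delay on each failed attempt up to a cap. A standalone sketch of that schedule (the base delay and cap are illustrative defaults, not the crate's actual configuration values):

```rust
use std::time::Duration;

/// Illustrative backoff schedule: base * 2^attempt, capped at `max_ms`.
fn backoff_delay(attempt: u32, base_ms: u64, max_ms: u64) -> Duration {
    let delay = base_ms.saturating_mul(1u64 << attempt.min(16));
    Duration::from_millis(delay.min(max_ms))
}

fn main() {
    // 250ms, 500ms, 1s, 2s, 4s — then capped at 8s.
    for attempt in 0..6 {
        println!("attempt {}: {:?}", attempt, backoff_delay(attempt, 250, 8_000));
    }
}
```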

## Provider Setup

### OpenAI

```bash
export OPENAI_API_KEY=sk-...
```

```rust
let provider = OpenAIProvider::new("your-key", "gpt-4");
// or
let provider = OpenAIProvider::from_env();
```

### Anthropic

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

```rust
let provider = AnthropicProvider::from_env();
```

### Gemini

```bash
export GOOGLE_API_KEY=...
```

```rust
let provider = GeminiProvider::from_env();
```

### OpenRouter

```bash
export OPENROUTER_API_KEY=sk-or-v1-...
```

```rust
let provider = OpenRouterProvider::new("your-key");
```

### Local Providers

```rust
// Ollama (assumes running on localhost:11434)
let provider = OllamaProvider::new("http://localhost:11434");

// LMStudio (assumes running on localhost:1234)
let provider = LMStudioProvider::new("http://localhost:1234");
```

## Advanced Features

### OpenTelemetry Integration

Enable the `otel` feature:

```toml
edgequake-llm = { version = "0.2", features = ["otel"] }
```

```rust
use edgequake_llm::TracingProvider;

let provider = OpenAIProvider::from_env();
let traced = TracingProvider::new(provider, "my-service");
// Automatic span creation and GenAI semantic conventions
```

### Reranking

```rust
use edgequake_llm::{BM25Reranker, Reranker};

let reranker = BM25Reranker::new();
let results = reranker.rerank(query, documents, top_k).await?;
```
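
### Testing with a Mock Provider

The feature list mentions a mock provider for unit tests. The general pattern is a trait object that returns canned responses, so code under test never touches the network. A simplified, synchronous sketch of the idea (the real `LLMProvider` trait is async and richer than this toy `Completer` trait, which is purely illustrative):

```rust
/// Toy, synchronous stand-in for a provider trait (illustrative only).
trait Completer {
    fn complete(&self, prompt: &str) -> String;
}

/// Mock that returns a fixed canned response for any prompt.
struct MockProvider {
    canned: String,
}

impl Completer for MockProvider {
    fn complete(&self, _prompt: &str) -> String {
        self.canned.clone()
    }
}

fn main() {
    let mock = MockProvider { canned: "Rust is a systems language.".into() };
    // Code under test accepts `&dyn Completer`, so tests stay offline.
    let answer = mock.complete("What is Rust?");
    assert_eq!(answer, "Rust is a systems language.");
    println!("ok");
}
```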

## Documentation

- [API Documentation](https://docs.rs/edgequake-llm)
- [Examples](examples/)
- [Provider Guides](docs/providers.md)
- [Caching Strategies](docs/caching.md)
- [Cost Tracking](docs/cost-tracking.md)

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License

Licensed under either of:

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT license ([LICENSE-MIT](LICENSE-MIT))

at your option.

## Credits

Extracted from the [EdgeCode](https://github.com/raphaelmansuy/edgecode) project, a Rust coding agent with OODA loop decision framework.