# OpenModex Rust SDK
The official Rust SDK for the [OpenModex](https://openmodex.com) AI Gateway API. Access 100+ models from OpenAI, Anthropic, Google, DeepSeek, Mistral, and Qwen through a single unified API with intelligent routing, automatic fallbacks, and built-in cost tracking.
## Features
- **Unified API** -- One client for all major LLM providers
- **Smart Routing** -- Automatic model selection optimized for cost, latency, or quality
- **Client-Side Fallbacks** -- Automatic retry with backup models on failure
- **Streaming** -- First-class SSE streaming via `futures::Stream`
- **Async/Await** -- Built on `tokio` + `reqwest` for high-performance async I/O
- **Type Safe** -- Strongly typed request/response structs with `serde`
- **Builder Pattern** -- Ergonomic request construction
- **Automatic Retries** -- Exponential backoff on 429/5xx errors
- **OpenAI Compatible** -- Drop-in replacement by changing `base_url`
## Requirements
- Rust 1.70+ (edition 2021)
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
openmodex = "0.1"
tokio = { version = "1", features = ["full"] }
futures = "0.3" # Only needed for streaming
```
## Quick Start
```rust
use openmodex::{OpenModex, ChatCompletionRequest, ChatMessage};

#[tokio::main]
async fn main() -> Result<(), openmodex::Error> {
    let client = OpenModex::new("omx_sk_...")?;

    let response = client.chat().completions().create(
        ChatCompletionRequest::new("gpt-4o")
            .message(ChatMessage::user("What is OpenModex?"))
    ).await?;

    let content = response.choices[0]
        .message.as_ref()
        .and_then(|m| m.content.as_deref())
        .unwrap_or("");
    println!("{content}");

    Ok(())
}
```
Or use the `OPENMODEX_API_KEY` environment variable:
```rust
let client = OpenModex::from_env()?;
```
## Usage
### Chat Completions
```rust
let response = client.chat().completions().create(
    ChatCompletionRequest::new("claude-3-5-sonnet")
        .message(ChatMessage::system("You are a helpful assistant."))
        .message(ChatMessage::user("Explain quantum computing in simple terms."))
        .temperature(0.7)
        .max_tokens(1000)
).await?;

println!("{}", response.choices[0]
    .message.as_ref()
    .and_then(|m| m.content.as_deref())
    .unwrap_or(""));
```
### Streaming
```rust
use futures::StreamExt;

let mut stream = client.chat().completions().create_stream(
    ChatCompletionRequest::new("gpt-4o")
        .message(ChatMessage::user("Write a short story."))
        .temperature(0.9)
        .max_tokens(512)
).await?;

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(content) = chunk.choices.first()
        .and_then(|c| c.delta.content.as_ref())
    {
        print!("{content}");
    }
}
```
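If you also want the complete text once streaming finishes, accumulate the deltas as they arrive. This is a variant of the loop above using only the types it already shows:

```rust
let mut full_text = String::new();
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(content) = chunk.choices.first()
        .and_then(|c| c.delta.content.as_ref())
    {
        print!("{content}");
        full_text.push_str(content); // keep a copy of the streamed deltas
    }
}
println!();
println!("Total length: {} chars", full_text.len());
```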
### Smart Routing (OpenModex Extension)
Let the gateway pick the best model or optimize for cost/latency:
```rust
use openmodex::RoutingConfig;

let response = client.chat().completions().create(
    ChatCompletionRequest::new("gpt-4o")
        .message(ChatMessage::user("Hello!"))
        .routing(RoutingConfig {
            strategy: Some("cost_optimized".into()),
            fallback: Some(vec!["claude-3-5-sonnet".into()]),
            allow_upgrade: Some(true),
        })
).await?;
```
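Note that the `fallback` list here is part of the request and is applied by the gateway itself; the SDK-level retry chain is covered separately under [Client-Side Fallbacks](#client-side-fallbacks) below.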
### OpenModex Metadata
Every response includes OpenModex-specific metadata:
```rust
let response = client.chat().completions().create(
    ChatCompletionRequest::new("gpt-4o")
        .message(ChatMessage::user("Hello!"))
).await?;

if let Some(meta) = &response.openmodex {
    println!("Provider: {}", meta.provider);
    println!("Model used: {}", meta.model_used);
    println!("Cache hit: {}", meta.cache_hit);
    println!("Routing: {}", meta.routing_strategy);
    println!("Latency: {}ms", meta.latency_ms);
    println!("Request ID: {}", meta.request_id);
}
```
### Cache Control
```rust
use openmodex::CacheConfig;

let response = client.chat().completions().create(
    ChatCompletionRequest::new("gpt-4o")
        .message(ChatMessage::user("What is 2+2?"))
        .cache(CacheConfig {
            enabled: Some(true),
            ttl: Some(3600),
        })
).await?;
```
### Client-Side Fallbacks
Automatically retry with backup models on failure:
```rust
let client = OpenModex::builder()
    .api_key("omx_sk_...")
    .fallback_models(vec![
        "gpt-4o".into(),
        "claude-3-5-sonnet".into(),
        "gemini-1.5-pro".into(),
    ])
    .build()?;

// If gpt-4o fails (5xx/timeout), automatically tries the next model
let response = client.chat().completions().create(
    ChatCompletionRequest::new("gpt-4o")
        .message(ChatMessage::user("Hello!"))
).await?;
```
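To see whether a fallback actually fired, you can inspect the response metadata described above (assuming the gateway populates it for fallback requests as well):

```rust
// Check which model ultimately served the request
if let Some(meta) = &response.openmodex {
    println!("Served by {} via {}", meta.model_used, meta.provider);
}
```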
### Embeddings
```rust
use openmodex::EmbeddingRequest;
let response = client.embeddings().create(
EmbeddingRequest::new(
"text-embedding-3-small",
"The quick brown fox jumps over the lazy dog.",
)
).await?;
println!("Dimensions: {}", response.data[0].embedding.len());
```
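Embeddings are typically compared with cosine similarity. The helper below is a plain-Rust sketch, not part of the SDK; it assumes the embedding components are `f32`, and `other` stands in for a second `EmbeddingResponse` obtained the same way:

```rust
/// Cosine similarity between two embedding vectors (local helper, not SDK API).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

let a = &response.data[0].embedding;
let b = &other.data[0].embedding; // a second embedding response
println!("Similarity: {:.4}", cosine_similarity(a, b));
```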
### Models
```rust
// List all available models
let models = client.models().list().await?;
for m in &models.data {
    println!("{} ({})", m.id, m.provider);
}

// Get a specific model
let model = client.models().get("openai/gpt-4o").await?;
println!("{}: {}", model.name, model.description);

// Compare models side by side
let comparison = client.models()
    .compare(&["openai/gpt-4o", "anthropic/claude-3-5-sonnet"])
    .await?;
if let Some(highlights) = &comparison.highlights {
    println!("Cheapest: {}", highlights.cheapest);
    println!("Best quality: {}", highlights.best_quality);
}
```
### Legacy Completions
```rust
use openmodex::CompletionRequest;

let response = client.completions().create(
    CompletionRequest::new("gpt-3.5-turbo-instruct", "Once upon a time")
        .max_tokens(100)
).await?;

println!("{}", response.choices[0].text);
```
### Error Handling
```rust
use openmodex::Error;

match client.chat().completions().create(req).await {
    Ok(response) => println!("{:?}", response),
    Err(Error::Api(e)) => {
        println!("API error: {} (status: {})", e.message, e.status_code);
        if e.is_rate_limited() {
            println!("Rate limited -- back off and retry");
        }
        if e.is_auth_error() {
            println!("Check your API key");
        }
    }
    Err(e) => println!("Other error: {e}"),
}
```
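If you handle rate limits yourself instead of relying on the built-in retries, a simple exponential backoff loop might look like the sketch below. It assumes it runs inside a function returning `Result<_, openmodex::Error>` and that `ChatCompletionRequest` implements `Clone`:

```rust
use std::time::Duration;
use openmodex::Error;

let mut delay = Duration::from_millis(500);
let response = loop {
    match client.chat().completions().create(req.clone()).await {
        Ok(resp) => break resp,
        Err(Error::Api(e)) if e.is_rate_limited() => {
            // Exponential backoff before retrying, capped at 8s
            tokio::time::sleep(delay).await;
            delay = (delay * 2).min(Duration::from_secs(8));
        }
        Err(e) => return Err(e),
    }
};
```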
## Configuration
| Option | Description | Default |
| --- | --- | --- |
| `.api_key(key)` | Your OpenModex API key | `OPENMODEX_API_KEY` env var |
| `.base_url(url)` | API base URL | `https://api.openmodex.com/v1` |
| `.timeout(duration)` | Request timeout | `30s` |
| `.max_retries(n)` | Max retry attempts on transient errors | `2` |
| `.default_model(model)` | Default model when none specified | `None` |
| `.fallback_models(models)` | Ordered fallback model chain | `[]` |
| `.default_headers(headers)` | Headers sent with every request | `{}` |
All options are set via `ClientBuilder`:
```rust
use std::time::Duration;

let client = OpenModex::builder()
    .api_key("omx_sk_...")
    .base_url("https://api.openmodex.com/v1")
    .timeout(Duration::from_secs(60))
    .max_retries(3)
    .default_model("gpt-4o")
    .fallback_models(vec!["claude-3-5-sonnet".into()])
    .build()?;
```
## OpenAI SDK Compatibility
OpenModex Gateway supports drop-in compatibility. If you are already using an OpenAI-compatible Rust crate, you can route through OpenModex by changing just the base URL:
```rust
// With the `async-openai` crate
let config = async_openai::config::OpenAIConfig::new()
    .with_api_base("https://api.openmodex.com/v1")
    .with_api_key("omx_sk_live_...");
let client = async_openai::Client::with_config(config);
```
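From there, requests use the normal `async-openai` types. Exact names vary by crate version; the sketch below follows the 0.x builder API:

```rust
use async_openai::types::{
    ChatCompletionRequestUserMessageArgs, CreateChatCompletionRequestArgs,
};

// Build a standard OpenAI-style chat request; it is routed through OpenModex
let request = CreateChatCompletionRequestArgs::default()
    .model("gpt-4o")
    .messages([ChatCompletionRequestUserMessageArgs::default()
        .content("What is OpenModex?")
        .build()?
        .into()])
    .build()?;

let response = client.chat().create(request).await?;
println!("{:?}", response.choices[0].message.content);
```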
## Examples
See the [examples/](examples/) directory for runnable examples:
```bash
# Set your API key
export OPENMODEX_API_KEY="omx_sk_..."
# Run examples
cargo run --example quickstart
cargo run --example streaming
cargo run --example models
cargo run --example fallback
```
- [quickstart](examples/quickstart.rs) -- Basic chat completion
- [streaming](examples/streaming.rs) -- SSE streaming
- [models](examples/models.rs) -- List, retrieve, and compare models
- [fallback](examples/fallback.rs) -- Client-side fallback chain
## License
[MIT](LICENSE)