Crate llm_latency_lens_providers

Provider adapters for LLM Latency Lens

This crate provides production-ready adapters for various LLM providers, enabling high-precision latency measurements and streaming token analysis.

§Features

  • OpenAI: Full implementation with GPT-4, GPT-4o, and GPT-3.5 support
  • Anthropic: Complete Claude integration with extended thinking support
  • Google: Stub implementation for Gemini (coming soon)
  • Streaming: Server-Sent Events (SSE) with fine-grained token timing
  • Retries: Automatic retry logic with exponential backoff
  • Cost Calculation: Accurate pricing for all supported models
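
The retry behavior listed above can be illustrated with a minimal exponential-backoff sketch. This is not the crate's actual implementation; the function name and the base/cap constants are hypothetical placeholders.

```rust
use std::time::Duration;

/// Compute the delay before the n-th retry (0-based) using
/// exponential backoff: base * 2^attempt, capped at a maximum.
/// The base and cap values are illustrative, not the crate's defaults.
fn backoff_delay(attempt: u32) -> Duration {
    let base = Duration::from_millis(500);
    let cap = Duration::from_secs(30);
    let delay = base * 2u32.saturating_pow(attempt);
    delay.min(cap)
}

fn main() {
    for attempt in 0..6 {
        println!("retry {attempt}: wait {:?}", backoff_delay(attempt));
    }
}
```

Capping the delay keeps late retries from growing unboundedly while early retries stay cheap.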

§Example

use llm_latency_lens_providers::{
    openai::OpenAIProvider,
    traits::{Provider, StreamingRequest, MessageRole},
};
use llm_latency_lens_core::TimingEngine;
use futures::StreamExt;

// The calls below run inside an async context (e.g. an async fn
// returning a Result) so that `.await` and `?` are available.

// Create provider
let provider = OpenAIProvider::new("sk-...");

// Create timing engine
let timing = TimingEngine::new();

// Build request
let request = StreamingRequest::builder()
    .model("gpt-4o")
    .message(MessageRole::User, "Explain quantum computing")
    .max_tokens(500)
    .temperature(0.7)
    .build();

// Stream response
let mut response = provider.stream(request, &timing).await?;

// Process tokens
while let Some(token) = response.token_stream.next().await {
    let event = token?;
    println!("Token {}: {:?} (latency: {:?})",
        event.sequence,
        event.content,
        event.inter_token_latency
    );
}

§Provider Implementations

§OpenAI

The OpenAI provider supports all GPT models with comprehensive streaming and timing capabilities:

use llm_latency_lens_providers::openai::OpenAIProvider;

let provider = OpenAIProvider::builder()
    .api_key("sk-...")
    .organization("org-...")  // Optional
    .max_retries(3)
    .build();

§Anthropic

The Anthropic provider supports all Claude models including extended thinking mode:

use llm_latency_lens_providers::anthropic::AnthropicProvider;

let provider = AnthropicProvider::builder()
    .api_key("sk-ant-...")
    .api_version("2023-06-01")
    .max_retries(3)
    .build();

§Google

The Google provider is currently a stub. Full implementation coming soon:

use llm_latency_lens_providers::google::GoogleProvider;

let provider = GoogleProvider::new("AIza...");
// Note: stream() will return an error until implemented

§Error Handling

All providers use a comprehensive error type that distinguishes between retryable and non-retryable errors:

use llm_latency_lens_providers::error::ProviderError;

match provider.stream(request, &timing).await {
    Ok(response) => {
        // Process response
    }
    Err(e) => {
        if e.is_retryable() {
            eprintln!("Retryable error: {}", e);
            if let Some(delay) = e.retry_delay() {
                eprintln!("Retry after {} seconds", delay);
            }
        } else {
            eprintln!("Fatal error: {}", e);
        }
    }
}

§Timing Measurements

All providers integrate with the core timing engine to provide nanosecond-precision measurements:

  • TTFT (Time to First Token): Critical for perceived responsiveness
  • Inter-token latency: Consistency of token generation
  • Total generation time: Overall performance
  • Network timing: DNS, TLS, connection establishment
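
As a rough illustration of how these metrics relate, the sketch below derives TTFT, inter-token latencies, and total time from a list of token arrival offsets. The function is hypothetical; in practice the crate's TimingEngine records these measurements for you.

```rust
use std::time::Duration;

/// Given token arrival offsets measured from the request start
/// (assumes at least one token arrived), compute TTFT, the gap
/// between each consecutive pair of tokens, and total time.
fn analyze(arrivals: &[Duration]) -> (Duration, Vec<Duration>, Duration) {
    let ttft = arrivals[0];
    let gaps = arrivals.windows(2).map(|w| w[1] - w[0]).collect();
    let total = *arrivals.last().unwrap();
    (ttft, gaps, total)
}

fn main() {
    // Hypothetical token arrivals, measured from request start.
    let arrivals = [
        Duration::from_millis(250),
        Duration::from_millis(300),
        Duration::from_millis(360),
    ];
    let (ttft, gaps, total) = analyze(&arrivals);
    println!("TTFT: {ttft:?}, gaps: {gaps:?}, total: {total:?}");
}
```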

§Cost Calculation

Each provider implements accurate cost calculation based on current pricing (as of 2024):

use llm_latency_lens_providers::openai::OpenAIProvider;
use llm_latency_lens_providers::traits::Provider;

let provider = OpenAIProvider::new("sk-...");

// Calculate cost for GPT-4o from input and output token counts
let cost = provider.calculate_cost("gpt-4o", 1000, 2000);
if let Some(usd) = cost {
    println!("Estimated cost: ${:.6}", usd);
}
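
Under the hood, a cost calculation of this kind typically multiplies token counts by per-million-token prices. A minimal sketch, assuming that pricing model; the function name and the dollar figures are placeholders, not the crate's actual pricing tables.

```rust
/// Estimate cost in USD from token counts and per-million-token prices.
/// The prices passed in are illustrative placeholders.
fn estimate_cost(
    input_tokens: u64,
    output_tokens: u64,
    input_usd_per_m: f64,
    output_usd_per_m: f64,
) -> f64 {
    (input_tokens as f64 / 1_000_000.0) * input_usd_per_m
        + (output_tokens as f64 / 1_000_000.0) * output_usd_per_m
}

fn main() {
    // e.g. $2.50 per 1M input tokens, $10.00 per 1M output tokens
    let usd = estimate_cost(1_000, 2_000, 2.50, 10.00);
    println!("Estimated cost: ${usd:.6}");
}
```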

Re-exports§

pub use error::ProviderError;
pub use error::Result;
pub use traits::CompletionResult;
pub use traits::Message;
pub use traits::MessageRole;
pub use traits::Provider;
pub use traits::ResponseMetadata;
pub use traits::StreamingRequest;
pub use traits::StreamingResponse;
pub use anthropic::AnthropicProvider;
pub use google::GoogleProvider;
pub use openai::OpenAIProvider;

Modules§

anthropic
Anthropic (Claude) provider implementation
error
Error types for LLM provider adapters
google
Google (Gemini) provider stub
openai
OpenAI provider implementation
traits
Provider trait definitions

Constants§

VERSION
Version of the providers crate

Functions§

create_provider
Create a provider from a string identifier
supported_providers
List all supported providers