LLM Observatory Rust SDK
Production-ready Rust SDK for LLM Observatory with trait-based instrumentation, automatic cost tracking, and OpenTelemetry integration.
Features
- Automatic Instrumentation: Built-in OpenTelemetry tracing for all LLM operations
- Cost Tracking: Real-time cost calculation based on token usage and model pricing
- Provider Support: OpenAI, Anthropic, and extensible trait system for custom providers
- Async/Await: Full async support with Tokio runtime
- Type Safety: Strong typing with comprehensive error handling
- Streaming: Support for streaming completions (where available)
- Zero Configuration: Sensible defaults with optional customization
Quick Start
Add the SDK to your Cargo.toml:
```toml
[dependencies]
# Crate name is illustrative; check crates.io for the published name.
llm-observatory = "0.1"
tokio = { version = "1.40", features = ["full"] }
```
Basic Example
A minimal end-to-end example, assuming the crate is imported as llm_observatory and exposes the builder, client, and request types used throughout this README (exact names and paths may differ):
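```rust
use llm_observatory::{LLMObservatory, OpenAIClient, ChatRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize tracing and OTLP export once at startup.
    let observatory = LLMObservatory::builder()
        .with_service_name("quickstart")
        .build()?;

    // Wrap the provider client so every call is instrumented.
    let client = OpenAIClient::new(std::env::var("OPENAI_API_KEY")?)
        .with_observatory(observatory);

    let request = ChatRequest::new("gpt-4o-mini")
        .with_system("You are a helpful assistant.")
        .with_user("Say hello in one sentence.");

    let response = client.chat_completion(request).await?;
    println!("{}", response.content());

    Ok(())
}
```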
Installation
Prerequisites
- Rust 1.75.0 or later
- OpenTelemetry Collector for trace export (see the note below for a quick local setup)
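For local development, one quick way to run a collector is via Docker (assumes Docker is installed; the default collector image exposes OTLP gRPC on port 4317):

```bash
docker run --rm -p 4317:4317 otel/opentelemetry-collector:latest
```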
From Crates.io
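Assuming the crate is published as llm-observatory (name illustrative):

```bash
cargo add llm-observatory
```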
From Source
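Clone the repository and build with cargo (the SDK's exact location within the repository may differ):

```bash
git clone https://github.com/llm-observatory/llm-observatory.git
cd llm-observatory
cargo build --release
```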
Usage
Initialize the Observatory
The observatory manages OpenTelemetry setup and configuration:
```rust
// Argument values below are illustrative.
let observatory = LLMObservatory::builder()
    .with_service_name("my-service")
    .with_otlp_endpoint("http://localhost:4317")
    .with_environment("production")
    .with_sampling_rate(1.0) // Sample 100% of traces
    .with_attribute("team", "ml-platform")
    .build()?;
```
Create an Instrumented Client
OpenAI
```rust
// Client type name is illustrative; the API key can also be read from OPENAI_API_KEY.
let client = OpenAIClient::new(std::env::var("OPENAI_API_KEY")?)
    .with_observatory(observatory);
```
Custom Configuration
```rust
use std::time::Duration;
use llm_observatory::providers::openai::OpenAIConfig; // module path assumed

let config = OpenAIConfig::new(std::env::var("OPENAI_API_KEY")?)
    .with_base_url("https://api.openai.com/v1")
    .with_timeout(Duration::from_secs(30))
    .with_organization("org-...");

let client = OpenAIClient::with_config(config)
    .with_observatory(observatory);
```
Make Requests
Simple Request
```rust
// Request type and model name are illustrative.
let request = ChatRequest::new("gpt-4o-mini")
    .with_system("You are a helpful assistant.")
    .with_user("What is the capital of France?");

let response = client.chat_completion(request).await?;
```
Advanced Request
```rust
// Message constructors and argument values are illustrative.
let request = ChatRequest::new("gpt-4o")
    .with_message(Message::system("You are a concise assistant."))
    .with_message(Message::user("Summarize the notes below."))
    .with_message(Message::assistant("Sure, here is a summary: ..."))
    .with_message(Message::user("Shorten it to one sentence."))
    .with_temperature(0.7)
    .with_max_tokens(256)
    .with_top_p(0.9)
    .with_frequency_penalty(0.0)
    .with_user_id("user-123")
    .with_metadata("session_id", "abc-123")
    .with_metadata("feature", "summarization");

let response = client.chat_completion(request).await?;
```
Cost Tracking
Calculate Costs
```rust
use llm_observatory::cost::{calculate_cost, estimate_cost, Usage}; // paths and signatures assumed

// Calculate actual cost from token usage
let usage = Usage::new(1_000, 250); // prompt tokens, completion tokens (illustrative)
let cost = calculate_cost("gpt-4o", &usage)?;
println!("Actual cost: ${:.4}", cost);

// Estimate cost before making the request
let estimated_cost = estimate_cost("gpt-4o", 1_000, 250)?;
println!("Estimated cost: ${:.4}", estimated_cost);
```
Track Cumulative Costs
```rust
use llm_observatory::cost::CostTracker; // path and method names assumed

let mut tracker = CostTracker::new();

// Make multiple requests and record each response
for request in requests {
    let response = client.chat_completion(request).await?;
    tracker.record(&response);
}

// View statistics
println!("Total cost: ${:.4}", tracker.total_cost());
println!("Requests: {}", tracker.request_count());
println!("Average cost: ${:.4}", tracker.average_cost());
```
Error Handling
All operations return the SDK's Error type, so provider failures can be handled individually. A sketch of matching on it, assuming variants such as RateLimited and InvalidRequest (variant and accessor names are illustrative):
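```rust
use llm_observatory::Error; // variant names below are illustrative

match client.chat_completion(request).await {
    Ok(response) => println!("{}", response.content()),
    Err(Error::RateLimited { retry_after }) => {
        eprintln!("rate limited, retry after {retry_after:?}");
    }
    Err(Error::InvalidRequest(message)) => {
        eprintln!("bad request: {message}");
    }
    Err(err) => {
        eprintln!("request failed: {err}");
    }
}
```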
Custom Attributes & Metadata
// Add custom attributes to the observatory
let observatory = builder
.with_service_name
.with_attribute
.with_attribute
.build?;
// Add metadata to requests
let request = new
.with_user
.with_user_id
.with_metadata
.with_metadata;
let response = client.chat_completion.await?;
// Metadata is preserved in the response
if let Some = response.metadata.get
Architecture
The SDK is built around several core concepts:
LLMObservatory
Central manager for OpenTelemetry setup and tracer management. Handles:
- OTLP exporter configuration
- Resource attributes
- Sampling strategies
- Tracer lifecycle
InstrumentedLLM Trait
Provider-agnostic trait implemented by every instrumented client. A rough sketch of its shape (method names and signatures are illustrative):
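```rust
// A sketch only; the real trait may differ.
pub trait InstrumentedLLM {
    /// Provider name recorded as gen_ai.system (e.g. "openai").
    fn provider(&self) -> &str;

    /// Send a chat completion request; implementations wrap the call in an
    /// OpenTelemetry span and record token usage, cost, and latency.
    async fn chat_completion(&self, request: ChatRequest) -> Result<ChatResponse, Error>;
}
```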
Automatic Instrumentation
Every LLM call automatically (a conceptual sketch follows this list):
- Creates an OpenTelemetry span
- Records request parameters
- Tracks token usage
- Calculates costs
- Measures latency
- Records errors
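Conceptually, the instrumentation wraps each provider call along these lines. This is a simplified sketch using the opentelemetry crate; the raw_chat_completion call, the request/response types, and the exact attribute set are illustrative, not the SDK's actual internals:

```rust
use opentelemetry::{global, trace::{Span, Tracer}, KeyValue};

async fn instrumented_chat(request: ChatRequest) -> Result<ChatResponse, Error> {
    let tracer = global::tracer("llm-observatory");
    let mut span = tracer.start("chat_completion");
    span.set_attribute(KeyValue::new("gen_ai.system", "openai"));

    let started = std::time::Instant::now();
    let result = raw_chat_completion(request).await; // hypothetical un-instrumented call

    match &result {
        Ok(response) => {
            // Token usage (and derived cost) become span attributes.
            span.set_attribute(KeyValue::new(
                "gen_ai.usage.prompt_tokens",
                response.usage.prompt_tokens as i64,
            ));
            span.set_attribute(KeyValue::new(
                "gen_ai.usage.completion_tokens",
                response.usage.completion_tokens as i64,
            ));
        }
        Err(err) => span.record_error(err),
    }
    span.set_attribute(KeyValue::new(
        "latency_ms",
        started.elapsed().as_millis() as i64,
    ));
    span.end();
    result
}
```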
OpenTelemetry Integration
The SDK follows the OpenTelemetry GenAI Semantic Conventions.
Span Attributes
- gen_ai.system: Provider name (e.g., "openai")
- gen_ai.request.model: Model identifier
- gen_ai.request.temperature: Temperature setting
- gen_ai.request.max_tokens: Max tokens setting
- gen_ai.usage.prompt_tokens: Prompt token count
- gen_ai.usage.completion_tokens: Completion token count
- gen_ai.cost.usd: Cost in USD
Metrics
- Token usage per request
- Cost per request
- Latency (total and time to first token, TTFT)
- Error rates by type
Examples
The SDK includes comprehensive examples; run them with cargo (example names below are illustrative, check the examples/ directory for the actual list):

```bash
# Basic usage
cargo run --example basic_usage
# Streaming completions
cargo run --example streaming
# Custom attributes
cargo run --example custom_attributes
# Error handling
cargo run --example error_handling
# Cost tracking
cargo run --example cost_tracking
```
Testing
Run unit tests:
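```bash
cargo test
```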
Run integration tests (requires API keys):
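A typical invocation, assuming the integration tests are marked #[ignore] (adjust to the crate's actual test layout):

```bash
OPENAI_API_KEY=sk-... cargo test -- --ignored
```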
Configuration
Environment Variables
- OPENAI_API_KEY: OpenAI API key
- OTEL_EXPORTER_OTLP_ENDPOINT: OTLP endpoint (default: http://localhost:4317)
- OTEL_SERVICE_NAME: Service name for tracing
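For example, in a shell (values are placeholders):

```bash
export OPENAI_API_KEY=sk-...
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_SERVICE_NAME=my-service
```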
Sampling
Control trace sampling to reduce overhead:
```rust
let observatory = LLMObservatory::builder()
    .with_service_name("my-service")
    .with_sampling_rate(0.1) // Sample 10% of traces
    .build()?;
```
Performance
The SDK is designed for production use with minimal overhead:
- Async/await for non-blocking I/O
- Connection pooling via reqwest
- Efficient span creation
- Batched OTLP exports
Roadmap
- Streaming support for OpenAI
- Anthropic provider implementation
- Google Gemini provider
- Request retries with exponential backoff
- Circuit breaker pattern
- Rate limiting
- Response caching
- Token counting utilities
- Prompt template support
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Support
- Documentation: https://docs.llm-observatory.io
- Issues: https://github.com/llm-observatory/llm-observatory/issues
- Discussions: https://github.com/llm-observatory/llm-observatory/discussions
Acknowledgments
Built with:
- OpenTelemetry Rust
- Tokio
- reqwest