Siumai - Unified LLM Interface Library for Rust
Siumai (烧卖) is a unified LLM interface library for Rust that provides a consistent API across multiple AI providers. It features capability-based trait separation, type-safe parameter handling, and comprehensive streaming support.
Two Ways to Use Siumai
Siumai offers two distinct approaches to fit your needs:
- Provider - For provider-specific clients with access to all features
- Siumai::builder() - For unified interface with provider-agnostic code
Choose Provider when you need provider-specific features, or Siumai::builder() when you want maximum portability.
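As a rough sketch of the trade-off (builder methods as used in the examples below; key and model values are placeholders), the unified builder keeps the provider choice in one place:

use siumai::prelude::*;

// Everything downstream only sees the unified client type.
let client = Siumai::builder()
    .openai()                 // swap for .anthropic() or .ollama() without touching other code
    .api_key("your-api-key")  // placeholder
    .model("gpt-4o")          // placeholder model name
    .build()
    .await?;

let response = client.chat(vec![ChatMessage::user("Hello!").build()]).await?;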
Features
- Multi-Provider Support: OpenAI, Anthropic Claude, Google Gemini, Ollama, and custom providers
- Capability-Based Design: Separate traits for chat, audio, vision, tools, and embeddings
- Builder Pattern: Fluent API with method chaining for easy configuration
- Streaming Support: Full streaming capabilities with event processing
- Type Safety: Leverages Rust's type system for compile-time safety
- Parameter Mapping: Automatic translation between common and provider-specific parameters
- HTTP Customization: Support for custom reqwest clients and HTTP configurations
- Multimodal: Support for text, images, and audio content
- Async/Await: Built on tokio for high-performance async operations
- Retry Mechanisms: Intelligent retry with exponential backoff and jitter
- Error Handling: Advanced error classification with recovery suggestions
- Parameter Validation: Cross-provider parameter validation and optimization
Quick Start
Add Siumai to your Cargo.toml:
[dependencies]
siumai = "0.7.0"
tokio = { version = "1.0", features = ["full"] }
Provider-Specific Clients
Use Provider when you need access to provider-specific features:
use siumai::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Provider-specific OpenAI client (the key and model name are placeholders)
    let client = Provider::openai()
        .api_key("your-api-key")
        .model("gpt-4o")
        .temperature(0.7)
        .build()
        .await?;

    let messages = vec![ChatMessage::user("Hello, Siumai!").build()];
    let response = client.chat(messages).await?;
    println!("{}", response.content_text().unwrap_or_default());
    Ok(())
}
Unified Interface
Use Siumai::builder() when you want provider-agnostic code:
use siumai::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Unified client: the same code works for any supported provider
    // (the key and model name are placeholders)
    let client = Siumai::builder()
        .openai()
        .api_key("your-api-key")
        .model("gpt-4o")
        .temperature(0.7)
        .build()
        .await?;

    let messages = vec![ChatMessage::user("Hello, Siumai!").build()];
    let response = client.chat(messages).await?;
    println!("{}", response.content_text().unwrap_or_default());
    Ok(())
}
Multimodal Messages
use siumai::prelude::*;

// Create a message with text and an image - use the builder for complex messages
// (the image URL, detail level, and request-builder entry point are illustrative)
let message = ChatMessage::user("What is in this picture?")
    .with_image("https://example.com/photo.jpg".to_string(), Some("high".to_string()))
    .build();

let request = ChatRequest::builder()
    .message(message)
    .build();
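To actually send it, the built message can go through any chat-capable client. A minimal sketch, assuming a client built as in the Quick Start and the chat method used elsewhere in this README:

// Send the multimodal message with an existing chat client
let response = client.chat(vec![message]).await?;
println!("{}", response.content_text().unwrap_or_default());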
Streaming
use siumai::prelude::*;
use futures::StreamExt;

// Create a streaming request (client and messages built as in the Quick Start;
// the optional tools argument to chat_stream is shown as None here)
let stream = client.chat_stream(messages, None).await?;

// Process stream events and collect them into a complete response
let response = collect_stream_response(stream).await?;
println!("{}", response.content_text().unwrap_or_default());
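You can also consume the stream event by event instead of collecting it all at once. A minimal sketch, assuming chat_stream takes the messages plus an optional tools argument and yields Result items whose event type implements Debug:

use futures::StreamExt;

let mut stream = client.chat_stream(messages, None).await?;
while let Some(event) = stream.next().await {
    match event {
        Ok(ev) => println!("event: {ev:?}"),      // content deltas, tool calls, etc.
        Err(e) => eprintln!("stream error: {e}"),
    }
}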
Architecture
Siumai uses a capability-based architecture that separates different AI functionalities:
Core Traits
- ChatCapability: Basic chat functionality
- AudioCapability: Text-to-speech and speech-to-text
- VisionCapability: Image analysis and generation
- ToolCapability: Function calling and tool usage
- EmbeddingCapability: Text embeddings
Provider-Specific Traits
- OpenAiCapability: OpenAI-specific features (structured output, batch processing)
- AnthropicCapability: Anthropic-specific features (prompt caching, thinking mode)
- GeminiCapability: Google Gemini-specific features (search integration, code execution)
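Because the capabilities are ordinary traits, application code can stay generic over whichever provider implements them. A minimal sketch, assuming ChatCapability, ChatMessage, and the async chat method are exported as used elsewhere in this README (exact signatures may differ):

use siumai::prelude::*;

// Works with any client that implements ChatCapability, regardless of provider
async fn summarize<C: ChatCapability>(
    client: &C,
    text: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    let prompt = format!("Summarize the following text:\n{text}");
    let messages = vec![ChatMessage::user(prompt.as_str()).build()];
    let response = client.chat(messages).await?;
    Ok(response.content_text().unwrap_or_default().to_string())
}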
Examples
Different Providers
Provider-Specific Clients
// API keys and model names below are placeholders - substitute your own.

// OpenAI - with provider-specific features
let openai_client = Provider::openai()
    .api_key("your-openai-key")
    .model("gpt-4o")
    .temperature(0.7)
    .build()
    .await?;

// Anthropic - with provider-specific features
let anthropic_client = Provider::anthropic()
    .api_key("your-anthropic-key")
    .model("claude-3-5-sonnet-20241022")
    .temperature(0.7)
    .build()
    .await?;

// Ollama - with provider-specific features
let ollama_client = Provider::ollama()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .temperature(0.7)
    .build()
    .await?;
Unified Interface
// OpenAI through unified interface
let openai_unified = builder
.openai
.api_key
.model
.temperature
.build
.await?;
// Anthropic through unified interface
let anthropic_unified = builder
.anthropic
.api_key
.model
.temperature
.build
.await?;
// Ollama through unified interface
let ollama_unified = builder
.ollama
.base_url
.model
.temperature
.build
.await?;
Custom HTTP Client
use std::time::Duration;

// Build a custom reqwest HTTP client (requires the reqwest crate; values are illustrative)
let custom_client = reqwest::Client::builder()
    .timeout(Duration::from_secs(30))
    .user_agent("my-app/1.0")
    .build()?;

// With provider-specific client
// (the custom reqwest client is supplied through the builder's HTTP-client option;
//  see the crate docs for the exact method)
let client = Provider::openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .build()
    .await?;

// With unified interface
let unified_client = Siumai::builder()
    .openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .build()
    .await?;
Provider-Specific Features
// Keys, model names, penalties, and budgets below are placeholder values.

// OpenAI with structured output (provider-specific client)
let openai_client = Provider::openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .response_format(ResponseFormat::JsonObject) // exact response-format type is illustrative
    .frequency_penalty(0.5)
    .build()
    .await?;

// Anthropic with caching (provider-specific client)
let anthropic_client = Provider::anthropic()
    .api_key("your-api-key")
    .model("claude-3-5-sonnet-20241022")
    .cache_control(CacheControl::Ephemeral) // exact cache-control type is illustrative
    .thinking_budget(8192)
    .build()
    .await?;

// Ollama with local model management (provider-specific client)
let ollama_client = Provider::ollama()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .keep_alive("10m") // keep the model loaded in memory
    .num_ctx(8192)     // context window size
    .num_gpu(32)       // GPU layers to use
    .build()
    .await?;

// Unified interface (common parameters only)
let unified_client = Siumai::builder()
    .openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .temperature(0.7)
    .max_tokens(1000)
    .build()
    .await?;
Advanced Features
Parameter Validation and Optimization
use siumai::params::EnhancedParameterValidator; // module path is illustrative

// Common parameters shared across providers (values are illustrative,
// assuming CommonParams derives Default)
let params = CommonParams {
    model: "gpt-4o".to_string(),
    temperature: Some(0.7),
    max_tokens: Some(1000),
    ..Default::default()
};

// Validate parameters for a specific provider
// (the provider identifier is shown as a plain string; the actual parameter type may differ)
let validation_result = EnhancedParameterValidator::validate_for_provider(&params, "openai")?;

// Optimize parameters for better performance
let mut optimized_params = params.clone();
let optimization_report =
    EnhancedParameterValidator::optimize_for_provider(&mut optimized_params, "openai");
Retry Mechanisms
use std::time::Duration;
use siumai::retry::{RetryExecutor, RetryPolicy}; // type names and module path are illustrative

// Retry with exponential backoff (values are illustrative)
let policy = RetryPolicy::new()
    .with_max_attempts(3)
    .with_initial_delay(Duration::from_millis(500))
    .with_backoff_multiplier(2.0);

// Run an operation under the retry policy
// (the closure-based execute call is illustrative; client and messages come from earlier examples)
let executor = RetryExecutor::new(policy);
let result = executor
    .execute(|| async { client.chat(messages.clone()).await })
    .await?;
Error Handling and Classification
use siumai::prelude::*;

// Inspect the error instead of bubbling it up with `?`
// (the tools argument is shown as None; the error type also carries
//  classification and recovery hints - see the crate docs)
match client.chat_with_tools(messages, None).await {
    Ok(response) => println!("{}", response.content_text().unwrap_or_default()),
    Err(error) => eprintln!("chat failed: {error}"),
}
Configuration
Common Parameters
All providers support these common parameters:
- model: Model name
- temperature: Randomness (0.0-2.0)
- max_tokens: Maximum output tokens
- top_p: Nucleus sampling parameter
- stop_sequences: Stop generation sequences
- seed: Random seed for reproducibility
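These are the same fields carried by the CommonParams struct used in the parameter-validation example above. A minimal sketch with illustrative values; the Option wrappers and the Default derive are assumptions about the exact field types:

use siumai::prelude::*;

// All six common parameters in one place
let params = CommonParams {
    model: "gpt-4o".to_string(),                   // model name (placeholder)
    temperature: Some(0.7),                        // randomness, 0.0-2.0
    max_tokens: Some(1000),                        // maximum output tokens
    top_p: Some(0.9),                              // nucleus sampling
    stop_sequences: Some(vec!["END".to_string()]), // stop generation sequences
    seed: Some(42),                                // random seed for reproducibility
    ..Default::default()
};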
Provider-Specific Parameters
Each provider can have additional parameters:
OpenAI:
- response_format: Output format control
- tool_choice: Tool selection strategy
- frequency_penalty: Frequency penalty
- presence_penalty: Presence penalty
Anthropic:
- cache_control: Prompt caching settings
- thinking_budget: Thinking process budget
- system: System message handling
Ollama:
- keep_alive: Model memory duration
- raw: Bypass templating
- format: Output format (json, etc.)
- numa: NUMA support
- num_ctx: Context window size
- num_gpu: GPU layers to use
Ollama Local AI Examples
Basic Chat with Local Model
use siumai::prelude::*;

// Connect to local Ollama instance (default URL; use any locally pulled model)
let client = Provider::ollama()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .temperature(0.7)
    .build()
    .await?;

let messages = vec![ChatMessage::user("Why is the sky blue?").build()];
let response = client.chat_with_tools(messages, None).await?; // no tools in this example
println!("{}", response.content_text().unwrap_or_default());
Advanced Ollama Configuration
use siumai::prelude::*;
use futures::StreamExt;

// The OllamaConfig/OllamaClient type names, module paths, and option values
// below are illustrative; the builder methods follow the crate's Ollama options.
let config = OllamaConfig::builder()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .keep_alive("10m")               // Keep model in memory
    .num_ctx(8192)                   // Context window
    .num_gpu(32)                     // Use GPU acceleration
    .numa(true)                      // Enable NUMA
    .think(true)                     // Enable thinking mode for thinking models
    .option("repeat_penalty", "1.1") // Extra raw Ollama option
    .build()?;

let client = OllamaClient::new_with_config(config);

// Generate text with streaming
let mut stream = client.generate_stream("Tell me a short story".to_string()).await?;
while let Some(chunk) = stream.next().await {
    println!("{chunk:?}"); // handle each streamed chunk/event
}
Thinking Models with Ollama
use siumai::prelude::*;

// Use thinking models like DeepSeek-R1
// (the builder entry point and model name are illustrative)
let client = LlmBuilder::new()
    .ollama()
    .base_url("http://localhost:11434")
    .model("deepseek-r1")
    .think(true) // Enable thinking mode
    .temperature(0.7)
    .build()
    .await?;

let messages = vec![ChatMessage::user("How many prime numbers are there below 30?").build()];
let response = client.chat(messages).await?;

// Access the model's thinking process
if let Some(thinking) = &response.thinking {
    println!("Thinking: {thinking}");
}

// Get the final answer
if let Some(text) = response.content_text() {
    println!("Answer: {text}");
}
OpenAI API Feature Examples
Responses API (OpenAI-Specific)
OpenAI's Responses API provides stateful conversations, background processing, and built-in tools:
use siumai::prelude::*;
use siumai::providers::openai::{OpenAiConfig, OpenAiBuiltInTool}; // module path is illustrative

// Create a Responses API client with built-in tools
// (client type, built-in tool variant, model name, and field names are illustrative)
let config = OpenAiConfig::new("your-api-key")
    .with_model("gpt-4o")
    .with_responses_api(true)
    .with_built_in_tool(OpenAiBuiltInTool::WebSearch);

let client = OpenAiClient::new(config);

// Basic chat with built-in tools
let messages = vec![ChatMessage::user("What's new in Rust this week?").build()];
let response = client.chat_with_tools(messages, None).await?;
println!("{}", response.content_text().unwrap_or_default());

// Background processing for complex tasks
let complex_messages = vec![ChatMessage::user("Write a detailed report on async Rust.").build()];
let background_response = client
    .create_response_background(complex_messages)
    .await?;

// Check if background task is ready
let is_ready = client.is_response_ready(&background_response.id).await?;
if is_ready {
    // fetch and use the completed response here
}
Text Embedding
use siumai::prelude::*;

// Unified interface - works with any provider that supports embeddings
// (key and embedding model name are placeholders)
let client = Siumai::builder()
    .openai()
    .api_key("your-api-key")
    .model("text-embedding-3-small")
    .build()
    .await?;

let texts = vec!["Hello, world!".to_string(), "Siumai makes LLMs easy.".to_string()];
let response = client.embed(texts).await?;
println!("Got {} embedding vectors", response.embeddings.len()); // field name is illustrative

// Provider-specific interface for advanced features
let embeddings_client = Provider::openai()
    .api_key("your-api-key")
    .build()
    .await?;

let response = embeddings_client.embed(vec!["Another text".to_string()]).await?;
Text-to-Speech
use siumai::prelude::*;
use siumai::traits::AudioCapability; // module paths are illustrative
use siumai::types::TtsRequest;

// Client construction and request fields below are illustrative
let config = OpenAiConfig::new("your-api-key");
let client = OpenAiClient::new(config);

let request = TtsRequest {
    text: "Hello, this is Siumai speaking.".to_string(),
    voice: Some("alloy".to_string()),
    ..Default::default()
};

let response = client.text_to_speech(request).await?;
std::fs::write("output.mp3", &response.audio_data)?; // field name is illustrative
Image Generation
use siumai::prelude::*;
use siumai::traits::ImageGenerationCapability; // module paths are illustrative
use siumai::types::ImageGenerationRequest;

// Client construction and request fields below are illustrative
let config = OpenAiConfig::new("your-api-key");
let client = OpenAiClient::new(config);

let request = ImageGenerationRequest {
    prompt: "A watercolor painting of a dim sum basket".to_string(),
    count: 1,
    ..Default::default()
};

let response = client.generate_images(request).await?;
for image in response.images {
    // each image carries a URL or raw bytes depending on the provider
}
Testing
Run the test suite: `cargo test`
Run integration tests: `cargo test --test <integration-test-name>`
Run examples: `cargo run --example <example-name>`
Documentation
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Acknowledgments
- Inspired by the need for a unified LLM interface in Rust
- Built with love for the Rust community
- Special thanks to all contributors
Made with ❤️ by the YumchaLabs team