Siumai - Unified LLM Interface Library for Rust
Siumai (烧卖) is a unified LLM interface library for Rust that provides a consistent API across multiple AI providers. It features capability-based trait separation, type-safe parameter handling, and comprehensive streaming support.
Two Ways to Use Siumai
Siumai offers two distinct approaches to fit your needs:
- Provider - For provider-specific clients with access to all features
- Siumai::builder() - For unified interface with provider-agnostic code
Choose Provider when you need provider-specific features, or Siumai::builder() when you want maximum portability.
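As a rough sketch of the trade-off (builder methods as used in the examples below; key and model values are placeholders), the unified builder keeps the provider choice in one place:

use siumai::prelude::*;

// Everything downstream only sees the unified client type.
let client = Siumai::builder()
    .openai()                 // swap for .anthropic() or .ollama() without touching other code
    .api_key("your-api-key")  // placeholder
    .model("gpt-4o")          // placeholder model name
    .build()
    .await?;

let response = client.chat(vec![ChatMessage::user("Hello!").build()]).await?;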
Features
- Multi-Provider Support: OpenAI, Anthropic Claude, Google Gemini, Ollama, and custom providers
- Capability-Based Design: Separate traits for chat, audio, vision, tools, and embeddings
- Builder Pattern: Fluent API with method chaining for easy configuration
- Streaming Support: Full streaming capabilities with event processing
- Type Safety: Leverages Rust's type system for compile-time safety
- Parameter Mapping: Automatic translation between common and provider-specific parameters
- HTTP Customization: Support for custom reqwest clients and HTTP configurations
- Multimodal: Support for text, images, and audio content
- Async/Await: Built on tokio for high-performance async operations
- Retry Mechanisms: Intelligent retry with exponential backoff and jitter
- Error Handling: Advanced error classification with recovery suggestions
- Parameter Validation: Cross-provider parameter validation and optimization
Quick Start
Add Siumai to your Cargo.toml:
[dependencies]
siumai = "0.7.0"
tokio = { version = "1.0", features = ["full"] }
Provider-Specific Clients
Use Provider when you need access to provider-specific features:
use siumai::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Provider-specific OpenAI client (the key and model name are placeholders)
    let client = Provider::openai()
        .api_key("your-api-key")
        .model("gpt-4o")
        .temperature(0.7)
        .build()
        .await?;

    let messages = vec![ChatMessage::user("Hello, Siumai!").build()];
    let response = client.chat(messages).await?;
    println!("{}", response.content_text().unwrap_or_default());
    Ok(())
}
Unified Interface
Use Siumai::builder() when you want provider-agnostic code:
use siumai::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Unified client: the same code works for any supported provider
    // (the key and model name are placeholders)
    let client = Siumai::builder()
        .openai()
        .api_key("your-api-key")
        .model("gpt-4o")
        .temperature(0.7)
        .build()
        .await?;

    let messages = vec![ChatMessage::user("Hello, Siumai!").build()];
    let response = client.chat(messages).await?;
    println!("{}", response.content_text().unwrap_or_default());
    Ok(())
}
Multimodal Messages
use siumai::prelude::*;

// Create a message with text and an image - use the builder for complex messages
// (the image URL, detail level, and request-builder entry point are illustrative)
let message = ChatMessage::user("What is in this picture?")
    .with_image("https://example.com/photo.jpg".to_string(), Some("high".to_string()))
    .build();

let request = ChatRequest::builder()
    .message(message)
    .build();
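To actually send it, the built message can go through any chat-capable client. A minimal sketch, assuming a client built as in the Quick Start and the chat method used elsewhere in this README:

// Send the multimodal message with an existing chat client
let response = client.chat(vec![message]).await?;
println!("{}", response.content_text().unwrap_or_default());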
Streaming
use siumai::prelude::*;
use futures::StreamExt;

// Create a streaming request (client and messages built as in the Quick Start;
// the optional tools argument to chat_stream is shown as None here)
let stream = client.chat_stream(messages, None).await?;

// Process stream events and collect them into a complete response
let response = collect_stream_response(stream).await?;
println!("{}", response.content_text().unwrap_or_default());
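You can also consume the stream event by event instead of collecting it all at once. A minimal sketch, assuming chat_stream takes the messages plus an optional tools argument and yields Result items whose event type implements Debug:

use futures::StreamExt;

let mut stream = client.chat_stream(messages, None).await?;
while let Some(event) = stream.next().await {
    match event {
        Ok(ev) => println!("event: {ev:?}"),      // content deltas, tool calls, etc.
        Err(e) => eprintln!("stream error: {e}"),
    }
}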
Architecture
Siumai uses a capability-based architecture that separates different AI functionalities:
Core Traits
- ChatCapability: Basic chat functionality
- AudioCapability: Text-to-speech and speech-to-text
- VisionCapability: Image analysis and generation
- ToolCapability: Function calling and tool usage
- EmbeddingCapability: Text embeddings
Provider-Specific Traits
- OpenAiCapability: OpenAI-specific features (structured output, batch processing)
- AnthropicCapability: Anthropic-specific features (prompt caching, thinking mode)
- GeminiCapability: Google Gemini-specific features (search integration, code execution)
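Because the capabilities are ordinary traits, application code can stay generic over whichever provider implements them. A minimal sketch, assuming ChatCapability, ChatMessage, and the async chat method are exported as used elsewhere in this README (exact signatures may differ):

use siumai::prelude::*;

// Works with any client that implements ChatCapability, regardless of provider
async fn summarize<C: ChatCapability>(
    client: &C,
    text: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    let prompt = format!("Summarize the following text:\n{text}");
    let messages = vec![ChatMessage::user(prompt.as_str()).build()];
    let response = client.chat(messages).await?;
    Ok(response.content_text().unwrap_or_default().to_string())
}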
Examples
Different Providers
Provider-Specific Clients
// API keys and model names below are placeholders - substitute your own.

// OpenAI - with provider-specific features
let openai_client = Provider::openai()
    .api_key("your-openai-key")
    .model("gpt-4o")
    .temperature(0.7)
    .build()
    .await?;

// Anthropic - with provider-specific features
let anthropic_client = Provider::anthropic()
    .api_key("your-anthropic-key")
    .model("claude-3-5-sonnet-20241022")
    .temperature(0.7)
    .build()
    .await?;

// Ollama - with provider-specific features
let ollama_client = Provider::ollama()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .temperature(0.7)
    .build()
    .await?;
Unified Interface
// OpenAI through unified interface
let openai_unified = builder
.openai
.api_key
.model
.temperature
.build
.await?;
// Anthropic through unified interface
let anthropic_unified = builder
.anthropic
.api_key
.model
.temperature
.build
.await?;
// Ollama through unified interface
let ollama_unified = builder
.ollama
.base_url
.model
.temperature
.build
.await?;
Custom HTTP Client
use std::time::Duration;

// Build a custom reqwest HTTP client (requires the reqwest crate; values are illustrative)
let custom_client = reqwest::Client::builder()
    .timeout(Duration::from_secs(30))
    .user_agent("my-app/1.0")
    .build()?;

// With provider-specific client
// (the custom reqwest client is supplied through the builder's HTTP-client option;
//  see the crate docs for the exact method)
let client = Provider::openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .build()
    .await?;

// With unified interface
let unified_client = Siumai::builder()
    .openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .build()
    .await?;
Provider-Specific Features
// Keys, model names, penalties, and budgets below are placeholder values.

// OpenAI with structured output (provider-specific client)
let openai_client = Provider::openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .response_format(ResponseFormat::JsonObject) // exact response-format type is illustrative
    .frequency_penalty(0.5)
    .build()
    .await?;

// Anthropic with caching (provider-specific client)
let anthropic_client = Provider::anthropic()
    .api_key("your-api-key")
    .model("claude-3-5-sonnet-20241022")
    .cache_control(CacheControl::Ephemeral) // exact cache-control type is illustrative
    .thinking_budget(8192)
    .build()
    .await?;

// Ollama with local model management (provider-specific client)
let ollama_client = Provider::ollama()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .keep_alive("10m") // keep the model loaded in memory
    .num_ctx(8192)     // context window size
    .num_gpu(32)       // GPU layers to use
    .build()
    .await?;

// Unified interface (common parameters only)
let unified_client = Siumai::builder()
    .openai()
    .api_key("your-api-key")
    .model("gpt-4o")
    .temperature(0.7)
    .max_tokens(1000)
    .build()
    .await?;
Advanced Features
Parameter Validation and Optimization
use siumai::params::EnhancedParameterValidator; // module path is illustrative

// Common parameters shared across providers (values are illustrative,
// assuming CommonParams derives Default)
let params = CommonParams {
    model: "gpt-4o".to_string(),
    temperature: Some(0.7),
    max_tokens: Some(1000),
    ..Default::default()
};

// Validate parameters for a specific provider
// (the provider identifier is shown as a plain string; the actual parameter type may differ)
let validation_result = EnhancedParameterValidator::validate_for_provider(&params, "openai")?;

// Optimize parameters for better performance
let mut optimized_params = params.clone();
let optimization_report =
    EnhancedParameterValidator::optimize_for_provider(&mut optimized_params, "openai");
Retry Mechanisms
use std::time::Duration;
use siumai::retry::{RetryExecutor, RetryPolicy}; // type names and module path are illustrative

// Retry with exponential backoff (values are illustrative)
let policy = RetryPolicy::new()
    .with_max_attempts(3)
    .with_initial_delay(Duration::from_millis(500))
    .with_backoff_multiplier(2.0);

// Run an operation under the retry policy
// (the closure-based execute call is illustrative; client and messages come from earlier examples)
let executor = RetryExecutor::new(policy);
let result = executor
    .execute(|| async { client.chat(messages.clone()).await })
    .await?;
Error Handling and Classification
use siumai::prelude::*;

// Inspect the error instead of bubbling it up with `?`
// (the tools argument is shown as None; the error type also carries
//  classification and recovery hints - see the crate docs)
match client.chat_with_tools(messages, None).await {
    Ok(response) => println!("{}", response.content_text().unwrap_or_default()),
    Err(error) => eprintln!("chat failed: {error}"),
}
Configuration
Common Parameters
All providers support these common parameters:
- model: Model name
- temperature: Randomness (0.0-2.0)
- max_tokens: Maximum output tokens
- top_p: Nucleus sampling parameter
- stop_sequences: Stop generation sequences
- seed: Random seed for reproducibility
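These are the same fields carried by the CommonParams struct used in the parameter-validation example above. A minimal sketch with illustrative values; the Option wrappers and the Default derive are assumptions about the exact field types:

use siumai::prelude::*;

// All six common parameters in one place
let params = CommonParams {
    model: "gpt-4o".to_string(),                   // model name (placeholder)
    temperature: Some(0.7),                        // randomness, 0.0-2.0
    max_tokens: Some(1000),                        // maximum output tokens
    top_p: Some(0.9),                              // nucleus sampling
    stop_sequences: Some(vec!["END".to_string()]), // stop generation sequences
    seed: Some(42),                                // random seed for reproducibility
    ..Default::default()
};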
Provider-Specific Parameters
Each provider can have additional parameters:
OpenAI:
- response_format: Output format control
- tool_choice: Tool selection strategy
- frequency_penalty: Frequency penalty
- presence_penalty: Presence penalty
Anthropic:
- cache_control: Prompt caching settings
- thinking_budget: Thinking process budget
- system: System message handling
Ollama:
- keep_alive: Model memory duration
- raw: Bypass templating
- format: Output format (json, etc.)
- numa: NUMA support
- num_ctx: Context window size
- num_gpu: GPU layers to use
Ollama Local AI Examples
Basic Chat with Local Model
use siumai::prelude::*;

// Connect to local Ollama instance (default URL; use any locally pulled model)
let client = Provider::ollama()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .temperature(0.7)
    .build()
    .await?;

let messages = vec![ChatMessage::user("Why is the sky blue?").build()];
let response = client.chat_with_tools(messages, None).await?; // no tools in this example
println!("{}", response.content_text().unwrap_or_default());
Advanced Ollama Configuration
use siumai::prelude::*;
use futures::StreamExt;

// The OllamaConfig/OllamaClient type names, module paths, and option values
// below are illustrative; the builder methods follow the crate's Ollama options.
let config = OllamaConfig::builder()
    .base_url("http://localhost:11434")
    .model("llama3.2")
    .keep_alive("10m")               // Keep model in memory
    .num_ctx(8192)                   // Context window
    .num_gpu(32)                     // Use GPU acceleration
    .numa(true)                      // Enable NUMA
    .think(true)                     // Enable thinking mode for thinking models
    .option("repeat_penalty", "1.1") // Extra raw Ollama option
    .build()?;

let client = OllamaClient::new_with_config(config);

// Generate text with streaming
let mut stream = client.generate_stream("Tell me a short story".to_string()).await?;
while let Some(chunk) = stream.next().await {
    println!("{chunk:?}"); // handle each streamed chunk/event
}
Thinking Models with Ollama
use siumai::prelude::*;

// Use thinking models like DeepSeek-R1
// (the builder entry point and model name are illustrative)
let client = LlmBuilder::new()
    .ollama()
    .base_url("http://localhost:11434")
    .model("deepseek-r1")
    .think(true) // Enable thinking mode
    .temperature(0.7)
    .build()
    .await?;

let messages = vec![ChatMessage::user("How many prime numbers are there below 30?").build()];
let response = client.chat(messages).await?;

// Access the model's thinking process
if let Some(thinking) = &response.thinking {
    println!("Thinking: {thinking}");
}

// Get the final answer
if let Some(text) = response.content_text() {
    println!("Answer: {text}");
}
OpenAI API Feature Examples
Responses API (OpenAI-Specific)
OpenAI's Responses API provides stateful conversations, background processing, and built-in tools:
use siumai::prelude::*;
use siumai::providers::openai::{OpenAiConfig, OpenAiBuiltInTool}; // module path is illustrative

// Create a Responses API client with built-in tools
// (client type, built-in tool variant, model name, and field names are illustrative)
let config = OpenAiConfig::new("your-api-key")
    .with_model("gpt-4o")
    .with_responses_api(true)
    .with_built_in_tool(OpenAiBuiltInTool::WebSearch);

let client = OpenAiClient::new(config);

// Basic chat with built-in tools
let messages = vec![ChatMessage::user("What's new in Rust this week?").build()];
let response = client.chat_with_tools(messages, None).await?;
println!("{}", response.content_text().unwrap_or_default());

// Background processing for complex tasks
let complex_messages = vec![ChatMessage::user("Write a detailed report on async Rust.").build()];
let background_response = client
    .create_response_background(complex_messages)
    .await?;

// Check if background task is ready
let is_ready = client.is_response_ready(&background_response.id).await?;
if is_ready {
    // fetch and use the completed response here
}
Text Embedding
use siumai::prelude::*;

// Unified interface - works with any provider that supports embeddings
// (key and embedding model name are placeholders)
let client = Siumai::builder()
    .openai()
    .api_key("your-api-key")
    .model("text-embedding-3-small")
    .build()
    .await?;

let texts = vec!["Hello, world!".to_string(), "Siumai makes LLMs easy.".to_string()];
let response = client.embed(texts).await?;
println!("Got {} embedding vectors", response.embeddings.len()); // field name is illustrative

// Provider-specific interface for advanced features
let embeddings_client = Provider::openai()
    .api_key("your-api-key")
    .build()
    .await?;

let response = embeddings_client.embed(vec!["Another text".to_string()]).await?;
Text-to-Speech
use siumai::prelude::*;
use siumai::traits::AudioCapability; // module paths are illustrative
use siumai::types::TtsRequest;

// Client construction and request fields below are illustrative
let config = OpenAiConfig::new("your-api-key");
let client = OpenAiClient::new(config);

let request = TtsRequest {
    text: "Hello, this is Siumai speaking.".to_string(),
    voice: Some("alloy".to_string()),
    ..Default::default()
};

let response = client.text_to_speech(request).await?;
std::fs::write("output.mp3", &response.audio_data)?; // field name is illustrative
Image Generation
use siumai::prelude::*;
use siumai::traits::ImageGenerationCapability; // module paths are illustrative
use siumai::types::ImageGenerationRequest;

// Client construction and request fields below are illustrative
let config = OpenAiConfig::new("your-api-key");
let client = OpenAiClient::new(config);

let request = ImageGenerationRequest {
    prompt: "A watercolor painting of a dim sum basket".to_string(),
    count: 1,
    ..Default::default()
};

let response = client.generate_images(request).await?;
for image in response.images {
    // each image carries a URL or raw bytes depending on the provider
}
Testing
Run the test suite: `cargo test`
Run integration tests: `cargo test --test <integration-test-name>`
Run examples: `cargo run --example <example-name>`
Documentation
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Acknowledgments
- Inspired by the need for a unified LLM interface in Rust
- Built with love for the Rust community
- Special thanks to all contributors
Made with ❤️ by the YumchaLabs team