llm-connector 0.4.3

Next-generation Rust library for LLM protocol abstraction. V2 architecture with 7000x+ performance boost. Supports 5 protocols (OpenAI, Anthropic, Aliyun, Zhipu, Ollama) with clean Protocol/Provider separation, type-safe interface, and universal streaming.

Supports 5 protocols: OpenAI, Anthropic, Zhipu, Aliyun, Ollama. Clean architecture with clear Protocol/Provider separation for maximum performance and extensibility.

🚨 Having Authentication Issues?

Test your API keys right now:

cargo run --example test_keys_yaml

This will tell you exactly what's wrong with your API keys! See Debugging & Troubleshooting for more details.

✨ Key Features

  • 5 Protocol Support: OpenAI, Anthropic, Zhipu, Aliyun, Ollama
  • V2 Architecture: Clean Protocol/Provider separation for maximum extensibility
  • Extreme Performance: 7,000x+ faster client creation (7µs vs 53ms)
  • Memory Efficient: Only 16 bytes per client instance
  • Type-Safe: Full Rust type safety with Result-based error handling
  • No Hardcoded Models: Use any model name without restrictions
  • Online Model Discovery: Fetch available models dynamically from API
  • Universal Streaming: Real-time streaming with format abstraction (JSON/SSE/NDJSON)
  • Ollama Model Management: Full CRUD operations for local models
  • Unified Interface: Same API for all protocols
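One plausible reading of the 16-byte figure, offered as an assumption rather than a description of the crate's internals: a client that holds nothing but a shared trait object is a single fat pointer, which is two machine words (16 bytes on a 64-bit target). The `Provider` trait and `Client` struct below are hypothetical stand-ins, not types from llm-connector.

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical stand-in for a provider abstraction; not the crate's real trait.
trait Provider {
    fn name(&self) -> &'static str;
}

struct OpenAiLike;

impl Provider for OpenAiLike {
    fn name(&self) -> &'static str {
        "openai"
    }
}

// A client that stores nothing but a shared trait object (data pointer + vtable pointer).
struct Client {
    provider: Arc<dyn Provider>,
}

// Size of the hypothetical client in bytes.
fn client_size() -> usize {
    size_of::<Client>()
}

fn main() {
    let client = Client { provider: Arc::new(OpenAiLike) };
    println!(
        "provider = {}, client size = {} bytes",
        client.provider.name(),
        client_size()
    );
}
```

Cloning such a client only bumps a reference count, which is consistent with cheap client creation, though the crate may achieve its numbers differently.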

Quick Start

Installation

Add to your Cargo.toml:

[dependencies]
llm-connector = "0.4.0"
tokio = { version = "1", features = ["full"] }

Optional features:

# Streaming support
llm-connector = { version = "0.4.0", features = ["streaming"] }

# V1 legacy compatibility
llm-connector = { version = "0.4.0", features = ["v1-legacy"] }

# Both streaming and V1 compatibility
llm-connector = { version = "0.4.0", features = ["streaming", "v1-legacy"] }

Basic Usage

use llm_connector::{LlmClient, types::{ChatRequest, Message, Role}};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // OpenAI
    let client = LlmClient::openai("sk-...")?;

    // Anthropic Claude
    let client = LlmClient::anthropic("sk-ant-...")?;

    // Aliyun DashScope
    let client = LlmClient::aliyun("sk-...")?;

    // Zhipu GLM
    let client = LlmClient::zhipu("your-api-key")?;

    // Ollama (local, no API key needed)
    let client = LlmClient::ollama()?;

    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message {
            role: Role::User,
            content: "Hello!".to_string(),
            ..Default::default()
        }],
        ..Default::default()
    };

    let response = client.chat(&request).await?;
    println!("Response: {}", response.content);
    Ok(())
}

Supported Protocols

1. OpenAI Protocol

Standard OpenAI API format with multiple deployment options.

// OpenAI (default)
let client = LlmClient::openai("sk-...")?;

// Custom base URL
let client = LlmClient::openai_with_base_url("sk-...", "https://api.deepseek.com")?;

// Azure OpenAI
let client = LlmClient::azure_openai(
    "your-key",
    "https://your-resource.openai.azure.com",
    "2024-02-15-preview"
)?;

// OpenAI-compatible services
let client = LlmClient::openai_compatible("sk-...", "https://api.deepseek.com", "deepseek")?;

Features:

  • ✅ No hardcoded models - use any model name
  • ✅ Online model discovery via models()
  • ✅ Azure OpenAI support
  • ✅ Works with OpenAI-compatible providers (DeepSeek, Moonshot, etc.)

Example Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini

2. Anthropic Protocol

Claude Messages API with multiple deployment options.

// Standard Anthropic API
let client = LlmClient::anthropic("sk-ant-...")?;

// Google Vertex AI
let client = LlmClient::anthropic_vertex("project-id", "us-central1", "access-token")?;

// Amazon Bedrock
let client = LlmClient::anthropic_bedrock("us-east-1", "access-key", "secret-key")?;

Models: claude-3-5-sonnet-20241022, claude-3-opus, claude-3-haiku

3. Zhipu Protocol (ChatGLM)

Supports both native and OpenAI-compatible formats.

// Native format
let client = LlmClient::zhipu("your-api-key")?;

// OpenAI-compatible format (recommended)
let client = LlmClient::zhipu_openai_compatible("your-api-key")?;

Models: glm-4, glm-4-flash, glm-4-air, glm-4-plus, glm-4x

4. Aliyun Protocol (DashScope)

Custom protocol for Qwen models with regional support.

// Default (China)
let client = LlmClient::aliyun("sk-...")?;

// International
let client = LlmClient::aliyun_international("sk-...")?;

// Private cloud
let client = LlmClient::aliyun_private("sk-...", "https://your-endpoint.com")?;

Models: qwen-turbo, qwen-plus, qwen-max

5. Ollama Protocol (Local)

Local LLM server with comprehensive model management.

// Default: localhost:11434
let client = LlmClient::ollama()?;

// Custom URL
let client = LlmClient::ollama_with_url("http://192.168.1.100:11434")?;

// With custom configuration
let client = LlmClient::ollama_with_config(
    "http://localhost:11434",
    Some(120), // timeout in seconds
    None       // proxy
)?;

Models: llama3.2, llama3.1, mistral, mixtral, qwen2.5, etc.

Features:

  • ✅ Model listing and management
  • ✅ Pull, delete, and inspect models
  • ✅ Local server support with custom URLs
  • ✅ Enhanced error handling for Ollama-specific operations
  • ✅ Direct access to Ollama-specific features

Ollama Model Management

Access Ollama-specific features through the special interface:

let client = LlmClient::ollama()?;

// Access Ollama-specific features
if let Some(ollama) = client.as_ollama() {
    // List all installed models
    let models = ollama.models().await?;
    for model in models {
        println!("Available model: {}", model);
    }

    // Pull a new model
    ollama.pull_model("llama3.2").await?;

    // Get detailed model information
    let details = ollama.show_model("llama3.2").await?;
    println!("Model format: {}", details.details.format);

    // Check if model exists
    let exists = ollama.model_exists("llama3.2").await?;
    println!("Model exists: {}", exists);

    // Delete a model
    ollama.delete_model("llama3.2").await?;
}

Supported Ollama Operations

  • List Models: models() - Get all locally installed models
  • Pull Models: pull_model(name) - Download models from registry
  • Delete Models: delete_model(name) - Remove local models
  • Show Details: show_model(name) - Get comprehensive model information
  • Check Existence: model_exists(name) - Verify if model is installed

Universal Streaming Format Support

The library provides comprehensive streaming support with universal format abstraction for maximum flexibility:

Standard OpenAI Format (Default)

use futures_util::StreamExt;
use llm_connector::{LlmClient, types::{ChatRequest, Message, Role}};

let client = LlmClient::anthropic("sk-ant-...")?;
let request = ChatRequest {
    model: "claude-3-5-sonnet-20241022".to_string(),
    messages: vec![Message {
        role: Role::User,
        content: "Hello!".to_string(),
        ..Default::default()
    }],
    max_tokens: Some(200),
    ..Default::default()
};

let mut stream = client.chat_stream(&request).await?;

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(content) = chunk.get_content() {
        print!("{}", content);
    }
}

Pure Ollama Format for Tool Integration

For perfect compatibility with tools like Zed.dev, use the pure Ollama streaming format:

use futures_util::StreamExt;

// Use pure Ollama format (perfect for Zed.dev)
let mut stream = client.chat_stream_ollama(&request).await?;

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    // chunk is now a pure OllamaStreamChunk
    if !chunk.message.content.is_empty() {
        print!("{}", chunk.message.content);
    }

    // Check for final chunk
    if chunk.done {
        println!("\nStreaming complete!");
        break;
    }
}

Legacy Ollama Format (Embedded)

For backward compatibility, the embedded format is still available:

use futures_util::StreamExt;

// Use embedded Ollama format (legacy)
let mut stream = client.chat_stream_ollama_embedded(&request).await?;

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    // chunk.content contains Ollama-formatted JSON string
    if let Ok(ollama_chunk) = serde_json::from_str::<serde_json::Value>(&chunk.content) {
        if let Some(content) = ollama_chunk
            .get("message")
            .and_then(|m| m.get("content"))
            .and_then(|c| c.as_str())
        {
            print!("{}", content);
        }
    }
}

Universal Format Abstraction

For maximum flexibility, use the universal streaming interface with format abstraction:

use futures_util::StreamExt; // needed for stream.next()
use llm_connector::types::{StreamingConfig, StreamingFormat, StreamFormat};

// Configure both content format and output format
let config = StreamingConfig {
    format: StreamingFormat::Ollama,        // Content format (OpenAI/Ollama)
    stream_format: StreamFormat::SSE,       // Output format (JSON/SSE/NDJSON)
    include_usage: true,
    include_reasoning: false,
};

let mut stream = client.chat_stream_universal(&request, &config).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    // chunk.to_format() returns formatted string based on stream_format
    println!("{}", chunk.to_format());

    // Or extract content directly
    if let Some(content) = chunk.extract_content() {
        print!("{}", content);
    }
}

Convenient Format Methods

For common use cases, use the convenient format-specific methods:

// Server-Sent Events format (perfect for web applications)
let mut sse_stream = client.chat_stream_sse(&request).await?;
while let Some(chunk) = sse_stream.next().await {
    let chunk = chunk?;
    println!("{}", chunk.to_format()); // "data: {...}\n\n"
}

// Newline-Delimited JSON format (perfect for data pipelines)
let mut ndjson_stream = client.chat_stream_ndjson(&request).await?;
while let Some(chunk) = ndjson_stream.next().await {
    let chunk = chunk?;
    println!("{}", chunk.to_format()); // "{...}\n"
}

Format Comparison

| Format | Output Example | Use Case |
|--------|----------------|----------|
| JSON | {"content":"hello"} | API responses, standard JSON |
| SSE | data: {"content":"hello"}\n\n | Web real-time streaming |
| NDJSON | {"content":"hello"}\n | Log processing, data pipelines |
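The difference between the three output formats is only the framing around each serialized chunk. The sketch below illustrates that framing with a hypothetical `OutputFormat` enum and `frame` function; they mirror the table above but are not the crate's `StreamFormat` or `to_format()`.

```rust
// Hypothetical mirror of the three output formats in the table above.
#[derive(Clone, Copy)]
enum OutputFormat {
    Json,
    Sse,
    Ndjson,
}

// Wrap an already-serialized JSON payload in the framing each format expects.
fn frame(payload: &str, format: OutputFormat) -> String {
    match format {
        // Plain JSON: the payload is emitted unchanged.
        OutputFormat::Json => payload.to_string(),
        // Server-Sent Events: "data: " prefix, terminated by a blank line.
        OutputFormat::Sse => format!("data: {}\n\n", payload),
        // Newline-delimited JSON: one object per line.
        OutputFormat::Ndjson => format!("{}\n", payload),
    }
}

fn main() {
    let payload = r#"{"content":"hello"}"#;
    for format in [OutputFormat::Json, OutputFormat::Sse, OutputFormat::Ndjson] {
        // Debug formatting makes the trailing newlines visible.
        println!("{:?}", frame(payload, format));
    }
}
```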

Enhanced Anthropic Streaming Features

  • State Management: Proper handling of message_start, content_block_delta, message_delta, message_stop events
  • Event Processing: Correct parsing of complex Anthropic streaming responses
  • Usage Tracking: Real-time token usage statistics during streaming
  • Error Resilience: Robust error handling for streaming interruptions
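To show what state management over these four events can look like, here is a minimal sketch. The `Event` enum and `StreamState` struct are hypothetical simplifications (the real stream carries JSON payloads with more event types), and this is not the crate's implementation.

```rust
// Hypothetical, simplified event type named after the four events listed above.
enum Event {
    MessageStart,
    ContentBlockDelta(String),
    MessageDelta { output_tokens: u32 },
    MessageStop,
}

#[derive(Default)]
struct StreamState {
    started: bool,
    finished: bool,
    text: String,
    output_tokens: u32,
}

impl StreamState {
    // Apply one event; reject events that arrive outside an open message.
    fn apply(&mut self, event: Event) -> Result<(), String> {
        match event {
            Event::MessageStart => {
                if self.started {
                    return Err("duplicate message_start".into());
                }
                self.started = true;
            }
            _ if !self.started || self.finished => {
                return Err("event outside an open message".into());
            }
            // Text arrives in fragments and is appended in order.
            Event::ContentBlockDelta(text) => self.text.push_str(&text),
            // Usage updates arrive while the stream is still open.
            Event::MessageDelta { output_tokens } => self.output_tokens = output_tokens,
            Event::MessageStop => self.finished = true,
        }
        Ok(())
    }
}

// Fold a whole sequence of events into the final state.
fn collect(events: Vec<Event>) -> Result<StreamState, String> {
    let mut state = StreamState::default();
    for event in events {
        state.apply(event)?;
    }
    Ok(state)
}

fn main() {
    let events = vec![
        Event::MessageStart,
        Event::ContentBlockDelta("Hel".into()),
        Event::ContentBlockDelta("lo".into()),
        Event::MessageDelta { output_tokens: 2 },
        Event::MessageStop,
    ];
    match collect(events) {
        Ok(s) => println!("text={:?} tokens={} done={}", s.text, s.output_tokens, s.finished),
        Err(e) => eprintln!("stream error: {}", e),
    }
}
```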

Model Discovery

Fetch the latest available models from the API:

let client = LlmClient::openai("sk-...")?;

// Fetch models online from the API
let models = client.models().await?;
println!("Available models: {:?}", models);

Supported by:

  • ✅ OpenAI Protocol (including OpenAI-compatible providers like DeepSeek, Zhipu, Moonshot)
  • ✅ Anthropic Protocol (limited support - returns fallback endpoint)
  • ✅ Ollama Protocol (full support via /api/tags)
  • ❌ Aliyun Protocol (not supported)

Example Results:

  • DeepSeek: ["deepseek-chat", "deepseek-reasoner"]
  • Zhipu: ["glm-4.5", "glm-4.5-air", "glm-4.6"]
  • Moonshot: ["moonshot-v1-32k", "kimi-latest", ...]

Recommendation:

  • Cache models() results to avoid repeated API calls
  • For protocols that don't support model listing, you can use any model name directly in your requests
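A minimal sketch of the caching recommendation, assuming a simple time-to-live policy. `ModelCache` is a hypothetical helper, not part of the crate, and the closure stands in for the real (async) `client.models()` call so the example runs without network access.

```rust
use std::time::{Duration, Instant};

// Hypothetical helper: remembers one model list for a fixed time-to-live.
struct ModelCache {
    ttl: Duration,
    entry: Option<(Instant, Vec<String>)>,
}

impl ModelCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entry: None }
    }

    // Return the cached list while it is fresh; otherwise call `fetch` and store the result.
    fn get_or_fetch<F, E>(&mut self, fetch: F) -> Result<Vec<String>, E>
    where
        F: FnOnce() -> Result<Vec<String>, E>,
    {
        if let Some((stored_at, models)) = &self.entry {
            if stored_at.elapsed() < self.ttl {
                return Ok(models.clone());
            }
        }
        let models = fetch()?;
        self.entry = Some((Instant::now(), models.clone()));
        Ok(models)
    }
}

fn main() {
    let mut cache = ModelCache::new(Duration::from_secs(300));
    let mut calls = 0;
    for _ in 0..3 {
        // The closure is a placeholder for a real API request.
        let models: Result<Vec<String>, String> = cache.get_or_fetch(|| {
            calls += 1;
            Ok(vec!["deepseek-chat".to_string(), "deepseek-reasoner".to_string()])
        });
        println!("{:?}", models);
    }
    println!("fetch calls: {}", calls);
}
```

A failed fetch is not cached, so the next call retries.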

Request Examples

OpenAI / OpenAI-compatible

let request = ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![
        Message::system("You are a helpful assistant."),
        Message::user("Hello!"),
    ],
    temperature: Some(0.7),
    max_tokens: Some(100),
    ..Default::default()
};

Anthropic (requires max_tokens)

let request = ChatRequest {
    model: "claude-3-5-sonnet-20241022".to_string(),
    messages: vec![Message::user("Hello!")],
    max_tokens: Some(200), // Required for Anthropic
    ..Default::default()
};

Aliyun (DashScope)

let request = ChatRequest {
    model: "qwen-max".to_string(),
    messages: vec![Message::user("你好！")],
    ..Default::default()
};

Ollama (Local)

let request = ChatRequest {
    model: "llama3.2".to_string(),
    messages: vec![Message::user("Hello!")],
    ..Default::default()
};

Ollama Streaming (GLM-4.6 via Remote Gateway)

If you expose an Ollama-compatible API while the backend actually calls Zhipu glm-4.6 (remote gateway), you do NOT need any local model installation. Just point the client to your gateway and use the model id defined by your service:

use futures_util::StreamExt;
use llm_connector::{LlmClient, types::{ChatRequest, Message}};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point to your remote Ollama-compatible gateway (replace with your actual URL)
    let client = LlmClient::ollama_with_url("https://your-ollama-gateway.example.com")?;

    let request = ChatRequest {
        model: "glm-4.6".to_string(),
        messages: vec![Message::user("Briefly explain the benefits of streaming.")],
        max_tokens: Some(128),
        ..Default::default()
    };

    let mut stream = client.chat_stream(&request).await?;
    while let Some(chunk) = stream.next().await {
        let chunk = chunk?;
        if let Some(content) = chunk.get_content() {
            print!("{}", content);
        }
    }

    Ok(())
}

Run example (requires streaming feature):

cargo run --example ollama_streaming --features streaming

Note: This setup targets a remote Ollama-compatible gateway. The model id is defined by your backend (e.g. glm-4.6); no local installation is required. If your gateway uses a different identifier, replace it accordingly.

Streaming (Optional Feature)

Enable streaming in your Cargo.toml:

llm-connector = { version = "0.4.0", features = ["streaming"] }

use futures_util::StreamExt;

let mut stream = client.chat_stream(&request).await?;

while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(content) = chunk.get_content() {
        print!("{}", content);
    }
}

Error Handling

use llm_connector::error::LlmConnectorError;

match client.chat(&request).await {
    Ok(response) => {
        println!("Response: {}", response.choices[0].message.content);
    }
    Err(e) => {
        match e {
            LlmConnectorError::AuthenticationError(msg) => {
                eprintln!("Auth error: {}", msg);
            }
            LlmConnectorError::RateLimitError(msg) => {
                eprintln!("Rate limit: {}", msg);
            }
            LlmConnectorError::UnsupportedOperation(msg) => {
                eprintln!("Not supported: {}", msg);
            }
            _ => eprintln!("Error: {}", e),
        }
    }
}

Configuration

Simple API Key (Recommended)

let client = LlmClient::openai("your-api-key")?;

Environment Variables

export OPENAI_API_KEY="sk-your-key"
export ANTHROPIC_API_KEY="sk-ant-your-key"
export ALIYUN_API_KEY="sk-your-key"

use std::env;

// Read the key exported above and pass it to the constructor
let api_key = env::var("OPENAI_API_KEY")?;
let client = LlmClient::openai(&api_key)?;

Protocol Information

let client = LlmClient::openai("sk-...")?;

// Get provider name
println!("Provider: {}", client.provider_name());

// Fetch models online (requires API call)
let models = client.models().await?;
println!("Available models: {:?}", models);

Reasoning Synonyms

Many providers return hidden or provider-specific keys for model reasoning content (chain-of-thought). To simplify usage across providers, we normalize four common keys:

  • reasoning_content, reasoning, thought, thinking

Post-processing automatically scans raw JSON and fills these optional fields on both regular messages (Message) and streaming deltas (Delta). You can read the first available value via a convenience method:

// Non-streaming
let msg = &response.choices[0].message;
if let Some(reason) = msg.reasoning_any() {
    println!("Reasoning: {}", reason);
}

// Streaming
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(reason) = chunk.choices[0].delta.reasoning_any() {
        println!("Reasoning (stream): {}", reason);
    }
}

Notes:

  • Fields remain None if the provider does not return any reasoning keys.
  • The normalization is provider-agnostic and applied uniformly to OpenAI, Anthropic, Aliyun (Qwen), Zhipu (GLM), and DeepSeek flows (including streaming).
  • StreamingResponse also backfills its top-level reasoning_content from the first delta that contains reasoning.
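The "first available value" lookup can be pictured as a chain of optional fields. In this sketch the `Reasoning` struct and `first_available` method are hypothetical stand-ins for the normalized fields and the convenience method, and the lookup order (the order the keys are listed above) is an assumption.

```rust
// Hypothetical stand-in for the four normalized optional fields on a message or delta.
#[derive(Default)]
struct Reasoning {
    reasoning_content: Option<String>,
    reasoning: Option<String>,
    thought: Option<String>,
    thinking: Option<String>,
}

impl Reasoning {
    // Return the first populated field, checked in the order the keys are listed.
    fn first_available(&self) -> Option<&str> {
        self.reasoning_content
            .as_deref()
            .or(self.reasoning.as_deref())
            .or(self.thought.as_deref())
            .or(self.thinking.as_deref())
    }
}

fn main() {
    // A provider that only returns a `thinking` key.
    let msg = Reasoning {
        thinking: Some("step 1: parse the question".to_string()),
        ..Default::default()
    };
    match msg.first_available() {
        Some(reason) => println!("Reasoning: {}", reason),
        None => println!("No reasoning returned"),
    }
}
```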

Debugging & Troubleshooting

Test Your API Keys

Quickly test if your API keys are valid:

# Test all keys from keys.yaml
cargo run --example test_keys_yaml

# Debug DeepSeek specifically
cargo run --example debug_deepseek -- sk-your-key

The test tool will:

  • ✅ Validate API key format
  • ✅ Test authentication with the provider
  • ✅ Show exactly what's wrong if a key fails
  • ✅ Provide specific fix instructions

Troubleshooting Guides

  • TROUBLESHOOTING.md - Comprehensive troubleshooting guide
  • HOW_TO_TEST_YOUR_KEYS.md - How to test your API keys
  • TEST_YOUR_DEEPSEEK_KEY.md - Quick start for DeepSeek users

Common Issues

Authentication Error:

❌ Authentication failed: Incorrect API key provided

Solutions:

  1. Verify your API key is correct (no extra spaces)
  2. Check if your account has credits
  3. Generate a new API key from your provider's dashboard
  4. Run cargo run --example test_keys_yaml to diagnose
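Step 1 can be partly automated before any request is sent. The sketch below is a hypothetical local check, not the logic of the `test_keys_yaml` example: it flags empty keys, stray whitespace, and a prefix mismatch. The prefixes used (`sk-`, `sk-ant-`) are taken from the placeholder keys in this README and are assumptions about your provider's key format.

```rust
#[derive(Debug, PartialEq)]
enum KeyProblem {
    Empty,
    SurroundingWhitespace,
    UnexpectedPrefix,
}

// Check a key against an expected prefix; pass "" to skip the prefix check.
fn check_key(key: &str, expected_prefix: &str) -> Result<(), KeyProblem> {
    if key.trim().is_empty() {
        return Err(KeyProblem::Empty);
    }
    // Copy-paste often leaves a trailing space or newline.
    if key != key.trim() {
        return Err(KeyProblem::SurroundingWhitespace);
    }
    if !key.starts_with(expected_prefix) {
        return Err(KeyProblem::UnexpectedPrefix);
    }
    Ok(())
}

fn main() {
    let candidates = [
        ("sk-ant-example", "sk-ant-"),
        ("sk-example \n", "sk-"),
        ("", "sk-"),
    ];
    for (key, prefix) in candidates {
        println!("{:?} -> {:?}", key, check_key(key, prefix));
    }
}
```

A key that passes this check can still be rejected by the provider (revoked key, no credits), so it complements the diagnostic example rather than replacing it.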

Recent Changes

v0.3.13

🚀 Universal Streaming Format Abstraction

  • StreamFormat Enum: Support for JSON, SSE, and NDJSON output formats
  • StreamChunk Universal Container: Unified abstraction for all streaming responses
  • Format Conversion Methods: to_json(), to_sse(), to_ndjson(), to_format()
  • Content Extraction: Universal extract_content() method for both OpenAI and Ollama formats

🎯 New Streaming Methods:

  • chat_stream_universal() - Most flexible interface with full format control
  • chat_stream_sse() - Convenient Server-Sent Events format for web apps
  • chat_stream_ndjson() - Convenient Newline-Delimited JSON for data pipelines
  • Enhanced StreamingConfig with separate content and output format controls

🔧 Architecture Improvements:

  • Separation of Concerns: Content format (OpenAI/Ollama) vs Output format (JSON/SSE/NDJSON)
  • Format Abstraction: No more hardcoded JSON strings in streaming responses
  • Extensible Design: Easy to add new output formats in the future
  • Type Safety: Strong typing for all format options

💡 Use Cases:

  • Web Applications: Use SSE format for real-time streaming
  • API Services: Use JSON format for standard responses
  • Data Processing: Use NDJSON format for logs and pipelines
  • Tool Integration: Combine any content format with any output format

📚 Enhanced Documentation:

  • Comprehensive format comparison table
  • Detailed usage examples for each format
  • Clear migration guide from previous versions

v0.3.12

🔧 Critical Fix: Pure Ollama Format Streaming

  • Fixed Double Format Issue: chat_stream_ollama() now returns pure Ollama format instead of nested format
  • Direct Compatibility: Perfect integration with Zed.dev and other Ollama-compatible tools
  • Simplified Usage: No more JSON parsing required - direct OllamaStreamChunk access
  • Backward Compatibility: Added chat_stream_ollama_embedded() for legacy nested format

🎯 Format Changes:

  • Before: Ollama JSON embedded in OpenAI format content field (required parsing)
  • After: Direct OllamaStreamChunk objects with native field access
  • New Type: OllamaChatStream for pure Ollama format streams
  • Enhanced API: Cleaner, more intuitive streaming interface

📚 Updated Documentation:

  • Clear distinction between pure and embedded Ollama formats
  • Updated examples with direct field access patterns
  • Enhanced streaming format comparison section

🧪 New Examples:

  • test_pure_ollama_format.rs - Validation of pure format output
  • Updated ollama_streaming_simple.rs - Demonstrates direct field access

v0.3.11

🚀 Major New Features:

  • Multiple Streaming Formats: Support for both OpenAI and Ollama streaming formats
    • chat_stream_ollama() - Ollama-compatible streaming for Zed.dev integration
    • chat_stream_with_format() - Custom streaming configuration
    • StreamingFormat::OpenAI and StreamingFormat::Ollama options
  • Enhanced Tool Integration: Perfect compatibility with Zed.dev and other Ollama-compatible tools
  • Tencent Hunyuan Native API: Initial implementation of TC3-HMAC-SHA256 signature authentication
    • hunyuan_native() - Native Tencent Cloud API support
    • Full region support (ap-beijing, ap-shanghai, ap-guangzhou)
    • Better error handling and debugging capabilities

🔧 Improvements:

  • Streaming Format Conversion: Automatic conversion between OpenAI and Ollama formats
  • Done Marker Handling: Proper done: true final chunk for Ollama format
  • Usage Statistics: Complete token usage and timing information in Ollama format
  • Backward Compatibility: All existing streaming code continues to work unchanged

📚 Documentation:

  • Complete streaming format comparison and usage examples
  • New examples: ollama_streaming_simple.rs, streaming_ollama_format.rs
  • Updated README with detailed format explanations
  • Enhanced troubleshooting guides for streaming

🎯 Breaking Changes:

  • None - all changes are backward compatible

v0.3.8

🚀 Major Stability and Debugging Improvements:

  • Enhanced Timeout Configuration: All providers now support custom timeout settings
    • LlmClient::openai_with_timeout() - OpenAI with custom timeout
    • LlmClient::anthropic_with_timeout() - Anthropic with custom timeout
    • LlmClient::zhipu_with_timeout() - Zhipu with custom timeout
    • Default timeout increased to 30 seconds for better stability
  • Advanced Debugging Support: Comprehensive request/response debugging
    • LLM_DEBUG_REQUEST_RAW=1 - Show detailed request information
    • LLM_DEBUG_RESPONSE_RAW=1 - Show response status and headers
    • LLM_DEBUG_STREAM_RAW=1 - Show streaming response details
    • Enhanced error messages with specific troubleshooting guidance
  • Zhipu Stability Improvements: Dedicated tools for diagnosing Zhipu API issues
    • New zhipu_stability_test.rs example for comprehensive testing
    • Improved error handling and timeout management
    • Better connection stability monitoring

🔧 New Examples:

  • enhanced_error_handling.rs - Comprehensive error handling and debugging
  • unified_config.rs - Unified configuration interface for all providers
  • zhipu_stability_test.rs - Dedicated Zhipu stability testing tool

📚 Documentation:

  • Updated troubleshooting guides with timeout configuration
  • Enhanced error handling examples
  • Improved debugging instructions

v0.3.1

🚀 Major New Features:

  • Complete Ollama Model Management: Full CRUD operations for local models
    • list_models() - List all installed models
    • pull_model() - Download models from registry
    • push_model() - Upload models to registry
    • delete_model() - Remove local models
    • show_model() - Get detailed model information
  • Enhanced Anthropic Streaming: Proper event state management
    • Correct handling of message_start, content_block_delta, message_delta, message_stop events
    • Real-time token usage tracking during streaming
    • Improved error resilience and state management

🔧 Improvements:

  • Expanded Model Discovery Support:
    • Added Ollama model listing via /api/tags endpoint
    • Limited Anthropic model discovery support
  • Enhanced Client Interface: New methods for Ollama model management
  • Updated Examples: Added comprehensive model management and streaming examples

📚 Documentation:

  • Complete rewrite of Ollama section with model management examples
  • Enhanced streaming documentation with code examples
  • Updated feature descriptions and supported operations

v0.2.3

🔧 Breaking Changes:

  • Removed supported_models() method - Use fetch_models() instead
  • Removed supports_model() method - No longer needed

✨ New Features:

  • Improved error messages - Removed confusing OpenAI URLs for other providers
  • New debugging tools:
    • examples/test_keys_yaml.rs - Test all API keys
    • examples/debug_deepseek.rs - Debug DeepSeek authentication
  • Comprehensive documentation:
    • TROUBLESHOOTING.md - Troubleshooting guide
    • HOW_TO_TEST_YOUR_KEYS.md - Testing instructions
    • TEST_YOUR_DEEPSEEK_KEY.md - Quick start guide

Migration from v0.2.2:

// ❌ Old (no longer works)
let models = client.supported_models();

// ✅ New
let models = client.fetch_models().await?;

v0.2.2

โœจ New Features:

  • Added fetch_models() for online model discovery
  • OpenAI protocol supports dynamic model fetching from /v1/models endpoint
  • Works with OpenAI-compatible providers (DeepSeek, Zhipu, Moonshot, etc.)

Design Philosophy

Minimal by Design:

  • Only 5 protocols to cover all major LLM providers
  • No hardcoded model restrictions - use any model name
  • No complex configuration files or registries
  • Direct API usage with clear abstractions

Protocol-first:

  • Group providers by API protocol, not by company
  • OpenAI-compatible providers share one implementation
  • Extensible through protocol adapters
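The protocol-first idea can be sketched as a trait per wire protocol plus a provider that only supplies deployment details. The `Protocol` trait, `Provider` struct, and method names below are hypothetical illustrations of the design, not the crate's actual types; the endpoint paths are the publicly documented chat routes for the OpenAI and Ollama APIs.

```rust
// Hypothetical trait: one implementation per wire protocol, not per company.
trait Protocol {
    fn name(&self) -> &'static str;
    fn chat_path(&self) -> &'static str;
}

struct OpenAiProtocol;

impl Protocol for OpenAiProtocol {
    fn name(&self) -> &'static str {
        "openai"
    }
    fn chat_path(&self) -> &'static str {
        "/v1/chat/completions"
    }
}

struct OllamaProtocol;

impl Protocol for OllamaProtocol {
    fn name(&self) -> &'static str {
        "ollama"
    }
    fn chat_path(&self) -> &'static str {
        "/api/chat"
    }
}

// A provider pairs a protocol with deployment details such as the base URL.
struct Provider<P: Protocol> {
    protocol: P,
    base_url: String,
}

impl<P: Protocol> Provider<P> {
    fn new(protocol: P, base_url: &str) -> Self {
        // Normalize the base URL so joining never produces "//".
        Self { protocol, base_url: base_url.trim_end_matches('/').to_string() }
    }

    fn chat_url(&self) -> String {
        format!("{}{}", self.base_url, self.protocol.chat_path())
    }
}

fn main() {
    // Two OpenAI-compatible providers share one protocol implementation.
    let openai = Provider::new(OpenAiProtocol, "https://api.openai.com");
    let deepseek = Provider::new(OpenAiProtocol, "https://api.deepseek.com/");
    let ollama = Provider::new(OllamaProtocol, "http://localhost:11434");

    println!("{} -> {}", openai.protocol.name(), openai.chat_url());
    println!("{} -> {}", deepseek.protocol.name(), deepseek.chat_url());
    println!("{} -> {}", ollama.protocol.name(), ollama.chat_url());
}
```

Adding a new OpenAI-compatible service then needs only a new base URL, while a new wire format needs one new `Protocol` implementation.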

Examples

Check out the examples/ directory:

# Test your API keys from keys.yaml
cargo run --example test_keys_yaml

# Debug DeepSeek authentication
cargo run --example debug_deepseek -- sk-your-key

# Simple fetch_models() demo
cargo run --example fetch_models_simple

# Ollama model management (NEW!)
cargo run --example ollama_model_management

# Anthropic streaming (NEW! - requires streaming feature)
cargo run --example anthropic_streaming --features streaming

# Ollama streaming (NEW! - requires streaming feature)
cargo run --example ollama_streaming --features streaming

# LongCat demo (OpenAI/Anthropic compatible)
cargo run --example longcat_dual

Example Descriptions

test_keys_yaml.rs ⭐ New!

  • Tests all API keys from your keys.yaml file
  • Validates API key format and authentication
  • Provides specific troubleshooting for each error
  • Run this first if you have authentication issues!

debug_deepseek.rs ⭐ New!

  • Interactive debugging tool for DeepSeek API
  • Validates API key format
  • Tests model fetching and chat requests
  • Provides detailed troubleshooting guidance

fetch_models_simple.rs

  • Simple demonstration of fetch_models()
  • Shows how to fetch models from OpenAI-compatible providers
  • Includes usage recommendations

ollama_model_management.rs ⭐ New!

  • Demonstrates complete Ollama model management functionality
  • Shows how to list, pull, delete, and get model details
  • Includes error handling and practical usage examples

anthropic_streaming.rs ⭐ New!

  • Shows enhanced Anthropic streaming with proper event handling
  • Demonstrates real-time response streaming and usage tracking
  • Includes both regular and streaming chat examples

Removed redundant examples

  • test_fetch_models.rs and test_with_keys.rs overlapped with other examples and have been removed.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT