# LLM Provider Expansion Guide
**Paladin Multi-Provider Support**
This document compares the LLM providers Paladin supports and explains how to configure and use each one.
---
## Table of Contents
- [Overview](#overview)
- [Provider Comparison](#provider-comparison)
- [Configuration Guide](#configuration-guide)
- [Use Case Recommendations](#use-case-recommendations)
- [Migration Guide](#migration-guide)
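- [Performance Characteristics](#performance-characteristics)
- [Best Practices](#best-practices)
- [Troubleshooting](#troubleshooting)
- [Additional Resources](#additional-resources)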
---
## Overview
Paladin supports multiple LLM providers out of the box, allowing you to choose the best provider for your specific needs. All providers implement the same `LlmPort` trait, making it easy to switch between them without changing your application logic.
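To illustrate why a shared trait makes switching cheap, here is a minimal sketch. `SketchLlmPort`, `SketchOpenAi`, and `SketchDeepSeek` are hypothetical stand-ins invented for this example; the real `LlmPort` trait is not reproduced here and its methods will differ.

```rust
/// Hypothetical, simplified stand-in for Paladin's `LlmPort` trait.
/// Names and signatures are assumptions made for illustration only.
trait SketchLlmPort {
    fn provider_name(&self) -> &'static str;
    fn generate(&self, prompt: &str) -> String;
}

/// Placeholder providers that tag their output instead of calling an API.
struct SketchOpenAi;
struct SketchDeepSeek;

impl SketchLlmPort for SketchOpenAi {
    fn provider_name(&self) -> &'static str {
        "openai"
    }
    fn generate(&self, prompt: &str) -> String {
        format!("[{}] {}", self.provider_name(), prompt)
    }
}

impl SketchLlmPort for SketchDeepSeek {
    fn provider_name(&self) -> &'static str {
        "deepseek"
    }
    fn generate(&self, prompt: &str) -> String {
        format!("[{}] {}", self.provider_name(), prompt)
    }
}

/// Application logic depends only on the trait, never on a concrete provider.
fn summarize(llm: &dyn SketchLlmPort, text: &str) -> String {
    llm.generate(&format!("Summarize: {}", text))
}
```

Because `summarize` only sees the trait object, swapping providers means changing which value is constructed, not the function itself.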
### Supported Providers
1. **OpenAI** (GPT-4, GPT-3.5-turbo, GPT-4 Vision)
2. **DeepSeek** (DeepSeek-Chat, DeepSeek-Coder)
3. **Anthropic** (Claude 3.5 Sonnet, Claude 3 Opus/Haiku)
---
## Provider Comparison
| Feature | OpenAI | DeepSeek | Anthropic |
|---------|--------|----------|-----------|
| **Streaming** | ✅ Yes | ✅ Yes | ✅ Yes |
| **Tool Calling** | ✅ Yes | ✅ Yes | ✅ Yes |
| **Function Calling** | ✅ Yes | ✅ Yes | ✅ Yes |
| **Vision/Images** | ✅ GPT-4V | ❌ No | ✅ Claude 3+ |
| **Max Context** | 128K (GPT-4) | 64K | 200K (Claude 3) |
| **Best For** | General purpose, production | Cost-effective, reasoning | Safety-critical, analysis |
| **Pricing** | $$ | $ | $$$ |
| **Latency** | Low | Low | Low-Medium |
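The comparison table can also be read as a selection rule: filter by hard requirements (vision, context size), then prefer the cheapest provider that remains. The sketch below encodes that idea. The `Provider`, `Profile`, and `cheapest_fit` names are hypothetical, the numbers are copied from the table (treating "K" as 1,000 tokens), and the `$` to `$$$` tiers are mapped to ranks 1 to 3 as an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    OpenAi,
    DeepSeek,
    Anthropic,
}

/// One row of the comparison table, reduced to the fields used for selection.
struct Profile {
    provider: Provider,
    supports_vision: bool,
    max_context_tokens: u32,
    /// 1 = "$", 2 = "$$", 3 = "$$$" (assumed mapping of the pricing tiers).
    cost_rank: u8,
}

const PROFILES: [Profile; 3] = [
    Profile { provider: Provider::DeepSeek, supports_vision: false, max_context_tokens: 64_000, cost_rank: 1 },
    Profile { provider: Provider::OpenAi, supports_vision: true, max_context_tokens: 128_000, cost_rank: 2 },
    Profile { provider: Provider::Anthropic, supports_vision: true, max_context_tokens: 200_000, cost_rank: 3 },
];

/// Return the cheapest provider that satisfies both requirements, if any.
fn cheapest_fit(needs_vision: bool, context_tokens: u32) -> Option<Provider> {
    PROFILES
        .iter()
        .filter(|p| !needs_vision || p.supports_vision)
        .filter(|p| p.max_context_tokens >= context_tokens)
        .min_by_key(|p| p.cost_rank)
        .map(|p| p.provider)
}
```

A real deployment would likely weigh latency and output quality as well; this sketch only captures the hard constraints.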
### Detailed Feature Matrix
#### OpenAI
- **Strengths**:
  - Most mature ecosystem with extensive tooling
  - Wide range of models (GPT-4, GPT-3.5-turbo, GPT-4 Vision)
  - Excellent for general-purpose applications
  - Strong vision/multimodal capabilities
  - Large community and documentation
- **Limitations**:
  - Higher cost compared to alternatives
  - Context window smaller than Claude
  - Rate limiting on free tier
- **Ideal Use Cases**:
  - Production deployments requiring reliability
  - Applications needing vision/image analysis
  - General-purpose AI assistants
  - Well-documented, standard use cases
#### DeepSeek
- **Strengths**:
  - Most cost-effective option
  - Strong reasoning and code generation
  - High throughput capabilities
  - Good for analytical tasks
  - Competitive performance at lower cost
- **Limitations**:
  - Smaller context window (64K)
  - No vision support
  - Newer ecosystem, fewer community resources
- **Ideal Use Cases**:
  - Cost-sensitive deployments
  - Code generation and analysis
  - Logical reasoning tasks
  - High-volume/batch processing
  - Internal tooling and development
#### Anthropic Claude
- **Strengths**:
  - Largest context window (200K tokens)
  - Strong safety and ethical guidelines
  - Excellent for complex analysis
  - Superior long-document processing
  - Strong instruction following
- **Limitations**:
  - Higher cost
  - Claude-specific API differences (system messages separate)
  - Requires `max_tokens` parameter
- **Ideal Use Cases**:
  - Safety-critical applications
  - Complex document analysis
  - Long-context reasoning
  - Compliance and governance
  - Medical/legal/financial applications
---
## Configuration Guide
### Environment Variables
All providers can be configured via environment variables:
```bash
# OpenAI
export OPENAI_API_KEY="sk-..."
# DeepSeek
export DEEPSEEK_API_KEY="..."
export DEEPSEEK_BASE_URL="https://api.deepseek.com/v1" # Optional
export DEEPSEEK_MODEL="deepseek-chat" # Optional
# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
export ANTHROPIC_BASE_URL="https://api.anthropic.com/v1" # Optional
export ANTHROPIC_MODEL="claude-3-5-sonnet-20241022" # Optional
```
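As a rough illustration of how the required and optional variables above might be resolved, the sketch below takes a lookup function so it can be exercised without touching the real environment. `ProviderSettings` and `resolve_deepseek_settings` are hypothetical names, and the fallback values are an assumption based on the URLs and model names shown above; this is not the actual `DeepSeekConfig::from_env` implementation.

```rust
#[derive(Debug, PartialEq)]
struct ProviderSettings {
    api_key: String,
    base_url: String,
    model: String,
}

/// Resolve DeepSeek settings from a variable lookup.
/// The API key is required; base URL and model fall back to assumed defaults.
fn resolve_deepseek_settings<F>(lookup: F) -> Result<ProviderSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    let api_key = lookup("DEEPSEEK_API_KEY")
        .filter(|value| !value.is_empty())
        .ok_or_else(|| "DEEPSEEK_API_KEY is not set".to_string())?;

    Ok(ProviderSettings {
        api_key,
        base_url: lookup("DEEPSEEK_BASE_URL")
            .unwrap_or_else(|| "https://api.deepseek.com/v1".to_string()),
        model: lookup("DEEPSEEK_MODEL").unwrap_or_else(|| "deepseek-chat".to_string()),
    })
}
```

In an application the lookup would typically be `|name| std::env::var(name).ok()`.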
### Configuration Files
Add provider configurations to `config.yml`:
```yaml
llm:
  # Default provider if multiple are configured
  default_provider: "openai"
  openai:
    api_key: "${OPENAI_API_KEY}"
    base_url: "https://api.openai.com/v1"
    model: "gpt-4"
    timeout_seconds: 30
  deepseek:
    api_key: "${DEEPSEEK_API_KEY}"
    base_url: "https://api.deepseek.com/v1"
    model: "deepseek-chat"
    timeout_seconds: 60
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    base_url: "https://api.anthropic.com/v1"
    model: "claude-3-5-sonnet-20241022"
    timeout_seconds: 30
```
### Programmatic Configuration
#### OpenAI
```rust
use paladin::infrastructure::adapters::llm::openai_adapter::OpenAILlmAdapter;
use std::time::Duration;

let adapter = OpenAILlmAdapter::new(
    api_key,
    None, // Use default base URL
    Some(Duration::from_secs(30)), // Request timeout
)?;
```
#### DeepSeek
```rust
use paladin::infrastructure::adapters::llm::deepseek_adapter::{
    DeepSeekAdapter, DeepSeekConfig,
};

// From environment
let config = DeepSeekConfig::from_env()?;
let adapter = DeepSeekAdapter::new(config)?;

// Or custom
let config = DeepSeekConfig::new(
    api_key,
    "https://api.deepseek.com/v1".to_string(),
    "deepseek-chat".to_string(),
);
let adapter = DeepSeekAdapter::new(config)?;
```
#### Anthropic
```rust
use paladin::infrastructure::adapters::llm::anthropic_adapter::{
    AnthropicAdapter, AnthropicConfig,
};

// From environment
let config = AnthropicConfig::from_env()?;
let adapter = AnthropicAdapter::new(config)?;

// Or custom
let config = AnthropicConfig::new(
    api_key,
    "https://api.anthropic.com/v1".to_string(),
    "claude-3-5-sonnet-20241022".to_string(),
);
let adapter = AnthropicAdapter::new(config)?;
```
---
## Use Case Recommendations
### When to Use OpenAI
**Best for:**
- General-purpose AI applications
- Production deployments requiring proven reliability
- Applications needing vision/image analysis
- Multimodal applications
- Projects with complex tooling requirements
**Example Use Cases:**
- Customer support chatbots
- Content generation systems
- Image analysis and description
- General AI assistants
- Document Q&A systems
### When to Use DeepSeek
**Best for:**
- Cost-sensitive deployments
- Code generation and analysis
- Logical reasoning tasks
- High-volume batch processing
- Internal development tools
**Example Use Cases:**
- Code review automation
- Test generation
- Documentation generation
- Internal knowledge bases
- Analytical pipelines
### When to Use Anthropic Claude
**Best for:**
- Safety-critical applications
- Long-document analysis
- Complex reasoning tasks
- Compliance-sensitive domains
- High-stakes decision support
**Example Use Cases:**
- Legal document analysis
- Medical record processing
- Financial compliance checking
- Research paper analysis
- Complex contract review
---
## Migration Guide
### From OpenAI to DeepSeek
DeepSeek uses an OpenAI-compatible API, making migration straightforward:
```rust
// Before (OpenAI)
let llm_port = Arc::new(OpenAILlmAdapter::new(
    openai_key,
    None,
    Some(Duration::from_secs(30)),
)?);

// After (DeepSeek)
let config = DeepSeekConfig::from_env()?;
let llm_port = Arc::new(DeepSeekAdapter::new(config)?);

// Your Paladin code remains the same
let paladin = PaladinBuilder::new(llm_port)
    .system_prompt("Your prompt")
    .build()?;
```
**Considerations:**
- DeepSeek has no vision support
- Context window is 64K vs 128K for GPT-4
- Response style may differ slightly
### From OpenAI to Anthropic
Switching to the Anthropic adapter follows the same pattern, but Claude's API differs from OpenAI's in several ways that are worth knowing:
```rust
// Before (OpenAI)
let llm_port = Arc::new(OpenAILlmAdapter::new(
    openai_key,
    None,
    Some(Duration::from_secs(30)),
)?);

// After (Anthropic)
let config = AnthropicConfig::from_env()?;
let llm_port = Arc::new(AnthropicAdapter::new(config)?);

// Your Paladin code remains the same
let paladin = PaladinBuilder::new(llm_port)
    .system_prompt("Your prompt")
    .build()?;
```
**Key Differences:**
- Claude's API requires a `max_tokens` parameter (the adapter defaults to 4096)
- System messages are sent separately
- Larger context window (200K tokens)
- Different SSE streaming format
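To make the first two differences concrete, the sketch below converts an OpenAI-style message list into a Claude-style request shape: system messages are lifted into a separate field and `max_tokens` is always populated. All types and the `to_claude_style` function are hypothetical illustrations, not the adapter's actual code; the 4096 default is taken from the list above.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: Role,
    content: String,
}

/// Hypothetical request shape: system text lives outside the message list.
#[derive(Debug, PartialEq)]
struct ClaudeStyleRequest {
    system: Option<String>,
    messages: Vec<ChatMessage>,
    max_tokens: u32,
}

const DEFAULT_MAX_TOKENS: u32 = 4096;

/// Split system messages out of the conversation and fill in `max_tokens`.
fn to_claude_style(messages: &[ChatMessage], max_tokens: Option<u32>) -> ClaudeStyleRequest {
    let (system_parts, conversation): (Vec<ChatMessage>, Vec<ChatMessage>) = messages
        .iter()
        .cloned()
        .partition(|message| message.role == Role::System);

    // Multiple system messages are joined into one block of text.
    let system = if system_parts.is_empty() {
        None
    } else {
        let joined: Vec<&str> = system_parts.iter().map(|m| m.content.as_str()).collect();
        Some(joined.join("\n\n"))
    };

    ClaudeStyleRequest {
        system,
        messages: conversation,
        max_tokens: max_tokens.unwrap_or(DEFAULT_MAX_TOKENS),
    }
}
```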
### Provider Fallback Pattern
Implement graceful fallback for higher reliability:
```rust
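use paladin::infrastructure::adapters::llm::anthropic_adapter::{
    AnthropicAdapter, AnthropicConfig,
};
use paladin::infrastructure::adapters::llm::deepseek_adapter::{
    DeepSeekAdapter, DeepSeekConfig,
};
use paladin::infrastructure::adapters::llm::openai_adapter::OpenAILlmAdapter;
use paladin::paladin_ports::output::llm_port::LlmPort;
use std::sync::Arc;
use std::time::Duration;

fn create_llm_provider() -> Result<Arc<dyn LlmPort>, Box<dyn std::error::Error>> {
    // Try DeepSeek first (cost-effective)
    if let Ok(config) = DeepSeekConfig::from_env() {
        if let Ok(adapter) = DeepSeekAdapter::new(config) {
            return Ok(Arc::new(adapter));
        }
    }

    // Fallback to Anthropic (powerful)
    if let Ok(config) = AnthropicConfig::from_env() {
        if let Ok(adapter) = AnthropicAdapter::new(config) {
            return Ok(Arc::new(adapter));
        }
    }

    // Final fallback to OpenAI (default); a missing key is returned as an error
    let api_key = std::env::var("OPENAI_API_KEY")?;
    Ok(Arc::new(OpenAILlmAdapter::new(
        api_key,
        None,
        Some(Duration::from_secs(30)),
    )?))
}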
```
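The function above silently discards the reason each earlier provider was skipped. If that information matters for diagnostics, one possible variation is a generic helper that tries candidates in order and returns every failure when none succeeds. `first_available` is a hypothetical helper written for this guide, not part of Paladin.

```rust
/// Try each named constructor in order and return the first success.
/// If all fail, return every (name, error) pair so the caller can log them.
fn first_available<T, E>(
    candidates: Vec<(&'static str, Box<dyn Fn() -> Result<T, E>>)>,
) -> Result<(&'static str, T), Vec<(&'static str, E)>> {
    let mut failures = Vec::new();
    for (name, build) in candidates {
        match build() {
            Ok(value) => return Ok((name, value)),
            Err(err) => failures.push((name, err)),
        }
    }
    Err(failures)
}
```

Each closure would wrap one provider's `from_env` and `new` calls; the returned name tells the caller which provider was actually selected.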
---
## Performance Characteristics
### Latency Comparison (Approximate)
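| Provider | Time to First Token | Total Response Time | Throughput |
|----------|---------------------|---------------------|------------|
| OpenAI GPT-4 | 500-800ms | 1-2s | Medium |
| OpenAI GPT-3.5 | 200-400ms | 500ms-1s | High |
| DeepSeek | 300-600ms | 800ms-1.5s | High |
| Anthropic Claude | 400-700ms | 1-2s | Medium |

*Note: Actual performance varies based on request size, load, and region.*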
### Cost Comparison (Approximate)
**Per 1M Tokens (Input/Output):**

| Provider | Model | Input | Output |
|----------|-------|-------|--------|
| OpenAI | GPT-4 | $10 | $30 |
| OpenAI | GPT-3.5-turbo | $0.50 | $1.50 |
| DeepSeek | deepseek-chat | $0.10 | $0.20 |
| Anthropic | Claude 3.5 Sonnet | $3 | $15 |

*Prices are approximate and subject to change.*
### Scaling Considerations
**OpenAI:**
- Rate limits: Tier-based (requests/min, tokens/min)
- Horizontal scaling: Good
- Burst capacity: Moderate
**DeepSeek:**
- Rate limits: Generous
- Horizontal scaling: Excellent (high throughput)
- Burst capacity: High
**Anthropic:**
- Rate limits: Tier-based
- Horizontal scaling: Good
- Burst capacity: Moderate
---
## Best Practices
### 1. Use Provider Capabilities
Query provider capabilities before attempting operations:
```rust
let caps = provider.get_capabilities();

if caps.supports_vision {
    // Send image-based requests
}

if caps.supports_streaming {
    // Use streaming for better UX
}
```
### 2. Set Appropriate Timeouts
Different providers may have different response times:
```rust
// Claude requests with long contexts can take longer. AnthropicConfig::new
// has no timeout argument here; the timeout is handled internally.
let claude_config = AnthropicConfig::new(/* ... */);

// Explicit 30-second timeout for OpenAI
let openai = OpenAILlmAdapter::new(
    api_key,
    None,
    Some(Duration::from_secs(30)),
)?;
```
### 3. Handle Provider-Specific Errors
```rust
match provider.generate(&request).await {
    Ok(response) => {
        // Handle response
    }
    Err(LlmError::RateLimitExceeded { retry_after }) => {
        // Wait for the interval the provider asked for (in seconds)
        tokio::time::sleep(Duration::from_secs(retry_after)).await;
        // Retry
    }
    Err(LlmError::AuthenticationError(_)) => {
        // Check API keys
    }
    Err(e) => {
        // Handle other errors, for example by logging `e`
    }
}
```
### 4. Monitor Usage and Costs
```rust
let response = provider.generate(&request).await?;
// Log token usage
println!("Input tokens: {}", response.usage.prompt_tokens);
println!("Output tokens: {}", response.usage.completion_tokens);
// `calculate_cost` is an application-defined helper, not provided by this snippet
println!("Total cost: ${}", calculate_cost(&response, provider_name));
```
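A cost helper along these lines could be built from the approximate prices in the Cost Comparison section. The sketch below is keyed by model name rather than provider name, since prices differ between models from the same provider. `TokenUsage`, `price_per_million`, and `estimate_cost_usd` are hypothetical names, and the prices are the approximate figures from this guide, so treat the result as an estimate only.

```rust
/// Hypothetical usage record mirroring the token counts logged above.
struct TokenUsage {
    prompt_tokens: u64,
    completion_tokens: u64,
}

/// Approximate (input, output) prices in USD per 1M tokens, copied from the
/// cost table in this guide. Unknown models return `None`.
fn price_per_million(model: &str) -> Option<(f64, f64)> {
    match model {
        "gpt-4" => Some((10.0, 30.0)),
        "gpt-3.5-turbo" => Some((0.50, 1.50)),
        "deepseek-chat" => Some((0.10, 0.20)),
        "claude-3-5-sonnet-20241022" => Some((3.0, 15.0)),
        _ => None,
    }
}

/// Estimate the cost of one request in USD.
fn estimate_cost_usd(usage: &TokenUsage, model: &str) -> Option<f64> {
    let (input_price, output_price) = price_per_million(model)?;
    let input_cost = usage.prompt_tokens as f64 * input_price;
    let output_cost = usage.completion_tokens as f64 * output_price;
    Some((input_cost + output_cost) / 1_000_000.0)
}
```

Returning `None` for unknown models avoids silently reporting a zero cost when the price table is out of date.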
---
## Troubleshooting
### Authentication Errors
**Issue:** `LlmError::AuthenticationError`
**Solutions:**
1. Verify API key is set correctly
2. Check API key has necessary permissions
3. Ensure API key hasn't expired
4. Verify base URL is correct for your region
### Rate Limiting
**Issue:** `LlmError::RateLimitExceeded`
**Solutions:**
1. Retry with exponential backoff (built in to the adapters)
2. Consider upgrading API tier
3. Implement request queuing
4. Switch to provider with higher limits
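For callers that add their own retry loop or request queue on top of the adapters, the delay schedule for exponential backoff can be sketched as follows. `backoff_delay` is a hypothetical helper, not the adapters' built-in logic; it doubles the delay on each attempt and caps it, and omits jitter for brevity.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
/// Overflow in either step falls back to the cap.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}
```

With a 500 ms base and a 30 s cap, the delays run 500 ms, 1 s, 2 s, 4 s, and so on until they reach the cap. When the provider returns a `retry_after` value, waiting for that interval instead is usually the better choice.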
### Timeout Errors
**Issue:** `LlmError::Timeout`
**Solutions:**
1. Increase timeout duration
2. Reduce request complexity
3. Check network connectivity
4. Consider switching to streaming mode
### Context Length Errors
**Issue:** `LlmError::InvalidRequest` (context too long)
**Solutions:**
1. Reduce input size
2. Switch to provider with larger context (Claude: 200K)
3. Implement context windowing
4. Summarize older conversation history
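Context windowing (solution 3) can be as simple as keeping the most recent turns that fit a token budget. The sketch below assumes a rough estimate of four characters per token, which is only a heuristic; a real implementation should use the provider's tokenizer. `Turn`, `approx_tokens`, and `window_history` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Turn {
    content: String,
}

/// Rough token estimate: about four characters per token, rounded up.
/// This is an assumption for illustration, not a real tokenizer.
fn approx_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Keep the newest turns whose combined estimate fits within the budget,
/// returned in their original chronological order.
fn window_history(history: &[Turn], budget_tokens: usize) -> Vec<Turn> {
    let mut used = 0;
    let mut kept = Vec::new();
    // Walk backwards from the newest turn and stop at the first one that
    // would exceed the budget.
    for turn in history.iter().rev() {
        let cost = approx_tokens(&turn.content);
        if used + cost > budget_tokens {
            break;
        }
        used += cost;
        kept.push(turn.clone());
    }
    kept.reverse();
    kept
}
```

Dropped turns could be replaced by a short summary (solution 4) rather than discarded outright.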
---
## Additional Resources
- [Paladin Examples](../examples/) - Working code examples
- [Contributing Providers Guide](./CONTRIBUTING_PROVIDERS.md) - Add new providers
- [API Documentation](https://docs.rs/paladin) - Full API reference
- [GitHub Issues](https://github.com/DF3NDR/paladin/issues) - Report issues
---
**Last Updated:** January 2026
**Version:** 0.1.0