llmrs
A focused Rust SDK for calling IBM WatsonX APIs: watsonx.ai (text generation) and watsonx.orchestrate (assistants and chat).
- watsonx.ai — generate text, stream, list models, batch, chat completion
- watsonx.orchestrate — list agents, create threads, send/stream messages
Optional features: data and governance (off by default). See ARCHITECTURE.md.
Orchestrate API reference: https://developer.ibm.com/apis/catalog/watsonorchestrate--custom-assistants/api
🚀 Quick Start (5 Minutes)
1. Add to Cargo.toml
[dependencies]
= "0.1"
tokio = { version = "1.0", features = ["full"] }
2. Set up credentials
Copy .env.example to .env and set values locally. Do not commit .env.
Env var names are listed in .env.example and in src/env.rs.
3. Generate text with WatsonX AI (One-Line Connection!)
use ;
async
4. Chat with Watson Orchestrate (One-Line Connection!)
use OrchestrateConnection;
async
📖 Core Usage Patterns
Important: You must specify a model before generating text. Use GenerationConfig::default().with_model(model_id) to set the model.
Pattern 1: Simple Text Generation
use ;
// Set the model and generate text
let config = default
.with_model;
let result = client.generate_text.await?;
println!;
Pattern 2: Streaming for Real-time Output
use ;
// Perfect for interactive applications
let config = default
.with_model;
let result = client.generate_text_stream.await?;
Pattern 3: Custom Configuration
use ;
let config = default
.with_model
.with_max_tokens
.with_top_p;
let result = client.generate_text.await?;
Pattern 4: List Available Models
// Discover what models are available
let models = client.list_models.await?;
for model in models
🤖 Available Models
Popular Models
use models;
// IBM Granite models
GRANITE_4_H_SMALL // Default, best performance
GRANITE_3_3_8B_INSTRUCT // Good balance of speed/quality
GRANITE_3_2_8B_INSTRUCT // Fast generation
// Meta Llama models
LLAMA_3_3_70B_INSTRUCT // High quality, slower
LLAMA_3_1_8B // Good for most tasks
// Mistral models
MISTRAL_MEDIUM_2505 // Excellent quality
MISTRAL_SMALL_3_1_24B_INSTRUCT_2503 // Fast and efficient
// Groq GPT-OSS models (via watsonx Orchestrate 2.0.0+)
GPT_OSS_120B // High-capability agentic use (120B params)
GPT_OSS_20B // Cost-efficient deployment (20B params)
Discover Models Dynamically
// Get all available models
let models = client.list_models.await?;
for model in models
🎛️ Configuration Options
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| (credential) | ✅ | - | Set in .env; see .env.example |
| WATSONX_PROJECT_ID | ✅ | - | WatsonX project ID |
| WATSONX_API_URL | ❌ | https://us-south.ml.cloud.ibm.com | API base URL |
| WATSONX_API_VERSION | ❌ | 2023-05-29 | API version |
| WATSONX_TIMEOUT_SECS | ❌ | 120 | Request timeout (seconds) |
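To make the required/optional split and the defaults concrete, here is a minimal sketch of resolving these variables with only the standard library. The `Settings` struct and `load_settings` function are hypothetical names for illustration and are not the SDK's own loader (see src/env.rs for that). The credential variable is left out because its name is documented only in .env.example.

```rust
use std::time::Duration;

/// Hypothetical settings struct; the SDK's own config type may differ.
#[derive(Debug, PartialEq)]
struct Settings {
    project_id: String,
    api_url: String,
    api_version: String,
    timeout: Duration,
}

/// Builds settings from a lookup function so the logic can be exercised
/// without touching the real process environment.
fn load_settings(get: impl Fn(&str) -> Option<String>) -> Result<Settings, String> {
    // Required: fail early and name the missing variable.
    let project_id = get("WATSONX_PROJECT_ID")
        .ok_or_else(|| "WATSONX_PROJECT_ID is not set".to_string())?;

    // Optional: fall back to the defaults listed in the table.
    let api_url = get("WATSONX_API_URL")
        .unwrap_or_else(|| "https://us-south.ml.cloud.ibm.com".to_string());
    let api_version = get("WATSONX_API_VERSION").unwrap_or_else(|| "2023-05-29".to_string());

    let timeout_secs = match get("WATSONX_TIMEOUT_SECS") {
        Some(raw) => raw
            .parse::<u64>()
            .map_err(|e| format!("WATSONX_TIMEOUT_SECS must be a whole number: {e}"))?,
        None => 120,
    };

    Ok(Settings {
        project_id,
        api_url,
        api_version,
        timeout: Duration::from_secs(timeout_secs),
    })
}

fn main() {
    // In a real program the lookup is simply the process environment.
    match load_settings(|name| std::env::var(name).ok()) {
        Ok(settings) => println!("{settings:?}"),
        Err(message) => eprintln!("configuration error: {message}"),
    }
}
```

Passing the lookup as a closure keeps the function easy to test with a fixed set of values.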
Generation Parameters
let config = default
.with_model // Model to use
.with_max_tokens // Max tokens to generate
.with_top_p // Nucleus sampling
.with_top_k // Top-k sampling
.with_repetition_penalty // Reduce repetition
.with_stop_sequences; // Stop tokens
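The chained `with_*` calls above follow the consuming builder pattern. The sketch below shows how such a chain can be built. `SketchConfig` is a stand-in written for this example, not the SDK's `GenerationConfig`. Its field types, the `validate` helper, and the placeholder values are all assumptions.

```rust
/// Stand-in for illustration only; NOT the SDK's `GenerationConfig`.
/// Field names and types are assumptions.
#[derive(Debug, Clone, Default, PartialEq)]
struct SketchConfig {
    model: Option<String>,
    max_tokens: Option<u32>,
    top_p: Option<f32>,
    top_k: Option<u32>,
    repetition_penalty: Option<f32>,
    stop_sequences: Vec<String>,
}

impl SketchConfig {
    // Each setter takes `self` by value and returns it, which enables chaining.
    fn with_model(mut self, model: impl Into<String>) -> Self {
        self.model = Some(model.into());
        self
    }
    fn with_max_tokens(mut self, n: u32) -> Self {
        self.max_tokens = Some(n);
        self
    }
    fn with_top_p(mut self, p: f32) -> Self {
        self.top_p = Some(p);
        self
    }
    fn with_top_k(mut self, k: u32) -> Self {
        self.top_k = Some(k);
        self
    }
    fn with_repetition_penalty(mut self, penalty: f32) -> Self {
        self.repetition_penalty = Some(penalty);
        self
    }
    fn with_stop_sequences(mut self, stops: Vec<String>) -> Self {
        self.stop_sequences = stops;
        self
    }

    /// Mirrors the rule that a model must be chosen before generating.
    fn validate(&self) -> Result<(), String> {
        match &self.model {
            Some(m) if !m.is_empty() => Ok(()),
            _ => Err("no model set; call with_model first".to_string()),
        }
    }
}

fn main() {
    let config = SketchConfig::default()
        .with_model("example-model-id") // placeholder, not a real model ID
        .with_max_tokens(200)
        .with_top_p(0.9)
        .with_top_k(50)
        .with_repetition_penalty(1.1)
        .with_stop_sequences(vec!["\n\n".to_string()]);
    println!("{config:?} valid={}", config.validate().is_ok());
}
```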
🎯 When to Use Each Method
Use generate_text() when:
- ✅ You need the complete response before processing
- ✅ Batch processing multiple prompts
- ✅ Building APIs that return complete responses
- ✅ Simple, synchronous-style workflows
Use generate_text_stream() when:
- ✅ Building interactive chat applications
- ✅ Real-time user experience is important
- ✅ Processing long responses incrementally
- ✅ Building streaming APIs
Use generate_batch() or generate_batch_simple() when:
- ✅ Processing multiple prompts concurrently
- ✅ Need to maximize throughput
- ✅ Want to collect all results at once
- ✅ Each request can succeed or fail independently
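The difference between waiting for a complete response and consuming a stream can be sketched without the SDK. The example below uses a standard-library channel as a stand-in for a streaming response. `fake_stream`, `collect_all`, and `for_each_chunk` are hypothetical helpers, and the real SDK methods are async rather than thread-based.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a streaming response: sends a reply in chunks.
fn fake_stream(chunks: Vec<&'static str>) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for chunk in chunks {
            // Stop quietly if the receiver has gone away.
            if tx.send(chunk.to_string()).is_err() {
                break;
            }
        }
    });
    rx
}

/// "Complete response" style: wait for everything, then return one String.
fn collect_all(rx: mpsc::Receiver<String>) -> String {
    rx.into_iter().collect()
}

/// "Streaming" style: hand each chunk to a callback as soon as it arrives.
/// Returns the number of chunks seen.
fn for_each_chunk(rx: mpsc::Receiver<String>, mut on_chunk: impl FnMut(&str)) -> usize {
    let mut count = 0;
    for chunk in rx {
        on_chunk(&chunk);
        count += 1;
    }
    count
}

fn main() {
    let full = collect_all(fake_stream(vec!["Hello", ", ", "world"]));
    println!("{full}");

    let n = for_each_chunk(fake_stream(vec!["Hello", ", ", "world"]), |c| print!("[{c}]"));
    println!("\n{n} chunks");
}
```

The callback form lets an interactive UI render text while the rest of the reply is still being produced. The collected form is simpler when the caller only needs the final string.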
🔄 Batch Generation
Batch generation allows you to process multiple prompts concurrently, improving throughput and efficiency.
Pattern 1: Simple Batch with Uniform Configuration
use ;
async
Pattern 2: Batch with Custom IDs and Mixed Configurations
use ;
async
Batch Result Features
- Concurrent Execution: All requests run in parallel for maximum throughput
- Per-Item Error Handling: Each request can succeed or fail independently
- Result Tracking: Track success/failure counts and duration
- Flexible Configuration: Use default config or per-request configs
- Request IDs: Optional IDs for tracking individual requests
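The features listed above (concurrent execution, per-item results, success/failure counts, duration, optional IDs) can be illustrated with a small standard-library sketch. It uses scoped OS threads in place of the SDK's async execution, and `fake_generate`, `ItemResult`, `BatchSummary`, and `run_batch` are hypothetical names rather than SDK types.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// One entry per request; `id` is optional.
#[derive(Debug)]
struct ItemResult {
    id: Option<String>,
    outcome: Result<String, String>,
}

#[derive(Debug)]
struct BatchSummary {
    items: Vec<ItemResult>,
    succeeded: usize,
    failed: usize,
    elapsed: Duration,
}

/// Hypothetical stand-in for a network call: fails on an empty prompt.
fn fake_generate(prompt: &str) -> Result<String, String> {
    if prompt.trim().is_empty() {
        Err("empty prompt".to_string())
    } else {
        Ok(format!("echo: {prompt}"))
    }
}

/// Runs every request on its own thread and collects results in input order.
fn run_batch(requests: &[(Option<&str>, &str)]) -> BatchSummary {
    let started = Instant::now();
    let items: Vec<ItemResult> = thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = requests
            .iter()
            .map(|&(id, prompt)| {
                scope.spawn(move || ItemResult {
                    id: id.map(str::to_string),
                    outcome: fake_generate(prompt),
                })
            })
            .collect();
        // A failed item is recorded in its own slot and does not abort the batch.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    });
    let succeeded = items.iter().filter(|i| i.outcome.is_ok()).count();
    let failed = items.len() - succeeded;
    BatchSummary { items, succeeded, failed, elapsed: started.elapsed() }
}

fn main() {
    let summary = run_batch(&[(Some("greeting"), "hello"), (None, ""), (Some("q2"), "hi")]);
    println!(
        "{} ok, {} failed in {:?}",
        summary.succeeded, summary.failed, summary.elapsed
    );
    for item in &summary.items {
        println!("{:?}: {:?}", item.id, item.outcome);
    }
}
```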
⚙️ WatsonX Orchestrate
The SDK provides comprehensive support for WatsonX Orchestrate with the following capabilities:
Core Features
- Agent Management: List, get, and interact with agents
- Chat & Messaging: Send messages and stream responses with thread management
- Thread Management: List threads and retrieve conversation history
- Skills Management: List and get skills available to agents
- Tools Management: List and get tools available to agents
- Document Collections: Create, manage, and search document collections
- Knowledge Base: Build and query knowledge bases with vector search
- Communication Channels: Manage Twilio WhatsApp, SMS, Slack, and Genesys Bot Connector channels (NEW - v2.1.0)
- Voice Configuration: Configure Deepgram and ElevenLabs for speech-to-text and text-to-speech (NEW - v2.1.0)
Quick Start - Chat with Agents
use ;
async
Environment Setup for Orchestrate
Create a .env file with:
# Required
WXO_INSTANCE_ID=your-instance-id
# Credential: set in .env (see .env.example)
# Optional (defaults to us-south)
WXO_REGION=us-south
Additional Orchestrate Capabilities
use ;
// Get specific agent details
let agent = client.get_agent.await?;
println!;
// List all threads (optionally filter by agent)
let threads = client.list_threads.await?;
for thread in threads
// Get conversation history from a thread
let messages = client.get_thread_messages.await?;
for msg in messages
// List available skills
let skills = client.list_skills.await?;
for skill in skills
// List available tools
let tools = client.list_tools.await?;
for tool in tools
// Get document collection details
let collection = client.get_collection.await?;
println!;
// Get specific document
let document = client.get_document.await?;
println!;
// Delete document
client.delete_document.await?;
Document Collections & Knowledge Base
use ;
use HashMap;
async
Communication Channels (Watsonx Orchestrate 2.1.0+)
The SDK supports managing communication channels for agents, including Twilio WhatsApp, SMS, Slack, and Genesys Bot Connector.
use ;
async
Voice Configuration (Watsonx Orchestrate 2.1.0+)
Configure voice capabilities using Deepgram or ElevenLabs for speech-to-text and text-to-speech.
use ;
async
WatsonX Data (One-Line Connection!)
⚠️ Note: WatsonX Data is temporarily disabled pending API endpoint discovery. The SDK code is complete and tested, but examples are disabled. See docs/disabled-modules/README_WATSONX_DATA_DISABLED.md for details and re-enable instructions.
The SDK provides comprehensive support for WatsonX Data with catalog, schema, table management, and SQL query execution.
use ;
async
Environment Setup for WatsonX Data
Create a .env file with:
Option A - Using Service URL:
# Required: URL and credential (set in .env; see .env.example)
WATSONX_DATA_URL=https://your-watsonx-data-instance.cloud.ibm.com
# Optional (defaults to v3)
WATSONX_DATA_API_VERSION=v3
Option B - Using CRN (Cloud Resource Name):
# Required: CRN and credential (set in .env; see .env.example)
WATSONX_DATA_CRN=crn:v1:bluemix:public:watsonx-data:region:instance_id::
# Optional (defaults to v3)
WATSONX_DATA_API_VERSION=v3
Optional:
# IAM endpoint (defaults to iam.cloud.ibm.com)
IAM_IBM_CLOUD_URL=iam.cloud.ibm.com
Note:
- Credential and env var names are defined only in src/env.rs and .env.example; they are not repeated in the docs.
- You can use either WATSONX_DATA_URL or WATSONX_DATA_CRN (one is required).
- If using CRN, the SDK resolves the endpoint URL from the region in the CRN.
- The CRN is included in API request headers when provided.
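As an illustration of the CRN note, the sketch below pulls the region out of a CRN shaped like the template in Option B. `region_from_crn` is a hypothetical helper, not an SDK function. This README does not show how the SDK turns the region into an endpoint URL, so that step is left out.

```rust
/// Returns the region segment of a CRN shaped like
/// `crn:v1:bluemix:public:watsonx-data:<region>:<instance_id>::`.
fn region_from_crn(crn: &str) -> Option<&str> {
    let mut parts = crn.split(':');
    // The first segment must be the literal "crn".
    if parts.next()? != "crn" {
        return None;
    }
    // Skip version, cloud name, cloud type and service name;
    // the segment after those is the region.
    let region = parts.nth(4)?;
    if region.is_empty() {
        None
    } else {
        Some(region)
    }
}

fn main() {
    let crn = "crn:v1:bluemix:public:watsonx-data:us-south:example-instance::";
    match region_from_crn(crn) {
        Some(region) => println!("region: {region}"),
        None => println!("could not find a region in the CRN"),
    }
}
```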
WatsonX Governance (One-Line Connection!)
⚠️ Note: WatsonX Governance is temporarily disabled pending Cloud Pak for Data (CPD) authentication support. The SDK currently only supports IBM Cloud IAM authentication. See docs/disabled-modules/README_WATSONX_GOVERNANCE_DISABLED.md for details and re-enable instructions.
The SDK provides comprehensive support for WatsonX Governance with model monitoring, bias detection, and compliance management.
use ;
async
Environment Setup for WatsonX Governance
Create a .env file with:
# Required
WATSONX_GOV_SERVICE_INSTANCE_ID=your-service-instance-id
# Optional (defaults shown)
WATSONX_GOV_BASE_URL=https://api.aiopenscale.cloud.ibm.com
WATSONX_GOV_API_VERSION=2025-09-10
📚 Examples
Run these examples to see the SDK in action:
WatsonX AI Examples
# Basic streaming generation
# Compare streaming vs non-streaming
# List available models
# Use predefined model constants
# Batch generation with concurrent execution
WatsonX Orchestrate Examples
# Basic Orchestrate - list agents
# Chat with agents - streaming and non-streaming
# Advanced capabilities - comprehensive feature test
# Practical use cases - real-world scenarios
# Chat with documents - document-based Q&A
# Test agent documents - document discovery
WatsonX Orchestrate Capabilities
The SDK provides comprehensive support for Watson Orchestrate with robust error handling and graceful degradation:
- Agent Management: List, retrieve, and interact with agents
- Conversation Management: Send messages (streaming and non-streaming) with thread context
- Thread Management: Create, list, and delete conversation threads
- Run Management: Track and cancel agent executions
- Tool Management: List, get, execute, update, delete, test, and track tool execution history
- Tool Versioning: Manage tool versions and rollbacks
- Batch Operations: Process multiple messages efficiently
- Document Collections: Manage knowledge bases with vector search
- Chat with Documents: Ask questions about uploaded documents
- Skill Management: List and retrieve available skills
- Advanced Tool Features: Test tools, track execution history, manage versions
Key Features:
- ✅ Real-time streaming with SSE parsing
- ✅ Flexible response parsing for API variations
- ✅ Graceful degradation for unavailable endpoints
- ✅ Comprehensive error handling
- ✅ Thread-based conversation context
See ORCHESTRATE_CAPABILITIES.md for detailed documentation and TESTING_GUIDE.md for testing instructions.
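For readers unfamiliar with Server-Sent Events, here is a simplified sketch of extracting `data:` payloads from an SSE text buffer. It follows the general SSE line format only. `parse_sse_data` is a hypothetical helper, not the SDK's parser, and the shape of the payloads that Orchestrate actually sends is not covered here.

```rust
/// Extracts the `data` payload of each complete event from an SSE text buffer.
/// Simplified: `event:`, `id:` and `retry:` fields are ignored.
fn parse_sse_data(buffer: &str) -> Vec<String> {
    let mut events = Vec::new();
    let mut current: Vec<&str> = Vec::new();

    for line in buffer.lines() {
        if line.is_empty() {
            // A blank line ends the current event.
            if !current.is_empty() {
                events.push(current.join("\n"));
                current.clear();
            }
        } else if line.starts_with(':') {
            // Comment line (often used as a keep-alive); skip it.
        } else if let Some(rest) = line.strip_prefix("data:") {
            // One optional space after the colon is not part of the payload.
            current.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
    }
    // An event without its terminating blank line is incomplete and is dropped.
    events
}

fn main() {
    let raw = ": keep-alive\ndata: Hello\n\ndata: first line\ndata: second line\n\n";
    for (i, payload) in parse_sse_data(raw).iter().enumerate() {
        println!("event {i}: {payload:?}");
    }
}
```

Multiple `data:` lines within one event are joined with a newline, which is how multi-line payloads are carried in the SSE format.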
🔧 Error Handling
The SDK provides comprehensive error handling:
match client.generate_text.await
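The general shape of matching on a typed error can be sketched as follows. `SketchError` and its variants are invented for this example, since the SDK's real error type and variants are not listed in this README. The retry rule (timeouts, HTTP 429, and 5xx) is a common convention, not documented SDK behavior.

```rust
use std::fmt;

/// Hypothetical error type for illustration; not the SDK's error enum.
#[derive(Debug)]
enum SketchError {
    MissingConfig(String),
    Timeout { seconds: u64 },
    Api { status: u16, message: String },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::MissingConfig(name) => write!(f, "missing configuration: {name}"),
            SketchError::Timeout { seconds } => write!(f, "request timed out after {seconds}s"),
            SketchError::Api { status, message } => write!(f, "API error {status}: {message}"),
        }
    }
}

impl std::error::Error for SketchError {}

/// Decides whether a failed call is worth retrying.
fn is_retryable(err: &SketchError) -> bool {
    match err {
        SketchError::Timeout { .. } => true,
        // 429 (too many requests) and server-side errors are usually transient.
        SketchError::Api { status, .. } => *status == 429 || *status >= 500,
        SketchError::MissingConfig(_) => false,
    }
}

fn main() {
    let failures = [
        SketchError::MissingConfig("WATSONX_PROJECT_ID".to_string()),
        SketchError::Timeout { seconds: 120 },
        SketchError::Api { status: 503, message: "service unavailable".to_string() },
    ];
    for err in &failures {
        println!("{err} (retryable: {})", is_retryable(err));
    }
}
```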
🤖 WatsonX AI Quick Start
For a simplified WatsonX AI connection, see docs/WATSONX_AI_QUICK_START.md.
One-line connection:
let client = new.from_env.await?;
Setup:
# .env file
# Credential: set in .env (see .env.example)
WATSONX_PROJECT_ID=your-project-id
Run example:
For more details, see docs/WATSONX_AI_QUICK_START.md.
🤖 Watson Orchestrate Quick Start
For a simplified Watson Orchestrate connection, see docs/QUICK_START.md.
One-line connection:
let client = new.from_env.await?;
Setup:
# .env file
WXO_INSTANCE_ID=your-instance-id
# Orchestrate credential: set in .env (see .env.example)
Run example:
For more details, see docs/QUICK_START.md.
🏗️ Architecture
The SDK is built with:
- Async/Await: Full async support with Tokio
- Type Safety: Strong typing throughout
- Error Handling: Comprehensive error types
- Streaming: Real-time Server-Sent Events processing
- Configuration: Environment-based setup
🚧 Roadmap
Current (watsonx.ai)
- ✅ Text generation (streaming & non-streaming)
- ✅ Model discovery
- ✅ Quality assessment
- ✅ Configuration management
Current (watsonx.orchestrate)
- ✅ Agent management and discovery
- ✅ Conversation with streaming support
- ✅ Thread lifecycle management
- ✅ Tool management (list, get, execute, update, delete, test)
- ✅ Tool versioning and execution history
- ✅ Run tracking and management
- ✅ Document collections and search
- ✅ Chat with documents (Q&A on uploaded docs)
- ✅ Batch message processing
- ✅ Communication channels (Twilio WhatsApp, SMS, Slack, Genesys Bot Connector) (NEW - v2.1.0)
- ✅ Voice configuration (Deepgram, ElevenLabs) (NEW - v2.1.0)
- ✅ Graceful handling of unavailable endpoints
- ✅ Modular code organization (config, client, types)
Planned (watsonx.ai)
- 🔄 Chat completion API
- 🔄 Embeddings generation
- 🔄 Fine-tuning support
- ✅ Batch processing
- ✅ Groq GPT-OSS models (GPT-OSS-120B, GPT-OSS-20B) (NEW - v2.0.0)
Current (watsonx.data)
- ✅ Catalog management (create, list, get, update, delete)
- ✅ Schema management (create, list, get, update, delete)
- ✅ Table management (create, list, get, delete)
- ✅ Column management (list, add)
- ✅ SQL query execution
Current (watsonx.governance)
- ✅ Data mart management (create, list, get, update, delete)
- ✅ Subscription management (create, list)
- ✅ Bias detection and fairness monitoring
- ✅ Model drift detection
- ✅ Monitoring metrics retrieval
🤝 Contributing
We welcome contributions! The SDK is designed to be extensible across the entire WatsonX platform.
📄 License
This project is licensed under the Apache License 2.0.