Octolib: Self-Sufficient AI Provider Library
© 2025 Muvon Un Limited (Hong Kong) | Website | Product Page
🚀 Overview
Octolib is a comprehensive, self-sufficient AI provider library that exposes a unified, type-safe interface for interacting with multiple AI services. It offers intelligent model selection, robust error handling, and advanced features such as cross-provider tool calling and vision support.
✨ Key Features
- 🔌 Multi-Provider Support: OpenAI, Anthropic, OpenRouter, Cerebras, Ollama, Together, Google, Amazon, Cloudflare, DeepSeek, MiniMax, Moonshot AI (Kimi), Z.ai, OctoHub, Local, CLI proxies
- 🛡️ Unified Interface: Consistent API across different providers
- 🔍 Intelligent Model Validation: Strict `provider:model` format parsing with case-insensitive model support
- 📋 Structured Output: JSON and JSON Schema support for OpenAI, OpenRouter, DeepSeek, Together, and Z.ai
- 💰 Cost Tracking: Automatic token usage and cost calculation
- 🖼️ Vision Support: Image and video attachment handling for compatible models (Moonshot Kimi K2.5)
- 🧰 Tool Calling: Cross-provider tool call standardization
- 🧩 CLI Provider: Use `cli:<backend>/<model>` (e.g. `cli:codex/gpt-5.2-codex`). Proxy-only: tools/MCP are not used or controllable.
- ⏱️ Retry Management: Configurable exponential backoff
- 🔒 Secure Design: Environment-based API key management
- 🎯 Embedding Support: Multi-provider embedding generation with Jina, Voyage, Google, OpenAI, Together, OctoHub, FastEmbed, and HuggingFace
- 🔄 Reranking: Document relevance scoring with cross-encoder models (Voyage AI, Cohere, Jina AI, Mixedbread, HuggingFace)
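The `provider:model` convention above can be illustrated with a small standalone sketch. The parsing logic here is illustrative only, not octolib's own code:

```rust
// Illustrative parser for the `provider:model` convention.
// Octolib's actual validation is stricter and provider-aware.
fn parse_model(spec: &str) -> Option<(String, String)> {
    let (provider, model) = spec.split_once(':')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    // Model names are matched case-insensitively.
    Some((provider.to_string(), model.to_lowercase()))
}

fn main() {
    assert_eq!(
        parse_model("openai:GPT-4o"),
        Some(("openai".to_string(), "gpt-4o".to_string()))
    );
    assert_eq!(parse_model("no-colon"), None);
    println!("ok");
}
```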
📦 Quick Installation
```toml
# Add to Cargo.toml (version shown is illustrative)
[dependencies]
octolib = "0.1"
```
🚀 Quick Start
The example below is a sketch: the import path and method names are illustrative, not the verified octolib API (`get_provider_for_model` is named elsewhere in this README).

```rust
use octolib::get_provider_for_model; // illustrative path

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = get_provider_for_model("openai:gpt-4o")?;
    // Hypothetical chat call:
    let response = provider.chat("Hello, Octolib!").await?;
    println!("{}", response.content);
    Ok(())
}
```
📋 Structured Output
Get structured JSON responses with schema validation:
The example below is a sketch; identifiers are illustrative, not the verified octolib API.

```rust
use octolib::get_provider_for_model; // illustrative path
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = get_provider_for_model("openai:gpt-4o")?;
    let schema = json!({
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
    });
    // Hypothetical structured-output call with schema validation:
    let response = provider.chat_structured("Name one city in France.", &schema).await?;
    println!("{}", response.content);
    Ok(())
}
```
🧩 CLI Provider (Proxy Mode)
Use local CLIs as a lightweight proxy. This mode is prompt-only; tool calling/MCP integration is not used or controllable.
```rust
// Sketch; the exact signature may differ.
let provider = get_provider_for_model("cli:codex/gpt-5.2-codex")?;
```
Set a backend-specific command if it is not on PATH:
```shell
CLI_CODEX_COMMAND=/path/to/codex
CLI_CLAUDE_COMMAND=/path/to/claude
CLI_GEMINI_COMMAND=/path/to/gemini
CLI_CURSOR_COMMAND=/path/to/cursor-agent
```
🧰 Tool Calling
Use AI models to call functions with automatic parameter extraction:
The example below is a sketch; the tool-definition shape and method names are illustrative, not the verified octolib API.

```rust
use octolib::get_provider_for_model; // illustrative path
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = get_provider_for_model("openai:gpt-4o")?;
    let tools = json!([{
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": { "city": { "type": "string" } },
            "required": ["city"]
        }
    }]);
    // Hypothetical call that passes tool definitions to the model:
    let response = provider.chat_with_tools("Weather in Hong Kong?", &tools).await?;
    for call in &response.tool_calls {
        println!("model requested tool: {}", call.name);
    }
    Ok(())
}
```
Tool Calling Features:
- ✅ Cross-provider support (OpenAI, Anthropic, Google, Amazon, OpenRouter)
- ✅ Automatic parameter validation via JSON Schema
- ✅ Multi-turn conversations with tool results
- ✅ Parallel tool execution support
- ✅ Standardized `ToolCall` and `GenericToolCall` formats across all providers
- ✅ Provider-specific metadata preservation (e.g., Gemini thought signatures)
- ✅ Clean conversion API with the `to_generic_tool_calls()` method
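For orientation, a standardized tool call carries roughly this shape. The field names follow common OpenAI-style conventions and are illustrative; octolib's exact `ToolCall` serialization may differ:

```json
{
  "id": "call_abc123",
  "name": "get_weather",
  "arguments": { "city": "Hong Kong" }
}
```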
🎯 Embedding Generation
Generate embeddings using multiple providers:
The example below is a sketch; the embedding entry point is illustrative, not the verified octolib API.

```rust
use octolib::embedding; // illustrative path

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical call: provider:model string, then input texts.
    let vectors = embedding::embed("jina:jina-embeddings-v4", &["hello world"]).await?;
    println!("dimensions: {}", vectors[0].len());
    Ok(())
}
```

Supported embedding providers:
- Jina: jina-embeddings-v4, jina-clip-v2, etc.
- Voyage: voyage-3.5, voyage-code-2, etc.
- Google: gemini-embedding-001, text-embedding-005
- OpenAI: text-embedding-3-small, text-embedding-3-large
- FastEmbed: local models (feature-gated)
- HuggingFace: sentence-transformers models
🎯 Document Reranking
Improve search results by scoring document relevance with cross-encoder models:
The example below is a sketch; the `rerank` module is named in this README, but the exact signature is illustrative.

```rust
use octolib::rerank; // illustrative path

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical call: model, query, then candidate documents.
    let scores = rerank::rerank(
        "voyage:rerank-2.5",
        "What is the capital of France?",
        &["Paris is the capital of France.", "Berlin is in Germany."],
    ).await?;
    println!("{:?}", scores);
    Ok(())
}
```

Supported providers:

API-based (require API keys):
- Voyage AI (`VOYAGE_API_KEY`): rerank-2.5, rerank-2.5-lite, rerank-2, rerank-2-lite
- Cohere (`COHERE_API_KEY`): rerank-english-v3.0, rerank-multilingual-v3.0
- Jina AI (`JINA_API_KEY`): jina-reranker-v3, jina-reranker-v2-base-multilingual

Local (no API keys, requires features):
- FastEmbed (`fastembed` feature): bge-reranker-base, bge-reranker-large, jina-reranker-v1-turbo-en
🔐 OAuth Authentication
Octolib supports OAuth authentication for ChatGPT subscriptions and Anthropic:
- OpenAI OAuth (ChatGPT Plus/Pro/Team/Enterprise)
- Anthropic OAuth
The library automatically detects OAuth credentials and prefers them over API keys. See `examples/openai_oauth.rs` and `examples/anthropic_oauth.rs` for full usage examples.
🎯 Provider Support Matrix
| Provider | Structured Output | Vision | Tool Calls | Caching |
|---|---|---|---|---|
| OpenAI | ✅ JSON + Schema | ✅ Yes | ✅ Yes | ✅ Yes |
| OpenRouter | ✅ JSON + Schema | ✅ Yes | ✅ Yes | ✅ Yes |
| DeepSeek | ✅ JSON Mode | ❌ No | ❌ No | ✅ Yes |
| Moonshot AI (Kimi) | ✅ JSON Mode | ✅ kimi-k2.5 | ✅ Yes | ✅ Yes |
| MiniMax | ✅ JSON Mode | ❌ No | ✅ Yes | ✅ Yes |
| Anthropic | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes |
| Z.ai | ✅ JSON Mode | ❌ No | ✅ Yes | ✅ Yes |
| Google Vertex | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| Amazon Bedrock | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| Cloudflare | ❌ No | ❌ No | ❌ No | ❌ No |
Structured Output Details
- JSON Mode: Basic JSON object output
- JSON Schema: Full schema validation with strict mode
- Provider Detection: Use `provider.supports_structured_output(&model)` to check capability
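For reference, a strict-mode JSON Schema request payload of the kind these providers accept looks roughly like the fragment below. The field names follow OpenAI's `response_format` convention and are shown for orientation; octolib may wrap this differently:

```json
{
  "type": "json_schema",
  "json_schema": {
    "name": "weather_report",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "city": { "type": "string" },
        "temperature_c": { "type": "number" }
      },
      "required": ["city", "temperature_c"],
      "additionalProperties": false
    }
  }
}
```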
🧠 Thinking/Reasoning Support
Octolib provides first-class support for models that produce thinking/reasoning content. Thinking is stored separately from the main response content, similar to how tool_calls are separate from content.
The example below is a sketch; `response.exchange.usage` appears later in this README, but the other identifiers are illustrative.

```rust
use octolib::get_provider_for_model; // illustrative path

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = get_provider_for_model("openai:o3")?;
    let response = provider.chat("Prove that 2 + 2 = 4.").await?;
    // Thinking content is stored separately from the main response content.
    if let Some(thinking) = &response.thinking {
        println!("model reasoning: {thinking}");
    }
    println!("{}", response.content);
    Ok(())
}
```
Supported Providers
| Provider | Thinking Format | Notes |
|---|---|---|
| MiniMax | Content blocks (`{"type": "thinking"}`) | Full thinking block extraction |
| OpenAI o-series | `reasoning_content` field | o1, o3, o4 models |
| OpenRouter | `reasoning_details` | Gemini and other providers |
Token Tracking
Thinking tokens are tracked separately in `TokenUsage.reasoning_tokens`:

```rust
if let Some(usage) = &response.exchange.usage {
    println!("reasoning tokens: {:?}", usage.reasoning_tokens);
}
```
📚 Complete Documentation
📖 Quick Navigation
- Overview - Library introduction and core concepts
- Installation Guide - Setup and configuration
- Advanced Usage - Advanced features and customization
- Advanced Guide - Comprehensive usage patterns
- Embedding Guide - Embedding generation with multiple providers
- Reranking Guide - Document relevance scoring
- Tool Calling - Cross-provider tool calling
- Thinking/Reasoning - Reasoning model support
🌐 Supported Providers
| Provider | Status | Capabilities |
|---|---|---|
| OpenAI | ✅ Full Support | Chat, Vision, Tools, Structured Output, Caching |
| Anthropic | ✅ Full Support | Claude Models, Vision, Tools, Caching |
| OpenRouter | ✅ Full Support | Multi-Provider Proxy, Vision, Caching, Structured Output |
| DeepSeek | ✅ Full Support | Open-Source AI Models, Structured Output, Caching |
| Moonshot AI (Kimi) | ✅ Full Support | Kimi K2 Series, Vision (kimi-k2.5), Tools, Structured Output, Caching |
| MiniMax | ✅ Full Support | Anthropic-Compatible API, Tools, Caching, Thinking, Structured Output |
| Z.ai | ✅ Full Support | GLM Models, Caching, Structured Output |
| Google Vertex AI | ✅ Supported | Enterprise AI Integration |
| Amazon Bedrock | ✅ Supported | Cloud AI Services |
| Cloudflare Workers AI | ✅ Supported | Edge AI Compute |
🔒 Privacy & Security
- 🏠 Local-first design
- 🔑 Secure API key management
- 📁 Respects .gitignore
- 🛡️ Comprehensive error handling
🤝 Support & Community
- 🐛 Issues: GitHub Issues
- 📧 Email: opensource@muvon.io
- 🏢 Company: Muvon Un Limited (Hong Kong)
⚖️ License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Built with ❤️ by the Muvon team in Hong Kong