genai
A Native-Protocol Multi-AI Provider Library for Rust
genai = "0.6"
genai provides a single, ergonomic Rust API for native-protocol multi-AI provider access, including Anthropic, OpenAI, Gemini, xAI, Ollama, Groq, and more.
Supports 200+ LLM models and 25+ LLM providers out of the box, including Ollama for local execution.
Out-of-the-box providers: openai, openai_resp, anthropic, gemini, ollama, ollama_cloud, vertex, bedrock_api, bedrock_sigv4, github_copilot, opencode_go, groq, deepseek, cohere, together, fireworks, nebius, mimo, zai, zai_coding, bigmodel, aliyun, baidu, moonshot, aihubmix, open_router, xai
Custom endpoints and auth are also supported through ServiceTargetResolver (see examples/c06-target-resolver.rs), which makes it possible to reach any other provider.
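// Can talk to any model / provider
let client = Client::default();
let question = "Why is the sky red?";
let chat_req = ChatRequest::new(vec![ChatMessage::user(question)]);
// Model names can even carry a reasoning effort suffix, such as "-high"; the effort is applied and the suffix is removed from the name sent to the provider.
// "gpt-5.4-mini" is the MODEL_OPENAI value used in the Examples section.
let chat_res = client.exec_chat("gpt-5.4-mini", chat_req, None).await?;
println!("{}", chat_res.first_text().unwrap_or("NO ANSWER"));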
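The suffix handling mentioned in the comment above can be pictured with a small standalone sketch. This is not genai's implementation: the `Effort` enum, the `split_effort_suffix` function, and the recognized suffixes (`-low`, `-medium`, `-high`) are assumptions made for illustration, and only `-high` is named in the text.

```rust
/// Hypothetical effort levels; not genai's `ReasoningEffort` type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Effort {
    Low,
    Medium,
    High,
}

/// Splits an assumed trailing effort suffix off a model name.
/// Returns the name that would be sent to the provider, plus the effort if one was found.
fn split_effort_suffix(model: &str) -> (&str, Option<Effort>) {
    // Assumed suffix table; only "-high" is mentioned in the README text.
    const SUFFIXES: [(&str, Effort); 3] = [
        ("-low", Effort::Low),
        ("-medium", Effort::Medium),
        ("-high", Effort::High),
    ];
    for (suffix, effort) in SUFFIXES {
        if let Some(base) = model.strip_suffix(suffix) {
            // A bare suffix with no model name in front is left untouched.
            if !base.is_empty() {
                return (base, Some(effort));
            }
        }
    }
    (model, None)
}
```

With this sketch, `split_effort_suffix("gpt-5.4-mini-high")` yields the name `gpt-5.4-mini` together with `Effort::High`, while a name without a recognized suffix is returned unchanged.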
Docs for LLMs | CHANGELOG | BIG THANKS
v0.6.x Released 🎉
v0.6.0 release date: 2026-05-23
Here’s what’s new:
- New Adapters:
  - AWS Bedrock (`bedrock_api` and `bedrock_sigv4` adapters)
  - `open_router`
  - `vertex` (with Gemini and Anthropic support)
  - `github_copilot` (GitHub Models API)
  - `opencode_go`
  - `baidu`
  - `aliyun`
  - `moonshot`
  - `aihubmix`
  - `ollama_cloud` (Ollama Cloud)
- Reasoning effort additions: Added `ReasoningEffort::Max` for Anthropic and `ReasoningEffort::XHigh` for OpenAI.
- ProviderConfig for model listing: `Client::all_model_names(adapter_kind, provider_config)` now accepts endpoint and auth overrides, including remote Ollama hosts and custom OpenAI-compatible model listing.
- Ollama and Ollama Cloud: Now use the native Ollama API protocol.
- Gemini schema compatibility: Gemini and Vertex Gemini structured output and tool schemas now normalize common JSON Schema shapes, including `const`, nullable schema patterns, `additionalProperties`, and JSON Schema-only keywords rejected by Vertex.
- Bound adapter clients: `ClientBuilder::with_adapter_kind(...)` and `ClientConfig::with_adapter_kind(...)` bind a client to a single provider adapter, which is useful for proxies, gateways, Azure-style deployment names, and OpenAI-compatible providers with nonstandard model names.
- ModelSpec and ServiceTarget: Model arguments can be represented as a model name, explicit `ModelIden`, or complete `ServiceTarget`, enabling custom endpoints, auth, and model identity without relying on model-name inference.
- OpenAI Responses stateful sessions: OpenAI Responses supports session continuity with `previous_response_id`, request `store`, and returned `response_id`.
- Chat extra body: `ChatOptions::with_extra_body(...)` provides a low-level request body extension point for provider-specific fields in OpenAI-compatible chat payloads.
- Tool choice: `ChatOptions::with_tool_choice(...)` adds provider-neutral tool selection hints for automatic, disabled, required, or specific tool calls.
- Built-in tools and WebSearch: Added typed built-in tool support, including `ToolName`, `ToolConfig`, `WebSearch`, and provider mappings for Anthropic, OpenAI, and Gemini.
- Prompt cache controls: Chat-level `CacheControl` support adds provider-specific prompt caching options, including OpenAI `prompt_cache_key` and cache retention.
- Updated API: Refined `ReasoningContent` and `StopReason` handling (v0.6.0-beta.20), including `ContentPart::ReasoningContent` and provider stop reasons.
- Perf Improvements: HTTP requests use performance optimizations such as gzip, `TCP_NODELAY`, and HTTP/2 tuning.
- Numerous fixes, optimizations, and API enhancements.
See v0.5.x to v0.6.x migration
See CHANGELOG
See BIG-THANKS for contributors
Key Features
- Multi-AI provider/model access optimized per provider: native protocols when available, OpenAI-compatible APIs when appropriate or required, and one common Rust API for OpenAI, OpenAI Responses, Anthropic, Gemini, Ollama, Ollama Cloud, OpenCode Go, Groq, xAI, DeepSeek, Cohere, Together, Fireworks, Nebius, Mimo, Zai, BigModel, Aliyun, Google Vertex, and GitHub Copilot (direct chat and streaming) (see examples/c00-readme.rs)
- Image analysis (for OpenAI, Gemini Flash-2, Anthropic) (see examples/c07-image.rs)
- Custom auth/API key (see examples/c02-auth.rs)
- Model aliases (see examples/c05-model-names.rs)
- Custom endpoint, auth, and model identifier (see examples/c06-target-resolver.rs)
- And much more
Examples | Thanks | Library Focus | Changelog | Provider Mapping: ChatOptions | Usage
Model to Adapter Resolution
By default, the library resolves the AdapterKind (AI provider) based on the model name prefix:
- OpenAI: `gpt-*` (most), `o1-*`, `o3-*`, `o4-*`, `chatgpt-*`, `codex-*`
- OpenAI Responses: `gpt-5-*`, `gpt-*` (containing `codex` or `pro`)
- Anthropic: `claude-*`
- Gemini: `gemini-*`
- xAI: `grok-*`
- DeepSeek: `deepseek-*`
- Moonshot: `moonshot-*`
- Zai: `glm-*`
- Cohere: `command-*`, `embed-*`
- Mimo: `mimo-*`
- OpenCode Go: Namespace `opencode_go::` only
- Fireworks: Models containing `fireworks`
- Ollama: Fallback for any other names, defaulting to local Ollama.
Namespacing (Forcing an Adapter)
You can force a specific adapter by using the adapter_kind::model_name syntax. This is the recommended way for many providers and for disambiguating OpenAI-compatible services.
- `groq::openai/gpt-oss-20b` (Forces Groq adapter)
- `together::meta-llama/Llama-3-8b-chat-hf` (Forces Together adapter)
- `fireworks::glm-5p1` (for fireworks.ai)
- `ollama_cloud::gemma3:4b` (Forces Ollama Cloud adapter)
- `github_copilot::openai/gpt-5.4-mini` (Forces GitHub Copilot adapter)
- `nebius::Qwen/Qwen3-235B-A22B` (Forces Nebius adapter)
- `aliyun::qwen-plus` (Forces Aliyun adapter)
- `vertex::gemini-2.5-flash` (Forces Google Vertex adapter)
- `moonshot::moonshot-v1-8k` (Forces Moonshot adapter)
- `baidu::ernie-4.0` (Forces Baidu adapter)
- `zai_coding::glm-4.6` (Special namespace for Zai coding subscription)
- `opencode_go::minimax-m2.5` (Forces OpenCode Go adapter)
- `bedrock_api::anthropic.claude-v2` (Forces AWS Bedrock adapter)
- `open_router::google/gemini-2.0-flash-001` (Forces OpenRouter adapter)
For a complete list of AdapterKind, see the AdapterKind enum.
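To make the `adapter_kind::model_name` syntax concrete, here is a minimal sketch of how such a string could be split. The `split_namespace` function is hypothetical and not part of genai; it only shows that splitting on the first `::` keeps single colons in model names such as `gemma3:4b` intact.

```rust
/// Splits "adapter_kind::model_name" into (Some(namespace), model_name).
/// Names without a "::" separator are returned unchanged with no namespace.
fn split_namespace(model: &str) -> (Option<&str>, &str) {
    match model.split_once("::") {
        // Only treat it as a namespace when both sides are non-empty.
        Some((namespace, name)) if !namespace.is_empty() && !name.is_empty() => {
            (Some(namespace), name)
        }
        _ => (None, model),
    }
}
```

For example, `split_namespace("ollama_cloud::gemma3:4b")` separates the `ollama_cloud` namespace from the model name `gemma3:4b`, whereas plain `gemma3:4b` comes back with no namespace.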
Examples
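//! Base examples demonstrating the core capabilities of genai
use genai::chat::printer::{print_chat_stream, PrintChatStreamOptions};
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;
const MODEL_OPENAI: &str = "gpt-5.4-mini";
const MODEL_ANTHROPIC: &str = "claude-haiku-4-5";
const MODEL_FIREWORKS: &str = "fireworks::gpt-oss-20b";
const MODEL_TOGETHER: &str = "together::openai/gpt-oss-20b";
const MODEL_GEMINI: &str = "gemini-3-flash-preview";
const MODEL_GROQ: &str = "groq::openai/gpt-oss-20b";
const MODEL_OLLAMA: &str = "gemma4:e2b"; // sh: `ollama pull gemma4:e2b`
const MODEL_OLLAMA_CLOUD: &str = "ollama_cloud::gemma3:4b";
const MODEL_XAI: &str = "grok-3-mini";
const MODEL_DEEPSEEK: &str = "deepseek-chat";
const MODEL_ZAI: &str = "glm-4-plus";
const MODEL_ALIYUN: &str = "aliyun::qwen-plus"; // required namespace
// or any publisher: "github_copilot::anthropic/claude-sonnet-4-6", "github_copilot::google/gemini-2.5-pro", "github_copilot::xai/grok-3-mini"
const MODEL_GITHUB_COPILOT: &str = "github_copilot::openai/gpt-4.1-mini";
// NOTE: These are the default environment keys for each AI adapter type.
// They can be customized; see `examples/c02-auth.rs`.
const MODEL_AND_KEY_ENV_NAME_LIST: &[(&str, &str)] = &[
    (MODEL_OPENAI, "OPENAI_API_KEY"),
    (MODEL_ANTHROPIC, "ANTHROPIC_API_KEY"),
    (MODEL_GEMINI, "GEMINI_API_KEY"),
    (MODEL_GROQ, "GROQ_API_KEY"),
    (MODEL_XAI, "XAI_API_KEY"),
    (MODEL_DEEPSEEK, "DEEPSEEK_API_KEY"),
    (MODEL_OLLAMA, ""), // local Ollama needs no key
    // Pair the remaining MODEL_* constants with their provider's key env name.
];
// NOTE: Model to AdapterKind (AI provider) type mapping rule
// - starts_with "gpt" -> OpenAI (or OpenAI Responses for gpt-5/codex/pro)
// - starts_with "claude" -> Anthropic
// - starts_with "command" -> Cohere
// - starts_with "gemini" -> Gemini
// - model in Groq models -> Groq
// - starts_with "glm" -> ZAI
// - starts_with "ollama_cloud::" -> OllamaCloud
// - For anything else -> Ollama
//
// This can be customized; see `examples/c03-mapper.rs`
// Minimal flow; `examples/c00-readme.rs` has the complete version.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let question = "Why is the sky red?";
    let chat_req = ChatRequest::new(vec![ChatMessage::user(question)]);
    let client = Client::default();
    let print_options = PrintChatStreamOptions::from_print_events(false);
    for &(model, env_name) in MODEL_AND_KEY_ENV_NAME_LIST {
        // Skip providers whose API key is not set in the environment.
        if !env_name.is_empty() && std::env::var(env_name).is_err() {
            continue;
        }
        println!("\n===== MODEL: {model} =====");
        // Direct (non-streaming) chat
        let chat_res = client.exec_chat(model, chat_req.clone(), None).await?;
        println!("{}", chat_res.first_text().unwrap_or("NO ANSWER"));
        // Streaming chat
        let chat_stream = client.exec_chat_stream(model, chat_req.clone(), None).await?;
        print_chat_stream(chat_stream, Some(&print_options)).await?;
    }
    Ok(())
}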
More Examples
- examples/c00-readme.rs - Quick overview code with multiple providers and streaming.
- examples/c01-conv.rs - Shows how to build a conversation flow.
- examples/c02-auth.rs - Demonstrates how to provide a custom `AuthResolver` to supply auth data, such as `api_key`, per adapter kind.
- examples/c03-mapper.rs - Demonstrates how to provide a custom `AdapterKindResolver` to customize the "model name" to "adapter kind" mapping.
- examples/c04-chat-options.rs - Demonstrates how to set chat generation options such as `temperature` and `max_tokens` at the client level, for all requests, and at the per-request level.
- examples/c05-model-names.rs - Shows how to get model names per `AdapterKind`.
- examples/c06-target-resolver.rs - For custom auth, endpoint, and model.
- examples/c07-image.rs - Image analysis support
- genai live coding, code design, and best practices
  - Adding Gemini Structured Output (vid-0060)
  - Adding OpenAI Structured Output (vid-0059)
  - Splitting the JSON value extension trait into its own public crate, value-ext
  - (part 1/3) Module, Error, constructors/builders
  - (part 2/3) Extension Traits, Project Files, Versioning
  - (part 3/3) When to Async? Project Files, Versioning strategy
Library Focus:
- Focuses on standardizing chat completion APIs across major AI providers while preserving provider-specific strengths.
- Native implementation without per-service SDK dependencies.
  - Reason: genai uses each provider's native protocol when available, so features such as reasoning controls, thinking budgets, streaming metadata, and multimodal inputs can be represented more completely. When a provider primarily exposes an OpenAI-compatible API, genai uses that compatibility layer instead. Managing these protocol differences at the adapter layer is simpler and more scalable than dealing with multiple SDKs.
- Prioritizes ergonomics and commonality, while depth is secondary. (If you require a complete client API, consider using async-openai and ollama-rs; both are excellent and easy to use.)
- This library focuses on text chat, vision, and function calling APIs.
ChatOptions
- (1) - OpenAI-compatible notes
  - Models: OpenAI, DeepSeek, Groq, Ollama, xAI, Mimo, Together, Fireworks, Nebius, Zai, AIHubMix

| Property | OpenAI Compatibles (1) | Anthropic | Gemini `generationConfig.` | Cohere |
|---|---|---|---|---|
| `temperature` | `temperature` | `temperature` | `temperature` | `temperature` |
| `max_tokens` | `max_tokens` | `max_tokens` (default 1024) | `maxOutputTokens` | `max_tokens` |
| `top_p` | `top_p` | `top_p` | `topP` | `p` |
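The table above can be read as a per-provider renaming of three options. The sketch below encodes that renaming in plain Rust. It is an illustration only: `Provider`, `Options`, and `to_provider_fields` are hypothetical names, not genai types, and the real adapters build full request payloads (for Gemini, nested under `generationConfig`) rather than flat key/value pairs.

```rust
/// Hypothetical provider families, one per table column.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    OpenAICompat,
    Anthropic,
    Gemini,
    Cohere,
}

/// Hypothetical subset of chat options covered by the table.
#[derive(Debug, Default, Clone, Copy)]
struct Options {
    temperature: Option<f64>,
    max_tokens: Option<u32>,
    top_p: Option<f64>,
}

/// Returns (provider field name, value) pairs for the options that are set.
fn to_provider_fields(provider: Provider, opts: Options) -> Vec<(&'static str, String)> {
    // Field names taken from the table columns.
    let (max_key, top_p_key) = match provider {
        Provider::OpenAICompat | Provider::Anthropic => ("max_tokens", "top_p"),
        Provider::Gemini => ("maxOutputTokens", "topP"),
        Provider::Cohere => ("max_tokens", "p"),
    };
    // The Anthropic column lists a default of 1024 when max_tokens is not given.
    let max_tokens = match (provider, opts.max_tokens) {
        (_, Some(value)) => Some(value),
        (Provider::Anthropic, None) => Some(1024),
        _ => None,
    };
    let mut fields = Vec::new();
    if let Some(t) = opts.temperature {
        fields.push(("temperature", t.to_string()));
    }
    if let Some(m) = max_tokens {
        fields.push((max_key, m.to_string()));
    }
    if let Some(p) = opts.top_p {
        fields.push((top_p_key, p.to_string()));
    }
    fields
}
```

Keeping the renaming in one `match` makes the provider differences easy to audit against the table, which is the same motivation the README gives for handling protocol differences at the adapter layer.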
Usage
| Property | OpenAI Compatibles (1) | Anthropic `usage.` | Gemini `usageMetadata.` | Cohere `meta.tokens.` |
|---|---|---|---|---|
| `prompt_tokens` | `prompt_tokens` | `input_tokens` (added) | `promptTokenCount` (2) | `input_tokens` |
| `completion_tokens` | `completion_tokens` | `output_tokens` (added) | `candidatesTokenCount` (2) | `output_tokens` |
| `total_tokens` | `total_tokens` | (computed) | `totalTokenCount` (2) | (computed) |
| `prompt_tokens_details` | `prompt_tokens_details` | `cached/cache_creation` | N/A for now | N/A for now |
| `completion_tokens_details` | `completion_tokens_details` | N/A for now | N/A for now | N/A for now |
- (1) - OpenAI-compatible notes
  - Models: OpenAI, DeepSeek, Groq, Ollama, xAI, Mimo, AIHubMix
  - For Groq, the property is `x_groq.usage`.
  - At this point, Ollama does not emit input/output tokens when streaming due to a limitation in the Ollama OpenAI compatibility layer. (see ollama #4448 - Streaming Chat Completion via OpenAI API should support stream option to include Usage)
  - `prompt_tokens_details` and `completion_tokens_details` will have the value sent by the compatible provider, or `None`
- (2): Gemini tokens
  - Right now, with the Gemini Stream API, it’s not clear whether usage for each event is cumulative or must be summed. It appears to be cumulative, meaning the last message shows the total number of input, output, and total tokens, so that is the current assumption. See possible tweet answer for more info.
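Two points from the usage notes lend themselves to a short sketch: totals marked "(computed)" are the sum of input and output tokens, and under the cumulative assumption in note (2) the last streamed usage report is taken as the final one. The `Usage` struct and both functions below are hypothetical and only illustrate that logic; they are not genai's types.

```rust
/// Hypothetical normalized usage record using the common property names.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Usage {
    prompt_tokens: u32,
    completion_tokens: u32,
    total_tokens: u32,
}

/// For providers that report only input/output counts, the total is computed.
fn from_input_output(input_tokens: u32, output_tokens: u32) -> Usage {
    Usage {
        prompt_tokens: input_tokens,
        completion_tokens: output_tokens,
        total_tokens: input_tokens + output_tokens,
    }
}

/// Under the cumulative assumption, the last event that carries usage is the final usage.
/// Events without usage are represented as `None`.
fn final_stream_usage(events: &[Option<Usage>]) -> Option<Usage> {
    events.iter().rev().find_map(|usage| *usage)
}
```

If the per-event counts turned out to be increments rather than running totals, `final_stream_usage` would have to sum the events instead of taking the last one, which is exactly the ambiguity note (2) describes.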
Usage examples
- AIPack - Check out AIPack, which wraps this genai library into an agentic runtime to run, build, and share AI Agent Packs. See `pro@coder` for a simple example of how I use AIPack/genai for production coding.
- zcoder - I am also in the process of building zcoder, which will be a parallel-first coding harness.
Note: Feel free to send me a short description and a link to your application or library that uses genai. I'm happy to add it.
Links
- crates.io: crates.io/crates/genai
- GitHub: github.com/jeremychone/rust-genai