# genai - Multi-AI Providers Library for Rust
Currently supports natively: DeepSeek (deepseek.com & Groq), OpenAI, Anthropic, Groq, Ollama, Gemini, Cohere (more to come)
```toml
# cargo.toml
genai = "0.1.21"
```
Provides a common and ergonomic single API to many generative AI providers, such as Anthropic, OpenAI, Gemini, xAI, Ollama, Groq, and more.
Check out devai.run, the Iterate to Automate command-line application that leverages genai for multi-AI capabilities.
## Key Features
- DeepSeekR1 support, with `reasoning_content` (and stream support) + DeepSeek Groq and Ollama support (and `reasoning_content` normalization)
- Native Multi-AI Provider/Model: OpenAI, Anthropic, Gemini, Ollama, Groq, xAI, DeepSeek (direct chat and stream) (see examples/c00-readme.rs)
- Image Analysis (for OpenAI, Gemini flash-2, Anthropic) (see examples/c07-image.rs)
- Custom Auth/API Key (see examples/c02-auth.rs)
- Model Alias (see examples/c05-model-names.rs)
- Custom Endpoint, Auth, and Model Identifier (see examples/c06-target-resolver.rs)
Examples | Thanks | Library Focus | Changelog | Provider Mapping: ChatOptions | MetaUsage
## Examples
```rust
//! Base examples demonstrating the core capabilities of genai

use genai::chat::printer::print_chat_stream;
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

const MODEL_OPENAI: &str = "gpt-4o-mini"; // o1-mini, gpt-4o-mini
const MODEL_ANTHROPIC: &str = "claude-3-haiku-20240307";
const MODEL_COHERE: &str = "command-light";
const MODEL_GEMINI: &str = "gemini-1.5-flash-latest";
const MODEL_GROQ: &str = "llama3-8b-8192";
const MODEL_OLLAMA: &str = "gemma:2b"; // sh: `ollama pull gemma:2b`
const MODEL_XAI: &str = "grok-beta";
const MODEL_DEEPSEEK: &str = "deepseek-chat";

// NOTE: These are the default environment keys for each AI Adapter Type.
//       They can be customized; see `examples/c02-auth.rs`
const MODEL_AND_KEY_ENV_NAME_LIST: &[(&str, &str)] = &[
    (MODEL_OPENAI, "OPENAI_API_KEY"),
    (MODEL_ANTHROPIC, "ANTHROPIC_API_KEY"),
    (MODEL_COHERE, "COHERE_API_KEY"),
    (MODEL_GEMINI, "GEMINI_API_KEY"),
    (MODEL_GROQ, "GROQ_API_KEY"),
    (MODEL_OLLAMA, ""), // local Ollama needs no API key
    (MODEL_XAI, "XAI_API_KEY"),
    (MODEL_DEEPSEEK, "DEEPSEEK_API_KEY"),
];

// NOTE: Model to AdapterKind (AI Provider) type mapping rule:
//       starts_with "gpt" -> OpenAI, "claude" -> Anthropic, "command" -> Cohere,
//       "gemini" -> Gemini, model in Groq models -> Groq, anything else -> Ollama.
//       This can be customized; see `examples/c03-mapper.rs`
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let chat_req = ChatRequest::new(vec![
        ChatMessage::system("Answer in one sentence"),
        ChatMessage::user("Why is the sky red?"),
    ]);
    let client = Client::default();

    for (model, env_name) in MODEL_AND_KEY_ENV_NAME_LIST {
        if !env_name.is_empty() && std::env::var(env_name).is_err() {
            continue; // skip when the API key env var is not set
        }
        println!("\n===== MODEL: {model} =====");
        // Direct chat, then the same request as a stream
        let chat_res = client.exec_chat(model, chat_req.clone(), None).await?;
        println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));
        let stream_res = client.exec_chat_stream(model, chat_req.clone(), None).await?;
        print_chat_stream(stream_res, None).await?;
    }
    Ok(())
}
```
## More Examples
- examples/c00-readme.rs - Quick overview code with multiple providers and streaming.
- examples/c01-conv.rs - Shows how to build a conversation flow (see the sketch after this list).
- examples/c02-auth.rs - Demonstrates how to provide a custom `AuthResolver` to provide auth data (i.e., for api_key) per adapter kind.
- examples/c03-mapper.rs - Demonstrates how to provide a custom `AdapterKindResolver` to customize the "model name" to "adapter kind" mapping.
- examples/c04-chat-options.rs - Demonstrates how to set chat generation options such as `temperature` and `max_tokens` at the client level (for all requests) and at the per-request level.
- examples/c05-model-names.rs - Shows how to get model names per AdapterKind.
- examples/c06-target-resolver.rs - For custom Auth, Endpoint, and Model.
- examples/c07-image.rs - Image Analysis support.
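As a quick taste of the conversation-flow pattern, here is a minimal sketch: the assistant's answer is appended back onto the request before asking the follow-up question. The `append_message` chaining shown here is based on the crate's `ChatRequest` builder; see examples/c01-conv.rs for the exact API.

```rust
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let model = "gpt-4o-mini";

    // First turn: system prompt + user question
    let mut chat_req = ChatRequest::new(vec![
        ChatMessage::system("Answer in one short sentence"),
        ChatMessage::user("Name a popular systems programming language."),
    ]);

    let chat_res = client.exec_chat(model, chat_req.clone(), None).await?;
    let answer = chat_res.content_text_as_str().unwrap_or("NO ANSWER").to_string();
    println!("First answer: {answer}");

    // Second turn: append the assistant answer, then the follow-up question
    chat_req = chat_req
        .append_message(ChatMessage::assistant(answer))
        .append_message(ChatMessage::user("When was it first released?"));

    let chat_res = client.exec_chat(model, chat_req, None).await?;
    println!("Follow-up: {}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));
    Ok(())
}
```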
genai live coding, code design, & best practices:
- Adding Gemini Structured Output (vid-0060)
- Adding OpenAI Structured Output (vid-0059)
- Splitting the json value extension trait to its own public crate value-ext value-ext
- (part 1/3) Module, Error, constructors/builders
- (part 2/3) Extension Traits, Project Files, Versioning
- (part 3/3) When to Async? Project Files, Versioning strategy
## Thanks
- Thanks to @AdamStrojek for initial image support PR #36
- Thanks to @semtexzv for `stop_sequences` Anthropic support PR #34
- Thanks to @omarshehab221 for de/serialize on structs PR #19
- Thanks to @tusharmath for the `webc::Error` PR #12
- Thanks to @giangndm for making the stream `Send` PR #10
- Thanks to @stargazing-dino for PR #2 - implement Groq completions
## Library Focus:

- Focuses on standardizing chat completion APIs across major AI services.
- Native implementation, meaning no per-service SDKs.
  - Reason: While there are some variations between all of the various APIs, they all follow the same pattern and high-level flow and constructs. Managing the differences at a lower layer is actually simpler and more cumulative across services than doing SDK gymnastics.
- Prioritizes ergonomics and commonality, with depth being secondary. (If you require a complete client API, consider using async-openai and ollama-rs; they are both excellent and easy to use.)
- Initially, this library will mostly focus on the text chat API (not images or function calling in this first stage).
- The `0.1.x` versions will work, but the APIs will change in patch versions, not following semver strictly.
- Version `0.2.x` will follow semver more strictly.
## ChatOptions

| Property | OpenAI Compatibles (*1) | Anthropic | Gemini `generationConfig.` | Cohere |
|---|---|---|---|---|
| `temperature` | `temperature` | `temperature` | `temperature` | `temperature` |
| `max_tokens` | `max_tokens` | `max_tokens` (default 1024) | `maxOutputTokens` | `max_tokens` |
| `top_p` | `top_p` | `top_p` | `topP` | `p` |

- (*1) OpenAI compatibles notes
  - Models: OpenAI, DeepSeek, Groq, Ollama, xAI
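As a rough illustration of how these options are applied at both levels, here is a minimal sketch. The builder-style names (`with_chat_options`, `with_temperature`, `with_max_tokens`) are assumptions based on the crate's examples; check examples/c04-chat-options.rs for the exact API.

```rust
use genai::chat::{ChatMessage, ChatOptions, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Client-level defaults, applied to all requests
    // (builder method names are assumptions; see examples/c04-chat-options.rs)
    let client = Client::builder()
        .with_chat_options(ChatOptions::default().with_temperature(0.2))
        .build();

    let chat_req = ChatRequest::new(vec![
        ChatMessage::system("Answer concisely"),
        ChatMessage::user("Why is the sky blue?"),
    ]);

    // Per-request options take precedence over the client-level defaults
    let options = ChatOptions::default().with_temperature(0.9).with_max_tokens(256);

    let chat_res = client.exec_chat("gpt-4o-mini", chat_req, Some(&options)).await?;
    println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));
    Ok(())
}
```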
## MetaUsage

| Property | OpenAI Compatibles (*1) | Anthropic `usage.` | Gemini `usageMetadata.` | Cohere `meta.tokens.` |
|---|---|---|---|---|
| `prompt_tokens` | `prompt_tokens` | `input_tokens` (added) | `promptTokenCount` (*2) | `input_tokens` |
| `completion_tokens` | `completion_tokens` | `output_tokens` (added) | `candidatesTokenCount` (*2) | `output_tokens` |
| `total_tokens` | `total_tokens` | (computed) | `totalTokenCount` (*2) | (computed) |
| `prompt_tokens_details` | `prompt_tokens_details` | N/A for now | N/A for now | N/A for now |
| `completion_tokens_details` | `completion_tokens_details` | N/A for now | N/A for now | N/A for now |
- (*1) OpenAI compatibles notes
  - Models: OpenAI, DeepSeek, Groq, Ollama, xAI
  - For Groq, usage is reported under the `x_groq.usage.` property.
  - At this point, Ollama does not emit input/output tokens when streaming, due to a limitation of the Ollama OpenAI compatibility layer (see ollama #4448 - Streaming Chat Completion via OpenAI API should support stream option to include Usage).
  - `prompt_tokens_details` and `completion_tokens_details` will have the value sent by the compatible provider (or None).
- (*2) Gemini tokens
  - Right now, with the Gemini Stream API, it is not entirely clear whether the usage reported on each event is cumulative or needs to be summed. Currently, it appears to be cumulative (i.e., the last message carries the totals for input, output, and total tokens), so that is the assumption. See possible tweet answer for more info.
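To read these normalized values from code, a minimal sketch, assuming the response exposes the usage struct as `chat_res.usage` with the field names from the Property column above (all `Option`s, since some providers or modes do not report them):

```rust
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let chat_req = ChatRequest::new(vec![ChatMessage::user("Say hello")]);

    let chat_res = client.exec_chat("gpt-4o-mini", chat_req, None).await?;

    // The normalized usage, regardless of which provider served the request
    let usage = &chat_res.usage;
    println!(
        "prompt: {:?}, completion: {:?}, total: {:?}",
        usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
    );
    Ok(())
}
```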
## Notes on Possible Direction
- Will add more data on ChatResponse and ChatStream, especially metadata about usage.
- Add vision/image support to chat messages and responses.
- Add function calling support to chat messages and responses.
- Add `embed` and `embed_batch`
- Add the AWS Bedrock variants (e.g., Mistral and Anthropic). Most of the work will be on the "interesting" token signature scheme (without having to pull in big SDKs; this might be behind a feature flag).
- Add the Google VertexAI variants.
- (might) add the Azure OpenAI variant (not sure yet).
## Links
- crates.io: crates.io/crates/genai
- GitHub: github.com/jeremychone/rust-genai
- Sponsored by BriteSnow (Jeremy Chone's consulting company)