Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
adk-model
LLM model integrations for Rust Agent Development Kit (ADK-Rust) with Gemini, OpenAI, xAI, Anthropic, DeepSeek, Groq, Ollama, Fireworks AI, Together AI, Mistral AI, Perplexity, Cerebras, SambaNova, Amazon Bedrock, and Azure AI Inference.
Overview
adk-model provides LLM integrations for the Rust Agent Development Kit (ADK-Rust). Supports all major providers:
- Gemini - Google's Gemini models (3 Pro, 3 Flash, 2.5 Pro, 2.5 Flash, etc.)
- OpenAI - GPT-5.1, GPT-5, GPT-5 Mini, GPT-4o (legacy)
- xAI - Grok models through the OpenAI-compatible API
- Anthropic - Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, Claude 4
- DeepSeek - DeepSeek R1, DeepSeek V3.1, DeepSeek-Chat with thinking mode
- Groq - Ultra-fast inference (LLaMA 3.3, Mixtral, Gemma)
- Ollama - Local LLMs (LLaMA, Mistral, Qwen, Gemma, etc.)
- Fireworks AI - Fast open-model inference (Llama, Mixtral, etc.)
- Together AI - Hosted open models (Llama, CodeLlama, etc.)
- Mistral AI - Mistral cloud models (Mistral Small, Large, etc.)
- Perplexity - Search-augmented LLM (Sonar, etc.)
- Cerebras - Ultra-fast inference (Llama 3.3, etc.)
- SambaNova - Fast inference (Llama 3.3, etc.)
- Amazon Bedrock - AWS-hosted models via IAM auth (Claude, Llama, Mistral, etc.)
- Azure AI Inference - Azure-hosted models (Cohere, Llama, Mistral, etc.)
- Streaming - Real-time response streaming for all providers
- Multimodal - Text, images, audio, video, and PDF input
The crate implements the Llm trait from adk-core, allowing models to be used interchangeably.
Installation
[]
= "0.3.2"
Or use the meta-crate:
[]
= { = "0.3.2", = ["models"] }
Quick Start
Gemini (Google)
use GeminiModel;
use LlmAgentBuilder;
use Arc;
async
OpenAI
use ;
use LlmAgentBuilder;
use Arc;
async
Anthropic (Claude)
use ;
use LlmAgentBuilder;
use Arc;
async
Anthropic Advanced Features
use ;
// Extended thinking with token budget
let config = new
.with_thinking
.with_prompt_caching
.with_beta_feature;
let client = new?;
// Token counting
let count = client.count_tokens.await?;
// Model discovery
let models = client.list_models.await?;
let info = client.get_model.await?;
// Rate limit inspection
let rate_info = client.latest_rate_limit_info.await;
DeepSeek
use ;
use LlmAgentBuilder;
use Arc;
async
Groq (Ultra-Fast)
use ;
use LlmAgentBuilder;
use Arc;
async
Ollama (Local)
use ;
use LlmAgentBuilder;
use Arc;
async
Fireworks AI
use ;
use LlmAgentBuilder;
use Arc;
async
Together AI
use ;
use LlmAgentBuilder;
use Arc;
async
Mistral AI
use ;
use LlmAgentBuilder;
use Arc;
async
Perplexity
use ;
use LlmAgentBuilder;
use Arc;
async
Cerebras
use ;
use LlmAgentBuilder;
use Arc;
async
SambaNova
use ;
use LlmAgentBuilder;
use Arc;
async
Amazon Bedrock
use ;
use LlmAgentBuilder;
use Arc;
async
Azure AI Inference
use ;
use LlmAgentBuilder;
use Arc;
async
Supported Models
Google Gemini
| Model | Description |
|---|---|
gemini-3.1-pro |
Most intelligent AI model, enhancing reasoning and multimodal capabilities. (1M context) |
gemini-3-pro |
Intelligent model for complex agentic workflows (1M context) |
gemini-3-flash |
Fast and efficient for most tasks (1M context) |
gemini-2.5-pro |
Advanced reasoning and multimodal understanding |
gemini-2.5-flash |
Balanced speed and capability (recommended) |
gemini-2.5-flash-lite |
Ultra-fast for high-volume tasks |
gemini-2.0-flash |
Previous generation (retiring March 2026) |
See Gemini models documentation for the full list.
OpenAI
| Model | Description |
|---|---|
gpt-5.1 |
Latest iteration with improved performance (256K context) |
gpt-5 |
State-of-the-art unified model with adaptive thinking |
gpt-5-mini |
Efficient version for most tasks (128K context) |
gpt-4o |
Multimodal model (deprecated August 2025) |
gpt-4o-mini |
Fast and affordable (deprecated August 2025) |
See OpenAI models documentation for the full list.
Anthropic Claude
| Model | Description |
|---|---|
claude-opus-4-5-20251101 |
Most capable model for complex autonomous tasks (200K context) |
claude-sonnet-4-5-20250929 |
Balanced intelligence and cost for production (1M context) |
claude-haiku-4-5-20251001 |
Ultra-efficient for high-volume workloads |
claude-opus-4-20250514 |
Hybrid model with extended thinking |
claude-sonnet-4-20250514 |
Balanced model with extended thinking |
See Anthropic models documentation for the full list.
DeepSeek
| Model | Description |
|---|---|
deepseek-r1-0528 |
Latest reasoning model with enhanced thinking depth (128K context) |
deepseek-r1 |
Advanced reasoning comparable to o1 |
deepseek-v3.1 |
Latest 671B MoE model for general tasks |
deepseek-chat |
671B MoE model, excellent for code (V3) |
deepseek-vl2 |
Vision-language model (32K context) |
Features:
- Thinking Mode - Chain-of-thought reasoning with
<thinking>tags - Context Caching - Automatic KV cache for repeated prefixes (10x cost reduction)
- Tool Calling - Full function calling support
See DeepSeek API documentation for the full list.
Groq
| Model | Description |
|---|---|
llama-4-scout |
Llama 4 Scout (17Bx16E) - Fast via Groq LPU (128K context) |
llama-3.2-90b-text-preview |
Large text model |
llama-3.2-11b-text-preview |
Balanced text model |
llama-3.1-70b-versatile |
Versatile large model |
llama-3.1-8b-instant |
Ultra-fast instruction model |
mixtral-8x7b-32768 |
MoE model with 32K context |
Features:
- Ultra-Fast - LPU-based inference (fastest in the industry)
- Tool Calling - Full function calling support
- Large Context - Up to 128K tokens
See Groq documentation for the full list.
Ollama (Local)
| Model | Description |
|---|---|
llama3.3:70b |
Llama 3.3 70B - Latest for local deployment (128K context) |
llama3.2:3b |
Efficient small model |
llama3.1:8b |
Popular balanced model |
deepseek-r1:14b |
Distilled reasoning model |
deepseek-r1:32b |
Larger distilled reasoning model |
qwen3:14b |
Strong multilingual and coding |
qwen2.5:7b |
Efficient multilingual model (recommended for tool calling) |
mistral:7b |
Fast and capable |
mistral-nemo:12b |
Enhanced Mistral variant (128K context) |
gemma3:9b |
Google's efficient open model |
devstral:24b |
Optimized for coding tasks |
codellama:13b |
Code-focused Llama variant |
Features:
- Local Inference - No API key required
- Privacy - Data stays on your machine
- Tool Calling - Full function calling support (uses non-streaming for reliability)
- MCP Integration - Connect to MCP servers for external tools
See Ollama library for all available models.
New Providers
| Provider | Feature Flag | Default Model | API Key Env Var |
|---|---|---|---|
| Fireworks AI | fireworks |
accounts/fireworks/models/llama-v3p1-8b-instruct |
FIREWORKS_API_KEY |
| Together AI | together |
meta-llama/Llama-3.3-70B-Instruct-Turbo |
TOGETHER_API_KEY |
| Mistral AI | mistral |
mistral-small-latest |
MISTRAL_API_KEY |
| Perplexity | perplexity |
sonar |
PERPLEXITY_API_KEY |
| Cerebras | cerebras |
llama-3.3-70b |
CEREBRAS_API_KEY |
| SambaNova | sambanova |
Meta-Llama-3.3-70B-Instruct |
SAMBANOVA_API_KEY |
| Amazon Bedrock | bedrock |
anthropic.claude-sonnet-4-20250514-v1:0 |
AWS IAM credentials |
| Azure AI Inference | azure-ai |
(endpoint-specific) | AZURE_AI_API_KEY |
Features
- Streaming - Real-time response streaming for all providers
- Tool Calling - Function calling support across all providers
- Async - Full async/await support with backpressure
- Retry - Automatic retry with exponential backoff
- Generation Config - Temperature, top_p, top_k, max_tokens
Environment Variables
# Google Gemini
GOOGLE_API_KEY=your-google-api-key
# OpenAI
OPENAI_API_KEY=your-openai-api-key
# xAI
XAI_API_KEY=your-xai-api-key
# Anthropic
ANTHROPIC_API_KEY=your-anthropic-api-key
# DeepSeek
DEEPSEEK_API_KEY=your-deepseek-api-key
# Groq
GROQ_API_KEY=your-groq-api-key
# Fireworks AI
FIREWORKS_API_KEY=your-fireworks-api-key
# Together AI
TOGETHER_API_KEY=your-together-api-key
# Mistral AI
MISTRAL_API_KEY=your-mistral-api-key
# Perplexity
PERPLEXITY_API_KEY=your-perplexity-api-key
# Cerebras
CEREBRAS_API_KEY=your-cerebras-api-key
# SambaNova
SAMBANOVA_API_KEY=your-sambanova-api-key
# Azure AI Inference
AZURE_AI_API_KEY=your-azure-ai-api-key
# Amazon Bedrock (uses AWS IAM credentials)
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1
# Ollama (no key needed, just start the server)
# ollama serve
Feature Flags
Enable specific providers with feature flags:
[]
# All providers (default)
= { = "0.3.2", = ["all-providers"] }
# Individual providers
= { = "0.3.2", = ["gemini"] }
= { = "0.3.2", = ["openai"] }
= { = "0.3.2", = ["xai"] }
= { = "0.3.2", = ["anthropic"] }
= { = "0.3.2", = ["deepseek"] }
= { = "0.3.2", = ["groq"] }
= { = "0.3.2", = ["ollama"] }
= { = "0.3.2", = ["fireworks"] }
= { = "0.3.2", = ["together"] }
= { = "0.3.2", = ["mistral"] }
= { = "0.3.2", = ["perplexity"] }
= { = "0.3.2", = ["cerebras"] }
= { = "0.3.2", = ["sambanova"] }
= { = "0.3.2", = ["bedrock"] }
= { = "0.3.2", = ["azure-ai"] }
Related Crates
- adk-rust - Meta-crate with all components
- adk-core - Core
Llmtrait - adk-agent - Agent implementations
License
Apache-2.0
Part of ADK-Rust
This crate is part of the ADK-Rust framework for building AI agents in Rust.