# LLM
Note: This crate name previously belonged to another project. The current implementation represents a new and different library. The previous crate is now archived and will not receive any updates. ref: https://github.com/rustformers/llm
LLM is a Rust library that lets you use multiple LLM backends in a single project: OpenAI, Anthropic (Claude), Ollama, DeepSeek, xAI, Phind, Groq, Google and ElevenLabs. With a unified API and builder style (similar to the Stripe experience) you can easily create chat, text-completion, or speech-to-text requests without multiplying structures and crates.
## Key Features
- Multi-backend: Manage OpenAI, Anthropic, Ollama, DeepSeek, xAI, Phind, Groq and Google through a single entry point.
- Multi-step chains: Create multi-step chains with different backends at each step.
- Templates: Use templates to create complex prompts with variables.
- Builder pattern: Configure your LLM (model, temperature, max_tokens, timeouts...) with a few simple calls.
- Chat & Completions: Two unified traits (`ChatProvider` and `CompletionProvider`) to cover most use cases.
- Extensible: Easily add new backends.
- Rust-friendly: Designed with clear traits, unified error handling, and conditional compilation via features.
- Validation: Add validation to your requests to ensure the output is what you expect.
- Evaluation: Add evaluation to your requests to score the output of LLMs.
- Parallel Evaluation: Evaluate multiple LLM providers in parallel and select the best response based on scoring functions.
- Function calling: Add function calling to your requests to use tools in your LLMs.
- REST API: Serve any LLM backend as a REST API with openai standard format.
- Vision: Add vision to your requests to use images in your LLMs.
- Reasoning: Add reasoning to your requests to use reasoning in your LLMs.
- Structured Output: Request structured output from certain LLM providers based on a provided JSON schema.
- Speech to text: Transcribe audio to text
- Text to speech: Convert text to audio
- Memory: Store and retrieve conversation history with a sliding window (other strategies soon) and shared-memory support
- Agentic: Build reactive agents that can cooperate via shared memory, with configurable triggers, roles and validation.
## Use any LLM backend on your project
Simply add LLM to your `Cargo.toml`:

```toml
[dependencies]
llm = { version = "1.2.4", features = ["openai", "anthropic", "ollama", "deepseek", "xai", "phind", "google", "groq", "elevenlabs"] }
```
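With the crate added, a chat request goes through the builder pattern described above. The sketch below assumes the `LLMBuilder`, `LLMBackend` and `ChatMessage` types and a tokio async runtime; exact module paths and method names may differ between versions, so check the examples directory for the authoritative version:

```rust
use llm::{
    builder::{LLMBackend, LLMBuilder},
    chat::ChatMessage,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build an OpenAI-backed client; any other enabled backend
    // is configured the same way by swapping the LLMBackend variant.
    let llm = LLMBuilder::new()
        .backend(LLMBackend::OpenAI)
        .api_key(std::env::var("OPENAI_API_KEY")?)
        .model("gpt-4o")
        .temperature(0.7)
        .max_tokens(512)
        .build()?;

    // Send a single user message and print the reply.
    let messages = vec![ChatMessage::user().content("Hello, world!").build()];
    let response = llm.chat(&messages).await?;
    println!("{response}");
    Ok(())
}
```

Because every backend implements the same `ChatProvider` trait, switching providers only changes the builder configuration, not the call sites.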
## Use any LLM on the CLI
LLM includes a command-line tool for easily interacting with different LLM models. You can install it with:

```shell
cargo install llm
```

- Use `llm` to start an interactive chat session
- Use `llm openai:gpt-4o` to start an interactive chat session with a specific provider:model
- Use `llm set OPENAI_API_KEY your_key` to configure your API key
- Use `llm default openai:gpt-4` to set a default provider
- Use `echo "Hello World" | llm` to pipe input
- Use `llm --provider openai --model gpt-4 --temperature 0.7` for advanced options
## Serving any LLM backend as a REST API

- Use the standard messages format
- Use step chains to chain multiple LLM backends together
- Expose the chain through a REST API in OpenAI-standard format

```toml
[dependencies]
llm = { version = "1.2.4", features = ["openai", "anthropic", "ollama", "deepseek", "xai", "phind", "google", "groq", "api", "elevenlabs"] }
```

More details in the api_example.
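Since the server speaks the OpenAI-standard chat format, any OpenAI-compatible client can call it. A minimal request body looks like the following (the `model` value is illustrative; actual routing depends on how you configure the chain, see api_example):

```json
{
  "model": "gpt-4o",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ]
}
```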
## More examples
| Name | Description |
|---|---|
| anthropic_example | Demonstrates integration with Anthropic's Claude model for chat completion |
| anthropic_streaming_example | Anthropic streaming chat example demonstrating real-time token generation |
| chain_example | Shows how to create multi-step prompt chains for exploring programming language features |
| deepseek_example | Basic DeepSeek chat completion example with deepseek-chat models |
| embedding_example | Basic embedding example with OpenAI's API |
| multi_backend_example | Illustrates chaining multiple LLM backends (OpenAI, Anthropic, DeepSeek) together in a single workflow |
| ollama_example | Example of using local LLMs through Ollama integration |
| openai_example | Basic OpenAI chat completion example with GPT models |
| openai_streaming_example | OpenAI streaming chat example demonstrating real-time token generation |
| phind_example | Basic Phind chat completion example with the Phind-70B model |
| validator_example | Basic validator example with Anthropic's Claude model |
| xai_example | Basic xAI chat completion example with Grok models |
| xai_streaming_example | xAI streaming chat example demonstrating real-time token generation |
| evaluation_example | Basic evaluation example with Anthropic, Phind and DeepSeek |
| evaluator_parallel_example | Evaluate multiple LLM providers in parallel |
| google_example | Basic Google Gemini chat completion example with Gemini models |
| google_streaming_example | Google streaming chat example demonstrating real-time token generation |
| google_pdf | Google Gemini chat with PDF attachment |
| google_image | Google Gemini chat with image attachment |
| google_embedding_example | Basic Google Gemini embedding example with Gemini models |
| tool_calling_example | Basic tool calling example with OpenAI |
| google_tool_calling_example | Google Gemini function calling example with complex JSON schema for meeting scheduling |
| json_schema_nested_example | Advanced example demonstrating deeply nested JSON schemas with arrays of objects and complex data structures |
| tool_json_schema_cycle_example | Complete tool calling cycle with JSON schema validation and structured responses |
| unified_tool_calling_example | Unified tool calling with a selectable provider; demonstrates multi-turn tool use and tool choice |
| deepclaude_pipeline_example | Basic DeepClaude pipeline example with DeepSeek and Claude |
| api_example | Basic API (OpenAI-standard format) example with OpenAI, Anthropic, DeepSeek and Groq |
| api_deepclaude_example | Basic API (OpenAI-standard format) example with DeepSeek and Claude |
| anthropic_vision_example | Basic vision example with Anthropic |
| openai_vision_example | Basic vision example with OpenAI |
| openai_reasoning_example | Basic reasoning example with OpenAI |
| anthropic_thinking_example | Anthropic reasoning example |
| elevenlabs_stt_example | Speech-to-text transcription example using ElevenLabs |
| elevenlabs_tts_example | Text-to-speech example using ElevenLabs |
| openai_stt_example | Speech-to-text transcription example using OpenAI |
| openai_tts_example | Text-to-speech example using OpenAI |
| tts_rodio_example | Text-to-speech with rodio playback using OpenAI |
| chain_audio_text_example | Multi-step chain combining speech-to-text and text processing |
| xai_search_chain_tts_example | Multi-step chain combining xAI search, OpenAI summarization, and ElevenLabs text-to-speech with rodio playback |
| xai_search_example | xAI search functionality with search modes, date ranges, and source filtering |
| memory_example | Automatic memory integration; the LLM remembers conversation context across calls |
| memory_share_example | Shared memory between multiple LLM providers |
| trim_strategy_example | Memory trimming strategies with automatic summarization |
| agent_builder_example | Reactive agents cooperating via shared memory, demonstrating creation of LLM agents with roles and conditions |
| openai_web_search_example | OpenAI web search functionality with location-based search context |
## Usage
Here's a basic example using OpenAI for chat completion. See the examples directory for other backends (Anthropic, Ollama, DeepSeek, xAI, Google, Phind, ElevenLabs), embedding capabilities, and more advanced use cases.