LLMKit
Unified LLM API Client for Rust, Python, and Node.js
One interface for 100+ LLM providers. Rust core with bindings for every language.
```
                 ┌──────────────┐
                 │  Rust Core   │
                 └──────┬───────┘
    ┌─────────┬────────┼─────────┬─────────┐
    ▼         ▼        ▼         ▼         ▼
┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
│Python │ │ Node  │ │ WASM  │ │  Go   │ │  ...  │
│ ✅    │ │ ✅    │ │ Soon  │ │ Soon  │ │       │
└───────┘ └───────┘ └───────┘ └───────┘ └───────┘
```
📖 Documentation | 📝 Changelog | 🤝 Contributing | 🔒 Security
Quick Start
Rust
```rust
use llmkit::{CompletionRequest, LLMKitClient, Message};

let client = LLMKitClient::from_env()?;
let response = client
    .complete(CompletionRequest::new("groq/llama-3.3-70b-versatile", vec![Message::user("Hello!")]))
    .await?;
println!("{}", response.text_content());
```
Python
```python
from llmkit import LLMKitClient, CompletionRequest, Message

client = LLMKitClient.from_env()
response = client.complete(CompletionRequest("groq/llama-3.3-70b-versatile", [Message.user("Hello!")]))
print(response.text_content())
```
Node.js
```js
import { JsLLMKitClient as LLMKitClient, JsMessage as Message, JsCompletionRequest as CompletionRequest } from 'llmkit'

const client = LLMKitClient.fromEnv()
const response = await client.complete(CompletionRequest.create('groq/llama-3.3-70b-versatile', [Message.user('Hello!')]))
console.log(response.textContent())
```
Why LLMKit?
- 🌍 100+ Providers - OpenAI, Anthropic, Google, AWS Bedrock, Azure, Groq, Mistral, and more
- 🔄 Unified API - Same interface for all providers with the `provider/model` format
- ⚡ Streaming - First-class async streaming support
- 🛠️ Tool Calling - Abstract tool definitions with builder pattern
- 🧠 Extended Thinking - Reasoning mode across 4 providers (OpenAI, Anthropic, Google, DeepSeek)
- 🦀 Pure Rust - Memory-safe, high performance core
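The `provider/model` convention above is simple to see in isolation. The helper below is a hypothetical sketch of the split, not llmkit's internal API; note that a model name may itself contain slashes (e.g. on OpenRouter), so only the first one separates provider from model:

```python
def parse_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' identifier into its two parts."""
    # partition() splits on the FIRST '/', so slashes inside the
    # model name (e.g. OpenRouter-style ids) are preserved.
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model

print(parse_model_id("groq/llama-3.3-70b-versatile"))
# ('groq', 'llama-3.3-70b-versatile')
```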
Features
| Chat | Media | Specialized |
|---|---|---|
| Streaming | Image Generation | Embeddings |
| Tool Calling | Vision/Images | Token Counting |
| Structured Output | Audio STT/TTS | Batch Processing |
| Extended Thinking | Video Generation | Model Registry |
| Prompt Caching | | 11,000+ Models |
Installation
Rust
```toml
[dependencies]
llmkit = { version = "0.1", features = ["anthropic", "openai"] }
```
Python

```sh
pip install llmkit
```

Node.js

```sh
npm install llmkit
```
Providers
| Category | Providers |
|---|---|
| Core | Anthropic, OpenAI, Azure OpenAI |
| Cloud | AWS Bedrock, Google Vertex AI, Google AI |
| Fast Inference | Groq, Mistral, Cerebras, SambaNova, Fireworks, DeepSeek |
| Enterprise | Cohere, AI21 |
| Hosted | Together, Perplexity, DeepInfra, OpenRouter |
| Local | Ollama, LM Studio, vLLM |
| Audio | Deepgram, ElevenLabs |
| Video | Runware |
See PROVIDERS.md for full list with environment variables.
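Each provider is configured through environment variables, per PROVIDERS.md. A minimal sketch of `from_env()`-style detection is below; `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` are those providers' standard variable names, and the rest of the mapping is an assumption (the authoritative list is in PROVIDERS.md):

```python
import os

# Illustrative mapping, not the full llmkit list.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
}

def detect_providers(environ=os.environ) -> list[str]:
    """Return the providers whose API key is set in the environment."""
    return [p for p, var in PROVIDER_ENV_VARS.items() if environ.get(var)]

print(detect_providers({"OPENAI_API_KEY": "sk-..."}))  # ['openai']
```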
Examples
Streaming
```rust
let mut stream = client.complete_stream(request).await?;
while let Some(chunk) = stream.next().await {
    print!("{}", chunk?.text_content());
}
```
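The consumption pattern is the same for any async token stream, independent of llmkit. Here is a self-contained Python sketch in which `fake_stream` stands in for a real `complete_stream(...)` call:

```python
import asyncio

async def fake_stream(text: str, chunk_size: int = 8):
    """Stand-in for a network stream: yields the text in small deltas."""
    for i in range(0, len(text), chunk_size):
        await asyncio.sleep(0)          # yield control, as a socket read would
        yield text[i : i + chunk_size]  # one delta chunk

async def main() -> str:
    parts = []
    async for chunk in fake_stream("Hello from a streamed completion."):
        parts.append(chunk)  # a real app would print(chunk, end="") here
    return "".join(parts)

print(asyncio.run(main()))
```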
Tool Calling
```python
# Illustrative reconstruction; llmkit's exact builder API may differ.
tool = Tool.builder("get_weather").description("Current weather for a city").param("city", "string").build()
response = client.complete(CompletionRequest("groq/llama-3.3-70b-versatile", messages, tools=[tool]))
```
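llmkit's own builder surface isn't shown in this README, but the builder pattern itself can be sketched self-contained. Every name below is illustrative, not llmkit API; the output shape follows the widely used OpenAI-style function-tool JSON:

```python
class ToolBuilder:
    """Minimal builder that emits an OpenAI-style function-tool definition."""

    def __init__(self, name: str, description: str = ""):
        self._name = name
        self._description = description
        self._properties: dict = {}
        self._required: list[str] = []

    def param(self, name: str, type_: str, description: str = "", required: bool = True):
        self._properties[name] = {"type": type_, "description": description}
        if required:
            self._required.append(name)
        return self  # return self so calls can be chained

    def build(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": self._name,
                "description": self._description,
                "parameters": {
                    "type": "object",
                    "properties": self._properties,
                    "required": self._required,
                },
            },
        }

tool = (ToolBuilder("get_weather", "Current weather for a city")
        .param("city", "string", "City name")
        .build())
print(tool["function"]["name"])  # get_weather
```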
Extended Thinking
```js
const request = CompletionRequest.create('deepseek/deepseek-reasoner', messages).withThinking(5000)
const response = await client.complete(request)
console.log(response.thinkingContent()) // Reasoning process
```
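Hosted reasoning APIs usually return thinking and the final answer as separate fields, which is what `thinkingContent()` exposes; some open models instead interleave reasoning in tagged spans. As a self-contained sketch, here is how the two parts could be split when the reasoning arrives inline in `<think>...</think>` delimiters (an assumption, not llmkit behavior):

```python
import re

def split_thinking(raw: str) -> tuple[str, str]:
    """Split raw model output into (thinking, answer)."""
    m = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if not m:
        return "", raw.strip()  # no tagged reasoning present
    thinking = m.group(1).strip()
    answer = (raw[: m.start()] + raw[m.end():]).strip()
    return thinking, answer

thinking, answer = split_thinking("<think>2+2 is 4.</think>The answer is 4.")
print(thinking)  # 2+2 is 4.
print(answer)    # The answer is 4.
```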
For more examples, see examples/.
Building from Source
```sh
# Python bindings (assumes maturin; adjust the path to your checkout layout)
cd bindings/python && maturin develop

# Node.js bindings (assumes a napi-rs build script)
cd bindings/node && npm install && npm run build
```
Documentation
- Getting Started (Rust)
- Getting Started (Python)
- Getting Started (Node.js)
- Model Registry - 11,000+ models with pricing
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
```sh
# Typical pre-PR checks; see CONTRIBUTING.md for the authoritative list
cargo fmt --all && cargo clippy --all-targets && cargo test && cargo doc --no-deps
```
License
Dual-licensed under MIT or Apache-2.0 at your option.
Built with 🦀 Rust | Status: Production Ready | v0.1.0 | 100+ Providers