# llm-stack
<div align="center">
**Production-ready Rust SDK for LLM providers**
[![CI](https://github.com/nazq/llm-stack/actions/workflows/ci.yml/badge.svg)](https://github.com/nazq/llm-stack/actions/workflows/ci.yml)
[![crates.io](https://img.shields.io/crates/v/llm-stack.svg)](https://crates.io/crates/llm-stack)
[![docs.rs](https://docs.rs/llm-stack/badge.svg)](https://docs.rs/llm-stack)
[![Rust 1.85+](https://img.shields.io/badge/rust-1.85%2B-orange.svg)](https://www.rust-lang.org)
[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)
[Quick Start](#quick-start) •
[Features](#features) •
[Documentation](#documentation) •
[Examples](#examples) •
[Contributing](#contributing)
</div>
---
## Overview
llm-stack is a unified Rust interface for building LLM-powered applications. Write against one set of types and swap providers without changing application code.
```rust
use llm_stack::{ChatMessage, ChatParams, Provider};
use llm_stack_anthropic::AnthropicProvider;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let provider = AnthropicProvider::from_env()?;
let response = provider.generate(&ChatParams {
messages: vec![ChatMessage::user("What is Rust's ownership model?")],
max_tokens: Some(1024),
..Default::default()
}).await?;
println!("{}", response.text().unwrap_or_default());
Ok(())
}
```
### Why llm-stack?
- **🔌 Provider Agnostic** – Same code works with Anthropic, OpenAI, Ollama, or any custom provider
- **🛠️ Batteries Included** – Tool execution, structured output, streaming, retry logic out of the box
- **🎯 Type Safe** – `generate_object::<T>()` returns `T`, not `serde_json::Value`
- **⚡ Production Ready** – Comprehensive error handling, token tracking, cost calculation
- **🧪 Testable** – `MockProvider` for unit tests, no network mocks needed
- **🦀 Rust Native** – Async/await, strong typing, zero-cost abstractions
---
## Quick Start
Add to your `Cargo.toml`:
```toml
[dependencies]
llm-stack = "0.1"
llm-stack-anthropic = "0.1" # or llm-stack-openai, llm-stack-ollama
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
```
Set your API key:
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
# or OPENAI_API_KEY for OpenAI
```
Run:
```rust
use llm_stack::{ChatMessage, ChatParams, Provider};
use llm_stack_anthropic::AnthropicProvider;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let provider = AnthropicProvider::from_env()?;
let response = provider.generate(&ChatParams {
messages: vec![ChatMessage::user("Hello!")],
..Default::default()
}).await?;
println!("{}", response.text().unwrap_or_default());
println!("Tokens: {} in, {} out",
response.usage.input_tokens,
response.usage.output_tokens);
Ok(())
}
```
---
## Features
### Core Capabilities
| Feature | Description |
|---------|-------------|
| **Unified Provider Trait** | Two methods define a provider: `generate()` + `stream()` |
| **Streaming** | First-class async streaming with `StreamEvent` types |
| **Tool Execution** | Register handlers, validate inputs, execute in agentic loops |
| **Resumable Tool Loop** | Caller-driven iteration with inspect/inject/stop between rounds |
| **Structured Output** | `generate_object::<T>()` with JSON Schema validation |
| **Interceptors** | Composable retry, timeout, logging, approval gates |
| **Usage Tracking** | Token counts, cost calculation in microdollars (sketch below) |
| **Context Management** | Token budget tracking, message truncation |
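Usage figures ride along on every response, and costs are denominated in microdollars so the arithmetic stays in integers. A minimal sketch of reading them (the token fields match the Quick Start example; the cost value below is made up for illustration, not an API call):

```rust
// `response` is any response returned by `generate()`.
let usage = &response.usage;
println!("tokens: {} in / {} out", usage.input_tokens, usage.output_tokens);

// Costs in microdollars are plain integers: 1_000_000 microdollars = $1.00,
// so a request billed at $0.0042 is recorded as 4_200.
let cost_microdollars: u64 = 4_200; // illustrative value
println!("cost: ${:.4}", cost_microdollars as f64 / 1_000_000.0);
```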
### Provider Support
| Provider | Crate | Models |
|----------|-------|--------|
| **Anthropic** | `llm-stack-anthropic` | Claude 3.5 Sonnet, Claude 3 Opus/Haiku |
| **OpenAI** | `llm-stack-openai` | GPT-4o, GPT-4 Turbo, GPT-3.5 |
| **Ollama** | `llm-stack-ollama` | Llama 3, Mistral, CodeLlama, any local model |
### Tool Execution Engine
Build agentic applications with the tool loop:
```rust
use llm_stack::{
ChatParams, ChatMessage, ToolRegistry, ToolLoopConfig,
tool::{tool_fn, tool_loop},
ToolDefinition, JsonSchema,
};
use serde_json::json;
// Define a tool
let mut registry: ToolRegistry<()> = ToolRegistry::new();
registry.register(tool_fn(
ToolDefinition {
name: "get_weather".into(),
description: "Get current weather for a city".into(),
parameters: JsonSchema::new(json!({
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
})),
..Default::default()
},
|input| async move {
let city = input["city"].as_str().unwrap_or("unknown");
Ok(format!("Weather in {city}: 72ยฐF, sunny"))
},
));
// Run the agentic loop
let result = tool_loop(
&provider,
    &registry,
ChatParams {
messages: vec![ChatMessage::user("What's the weather in Tokyo?")],
tools: Some(registry.definitions()),
..Default::default()
},
ToolLoopConfig::default(),
&(),
).await?;
println!("Final answer: {}", result.response.text().unwrap_or_default());
```
### Resumable Tool Loop
For orchestration patterns that need control between iterations (multi-agent, event injection, context compaction):
```rust
use llm_stack::{ToolLoopHandle, TurnResult};
let mut handle = ToolLoopHandle::new(
    &provider, &registry, params, ToolLoopConfig::default(), &(),
);
loop {
match handle.next_turn().await {
TurnResult::Yielded(turn) => {
// Text and tool results are directly available
if let Some(text) = turn.assistant_text() {
println!("LLM said: {text}");
}
turn.continue_loop(); // or inject_and_continue(), stop()
}
TurnResult::Completed(done) => {
println!("Done: {}", done.response.text().unwrap_or_default());
break;
}
TurnResult::Error(err) => {
eprintln!("Error: {}", err.error);
break;
}
}
}
```
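The same handle can end a run early. Here is a sketch that caps the number of yielded turns from the caller side (the round counter is our own bookkeeping, and `turn.stop()` is assumed to take no arguments):

```rust
// Hypothetical cap on agentic rounds, layered over the loop above.
let max_rounds = 5;
let mut rounds = 0;
loop {
    match handle.next_turn().await {
        TurnResult::Yielded(turn) => {
            rounds += 1;
            if rounds >= max_rounds {
                turn.stop(); // assumed no-arg form
                break;
            }
            turn.continue_loop();
        }
        TurnResult::Completed(_) | TurnResult::Error(_) => break,
    }
}
```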
See [Tool documentation](docs/tools.md#resumable-tool-loop) for the full API.
### Structured Output
Get typed responses with schema validation:
```rust
use llm_stack::structured::generate_object;
use serde::Deserialize;
#[derive(Deserialize, schemars::JsonSchema)]
struct MovieReview {
title: String,
rating: u8,
summary: String,
}
let review: MovieReview = generate_object(
&provider,
&ChatParams {
messages: vec![ChatMessage::user("Review the movie Inception")],
..Default::default()
},
).await?;
println!("{}: {}/10 - {}", review.title, review.rating, review.summary);
```
### Interceptors
Add cross-cutting concerns without modifying provider code:
```rust
use llm_stack::intercept::{InterceptorStack, Logging, Retry, Timeout};
use llm_stack::ToolRegistry;
use std::time::Duration;
let registry: ToolRegistry<()> = ToolRegistry::new()
.with_interceptors(
InterceptorStack::new()
.with(Retry::new(3, Duration::from_millis(100)))
.with(Timeout::new(Duration::from_secs(30)))
.with(Logging::new())
);
```
---
## Documentation
| Guide | Description |
|-------|-------------|
| [Quick Start](docs/quickstart.md) | Get up and running in 5 minutes |
| [Architecture](docs/ARCH.md) | Design principles and module overview |
| [Providers](docs/providers.md) | Provider configuration and selection |
| [Tools](docs/tools.md) | Tool execution and agentic loops |
| [Structured Output](docs/structured-output.md) | Type-safe LLM responses |
| [Interceptors](docs/interceptors.md) | Retry, timeout, logging, approval |
| [Context Window](docs/context-window.md) | Token management and truncation |
| [Migration Guide](docs/migration-from-llm.md) | Coming from the `llm` crate? |
### API Reference
```bash
cargo doc --open
```
Or view on [docs.rs](https://docs.rs/llm-stack).
---
## Examples
### Streaming
```rust
use futures::StreamExt;
use llm_stack::stream::StreamEvent;
let mut stream = provider.stream(&params).await?;
while let Some(event) = stream.next().await {
match event? {
StreamEvent::TextDelta(text) => print!("{text}"),
        StreamEvent::Done { .. } => break,
_ => {}
}
}
```
### Multi-Provider Setup
```rust
use llm_stack::ProviderRegistry;
use llm_stack_anthropic::AnthropicProvider;
use llm_stack_ollama::OllamaProvider;
use llm_stack_openai::OpenAiProvider;
let registry = ProviderRegistry::new()
.register("claude", AnthropicProvider::from_env()?)
.register("gpt4", OpenAiProvider::from_env()?)
.register("local", OllamaProvider::new("http://localhost:11434"));
// Select at runtime
let provider = registry.get("claude")?;
```
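Because lookup is by name, the active provider can come from configuration rather than code. For example (assuming `get` accepts any `&str`):

```rust
// Choose the provider from an environment variable, defaulting to Claude.
let name = std::env::var("LLM_PROVIDER").unwrap_or_else(|_| "claude".to_string());
let provider = registry.get(&name)?;
```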
### Testing with MockProvider
```rust
use llm_stack::test_helpers::mock_for;
use llm_stack::{ChatMessage, ChatParams, ChatResponse, ContentBlock, Provider};
#[tokio::test]
async fn test_my_agent() {
let mock = mock_for("test", "mock-model");
mock.queue_response(ChatResponse {
content: vec![ContentBlock::Text("Hello!".into())],
..Default::default()
});
    let params = ChatParams {
        messages: vec![ChatMessage::user("Hi")],
        ..Default::default()
    };
    let response = mock.generate(&params).await.unwrap();
    assert_eq!(response.text().unwrap_or_default(), "Hello!");
}
```
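`queue_response` can be called repeatedly; assuming the mock replays responses in order, a multi-turn exchange is scripted the same way:

```rust
// Two queued responses serve the first and second `generate()` calls.
mock.queue_response(ChatResponse {
    content: vec![ContentBlock::Text("First turn".into())],
    ..Default::default()
});
mock.queue_response(ChatResponse {
    content: vec![ContentBlock::Text("Second turn".into())],
    ..Default::default()
});
```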
---
## Crate Map
| Crate | Contents |
|-------|----------|
| [`llm-stack`](crates/llm-stack) | Traits, types, errors, streaming, tools, interceptors |
| [`llm-stack-anthropic`](crates/llm-stack-anthropic) | Anthropic Claude provider |
| [`llm-stack-openai`](crates/llm-stack-openai) | OpenAI GPT provider |
| [`llm-stack-ollama`](crates/llm-stack-ollama) | Ollama local provider |
---
## Development
### Prerequisites
- **Rust 1.85+** (2024 edition)
- **[just](https://github.com/casey/just)** – Command runner
### Commands
```bash
just gate # Full CI check: fmt + clippy + test + doc
just test # Run all tests
just clippy # Lint with warnings as errors
just doc # Build documentation
just fcheck # Quick feedback: fmt + check
```
### Running Tests
```bash
# All tests
just test
# Specific test
just test-one test_tool_loop
# With output
just test-verbose
```
---
## Contributing
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
### Quick Checklist
- [ ] `just gate` passes (fmt, clippy, tests, docs)
- [ ] New features have tests
- [ ] Public APIs have documentation
- [ ] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/)
---
## License
Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for details.
---
## Acknowledgments
Built on the shoulders of giants:
- [Tokio](https://tokio.rs/) – Async runtime
- [reqwest](https://github.com/seanmonstar/reqwest) – HTTP client
- [serde](https://serde.rs/) – Serialization
- [jsonschema](https://github.com/Stranger6667/jsonschema-rs) – JSON Schema validation
---
<div align="center">
**[โฌ Back to Top](#llm-stack)**
</div>