
Crate llmkit


§llmkit

A unified, async, multi-provider LLM client for Rust. It offers one trait, one streaming API, and one Tower middleware stack across OpenAI, Anthropic, and local Ollama models, with no provider lock-in.

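// Runs inside an async function that returns a Result (note the `?` and `.await`).
use llmkit::prelude::*;
use std::time::Duration;

// Build a client: Anthropic as the primary provider, OpenAI as the fallback,
// wrapped in tracing and exponential-backoff retry layers.
let client = LlmClientBuilder::new()
    .provider(AnthropicProvider::from_env()?.model("claude-opus-4-8"))
    .fallback(OpenAiProvider::from_env()?.model("gpt-4o-mini"))
    .layer(TracingLayer::new())
    .layer(RetryLayer::exponential(3, Duration::from_millis(200)))
    .build()?;

// Send a single chat request and print the reply text (empty if there is none).
let resp = client
    .chat(ChatRequest::builder().user("Hello!").build())
    .await?;
println!("{}", resp.text().unwrap_or_default());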

Provider adapters are feature-gated (openai, anthropic, ollama; all on by default). To slim the dependency tree, disable the default features and enable only the providers you need.
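
The example above configures RetryLayer::exponential(3, Duration::from_millis(200)); this page does not spell out what the two arguments mean. As a rough illustration of how an exponential-backoff schedule is commonly computed, here is a standalone sketch. The function backoff_delay, the doubling factor, and the cap are assumptions for illustration, not llmkit's actual behaviour.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of llmkit): delay before retry number
/// `attempt` (0-based), computed as base * 2^attempt and capped at `max`.
pub fn backoff_delay(base: Duration, attempt: u32, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt numbers.
    let factor = 2u32.saturating_pow(attempt);
    // If the multiplication overflows Duration, fall back to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

With a 200 ms base and a 5 s cap, this yields 200 ms, 400 ms, 800 ms, and so on until the cap is reached. Real retry layers often add random jitter on top, which this sketch leaves out.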

Modules§

prelude
Common imports for application code.
pricing
Per-model pricing lookup. Unknown models (e.g. local Ollama) return None.
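
To make the pricing description concrete, the sketch below shows one way a per-model lookup that returns None for unknown models could feed a cost calculation. It is not llmkit's implementation: the Price struct, the lookup and cost_micro_usd functions, and the model names and prices are all made up for illustration. It relies on the fact that a price in USD per 1M tokens is numerically equal to micro-dollars per token.

```rust
/// Hypothetical stand-in for a price entry: USD per 1M tokens.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Price {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Illustrative lookup table; model names and prices are invented.
/// Unknown models (for example, a local model) yield `None`.
pub fn lookup(model: &str) -> Option<Price> {
    match model {
        "example-large" => Some(Price { input_per_mtok: 3.0, output_per_mtok: 15.0 }),
        "example-small" => Some(Price { input_per_mtok: 0.25, output_per_mtok: 1.0 }),
        _ => None,
    }
}

/// Cost of one request in micro-dollars (1 USD = 1_000_000 micro-dollars).
/// USD per 1M tokens equals micro-dollars per token, so no scaling is needed.
pub fn cost_micro_usd(price: Price, input_tokens: u64, output_tokens: u64) -> u64 {
    let micros = price.input_per_mtok * input_tokens as f64
        + price.output_per_mtok * output_tokens as f64;
    micros.round() as u64
}
```

For instance, 1,000 input tokens and 500 output tokens on "example-large" cost 3 × 1000 + 15 × 500 = 10,500 micro-dollars, or $0.0105. Keeping the running total as an integer count of micro-dollars avoids accumulating floating-point error across many requests.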

Structs§

AnthropicProvider
Anthropic (Claude) provider over the /v1/messages API.
ChatBuilder
A chat request with optional registered tools, executed when awaited.
ChatRequest
A single-shot or streaming chat completion request.
ChatRequestBuilder
Ergonomic builder for ChatRequest.
ChatResponse
A completed chat response.
CostEstimate
Computed cost breakdown in USD.
CostTracking
Provider produced by CostTrackingLayer.
CostTrackingLayer
Tracks per-request and cumulative cost; optionally enforces a budget.
EmbedRequest
An embeddings request.
EmbedResponse
An embeddings response, one vector per input.
FallbackProvider
Tries providers in order, advancing to the next on a retryable failure.
LlmClient
A composed, ready-to-use LLM client.
LlmClientBuilder
Builds an LlmClient from a primary provider, optional fallbacks, and a stack of Tower layers.
Message
A single conversational message.
ModelAliases
Resolves model aliases to concrete model slugs.
ModelPricing
USD pricing per 1M tokens.
OllamaProvider
Ollama provider for local models (Llama, Mistral, …).
OpenAiProvider
OpenAI provider (GPT-4o, o1, embeddings).
RateLimit
Provider produced by RateLimitLayer.
RateLimitLayer
Token-bucket rate limiter: the bucket holds up to capacity tokens and refills over each window.
Retry
Provider produced by RetryLayer.
RetryLayer
Configures exponential-backoff retries for retryable errors.
SessionCost
Shared, cloneable handle to the running session cost in USD, stored as micro-dollars.
TokenUsage
Normalised token counts for one request.
Tool
A tool the model may call, defined by a name, description, and JSON Schema.
ToolCall
A tool invocation requested by the model.
ToolResult
The result of executing a ToolCall, returned to the model.
Tracing
Provider produced by TracingLayer.
TracingLayer
Emits a tracing span around each call with latency and usage.
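
RateLimitLayer is described above as a token bucket defined by a capacity and a window. The following standalone sketch illustrates that general technique only; TokenBucket, try_acquire, and the continuous (fractional) refill are assumptions, not llmkit's API. The current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket (not llmkit's type): holds up to `capacity`
/// tokens and refills the full capacity evenly over one `window`.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    /// Starts full. Assumes `window` is non-zero.
    pub fn new(capacity: u32, window: Duration, now: Instant) -> Self {
        let capacity = f64::from(capacity);
        Self {
            capacity,
            tokens: capacity,
            refill_per_sec: capacity / window.as_secs_f64(),
            last: now,
        }
    }

    /// Refills according to elapsed time, then takes one token if available.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A bucket created with capacity 2 and a one-second window admits two calls immediately, rejects a third, and admits another once enough time has passed to refill at least one token.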

Enums§

ContentPart
A typed part of multi-part message content.
FinishReason
Why the model stopped generating.
LlmError
The single error type returned by every llmkit operation.
MessageContent
The content of a Message.
Role
Role of a Message.
StreamDelta
One incremental event in a streaming chat response.
ToolChoice
How the model should choose among available tools.

Traits§

LlmLayer
Wraps an inner LlmProvider in a new one, adding cross-cutting behaviour.
LlmProvider
A unified, async LLM backend.
ToolSchema
Implemented by tool input structs to expose name, description, and schema.
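
The relationship between LlmLayer and LlmProvider (a layer wraps an inner provider and returns a new one) can be sketched with deliberately simplified stand-ins. The traits below are hypothetical and synchronous; the real traits are async and their signatures are not shown on this page. The pairing of CountingLayer with Counting mirrors the way the crate pairs RetryLayer with Retry.

```rust
use std::cell::Cell;

/// Hypothetical, simplified provider: synchronous, string in, string out.
pub trait Provider {
    fn chat(&self, prompt: &str) -> Result<String, String>;
}

/// Hypothetical layer: consumes an inner provider and returns a wrapper
/// that is itself a provider.
pub trait Layer<P: Provider> {
    type Wrapped: Provider;
    fn layer(&self, inner: P) -> Self::Wrapped;
}

/// A trivial backend used only for this illustration.
pub struct Echo;

impl Provider for Echo {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

/// Layer that adds call counting as a cross-cutting concern.
pub struct CountingLayer;

/// Provider produced by `CountingLayer`.
pub struct Counting<P> {
    inner: P,
    calls: Cell<u32>,
}

impl<P> Counting<P> {
    pub fn calls(&self) -> u32 {
        self.calls.get()
    }
}

impl<P: Provider> Layer<P> for CountingLayer {
    type Wrapped = Counting<P>;
    fn layer(&self, inner: P) -> Counting<P> {
        Counting { inner, calls: Cell::new(0) }
    }
}

impl<P: Provider> Provider for Counting<P> {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        // Record the call, then delegate to the wrapped provider.
        self.calls.set(self.calls.get() + 1);
        self.inner.chat(prompt)
    }
}
```

Calling CountingLayer.layer(Echo) yields a provider that behaves like Echo while counting calls. Because the wrapper is itself a provider, layers can be stacked in any order.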

Type Aliases§

ChatStream
A boxed async stream of StreamDelta items, the unified streaming output across every provider.
LlmResult
Result alias used throughout llmkit.

Derive Macros§

ToolSchema
#[derive(ToolSchema)] for typed tool inputs: derives llmkit_core::ToolSchema for a struct.
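
As a rough picture of what such a derive might generate, the sketch below writes the equivalent by hand for one struct. Everything here is an assumption made for illustration: the trait ToolSchemaSketch, its method names, the GetWeather struct, and the schema layout are hypothetical, and the real ToolSchema trait may look quite different.

```rust
/// Hypothetical stand-in for a tool-schema trait exposing a name,
/// a description, and a JSON Schema (here as a plain JSON string).
pub trait ToolSchemaSketch {
    fn name() -> &'static str;
    fn description() -> &'static str;
    fn schema_json() -> String;
}

/// Example typed tool input (invented for this sketch).
pub struct GetWeather {
    pub city: String,
}

/// Hand-written version of what a derive macro could produce.
impl ToolSchemaSketch for GetWeather {
    fn name() -> &'static str {
        "get_weather"
    }

    fn description() -> &'static str {
        "Look up the current weather for a city."
    }

    fn schema_json() -> String {
        // One required string property, matching the struct's single field.
        r#"{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}"#
            .to_string()
    }
}
```

A derive macro saves writing this boilerplate and keeps the schema in sync with the struct's fields.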