Crate llmkit

§llmkit

A unified, async, multi-provider LLM client for Rust. One trait, one streaming API, and one Tower middleware stack work across OpenAI, Anthropic, and local Ollama models, with no provider lock-in.

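use llmkit::prelude::*;
use std::time::Duration;

// Runs inside an async context whose error type accepts LlmError via `?`.
// Primary provider first, then a fallback, then the middleware layers.
let client = LlmClientBuilder::new()
    .provider(AnthropicProvider::from_env()?.model("claude-opus-4-8"))
    .fallback(OpenAiProvider::from_env()?.model("gpt-4o-mini"))
    .layer(TracingLayer::new())
    .layer(RetryLayer::exponential(3, Duration::from_millis(200)))
    .build()?;

// Send a single-shot chat request and await the completed response.
let resp = client
    .chat(ChatRequest::builder().user("Hello!").build())
    .await?;
// Print the response text, or an empty string if there is none.
println!("{}", resp.text().unwrap_or_default());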
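The example above configures RetryLayer::exponential with a count and a base Duration. As a rough illustration of what an exponential-backoff schedule looks like, here is a std-only sketch. It assumes the delay doubles on each attempt and is clamped to a ceiling; the helper names `backoff_delay` and `backoff_schedule`, the doubling factor, and the cap are all hypothetical and not taken from llmkit, which may also add jitter.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based).
/// Assumption: the delay doubles per attempt and never exceeds `max`.
/// Hypothetical helper for illustration, not part of llmkit.
fn backoff_delay(base: Duration, attempt: u32, max: Duration) -> Duration {
    // 2^attempt, falling back to u32::MAX if the power overflows.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // If the multiplication overflows Duration, use the ceiling instead.
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// The full list of delays for `max_retries` retries.
fn backoff_schedule(base: Duration, max_retries: u32, max: Duration) -> Vec<Duration> {
    (0..max_retries)
        .map(|attempt| backoff_delay(base, attempt, max))
        .collect()
}
```

Under these assumptions, a base of 200 ms with three retries yields delays of 200 ms, 400 ms, and 800 ms. Clamping to a ceiling keeps a large retry count from producing unbounded waits.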

Provider adapters are feature-gated (openai, anthropic, ollama; all on by default). Disable defaults and opt in to slim the dependency tree.

Modules§

prelude: Common imports for application code.
pricing: Per-model pricing lookup. Unknown models (e.g. local Ollama) return None.
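The pricing module is described as a per-model lookup that returns None for unknown models, and ModelPricing is expressed in USD per 1M tokens. The sketch below shows how such a lookup and a cost estimate could fit together using only the standard library. The `Pricing` struct, the function names, the model name, and the prices are placeholders invented for illustration; they are not llmkit's types or real price data.

```rust
use std::collections::HashMap;

/// Price per 1M tokens, stored in micro-dollars (1 USD = 1_000_000)
/// so that arithmetic stays in integers. Hypothetical type.
#[derive(Clone, Copy)]
struct Pricing {
    input_micros_per_mtok: u64,
    output_micros_per_mtok: u64,
}

/// A placeholder table. The model name and prices are made up.
fn demo_table() -> HashMap<&'static str, Pricing> {
    let mut table = HashMap::new();
    table.insert(
        "example-hosted-model",
        Pricing {
            input_micros_per_mtok: 3_000_000,   // $3.00 per 1M input tokens
            output_micros_per_mtok: 15_000_000, // $15.00 per 1M output tokens
        },
    );
    table
}

/// Cost in micro-dollars for the given token counts.
fn estimate_micros(p: &Pricing, input_tokens: u64, output_tokens: u64) -> u64 {
    let total = input_tokens * p.input_micros_per_mtok
        + output_tokens * p.output_micros_per_mtok;
    // Prices are per 1M tokens, so divide once at the end.
    total / 1_000_000
}

/// Returns None when the model has no pricing entry (e.g. a local model).
fn cost_for(
    table: &HashMap<&'static str, Pricing>,
    model: &str,
    input_tokens: u64,
    output_tokens: u64,
) -> Option<u64> {
    table
        .get(model)
        .map(|p| estimate_micros(p, input_tokens, output_tokens))
}
```

With the placeholder prices, 1,000 input tokens and 500 output tokens cost 10,500 micro-dollars, while a model absent from the table yields None rather than a zero cost.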

Structs§

AnthropicProvider: Anthropic (Claude) provider over the /v1/messages API.
ChatBuilder: A chat request with optional registered tools, executed when awaited.
ChatRequest: A single-shot or streaming chat completion request.
ChatRequestBuilder: Ergonomic builder for ChatRequest.
ChatResponse: A completed chat response.
CostEstimate: Computed cost breakdown in USD.
CostTracking: Provider produced by CostTrackingLayer.
CostTrackingLayer: Tracks per-request and cumulative cost; optionally enforces a budget.
EmbedRequest: An embeddings request.
EmbedResponse: An embeddings response, one vector per input.
FallbackProvider: Tries providers in order, advancing to the next on a retryable failure.
LlmClient: A composed, ready-to-use LLM client.
LlmClientBuilder: Builds an LlmClient from a primary provider, optional fallbacks, and a stack of Tower layers.
Message: A single conversational message.
ModelAliases: Resolves model aliases to concrete model slugs.
ModelPricing: USD pricing per 1M tokens.
OllamaProvider: Ollama provider for local models (Llama, Mistral, …).
OpenAiProvider: OpenAI provider (GPT-4o, o1, embeddings).
RateLimit: Provider produced by RateLimitLayer.
RateLimitLayer: Token-bucket rate limiter: a bucket of capacity tokens that refills over each window.
Retry: Provider produced by RetryLayer.
RetryLayer: Configures exponential-backoff retries for retryable errors.
SessionCost: Shared, cloneable handle to the running session cost in USD, stored as micro-dollars.
TokenUsage: Normalised token counts for one request.
Tool: A tool the model may call, defined by a name, description, and JSON Schema.
ToolCall: A tool invocation requested by the model.
ToolResult: The result of executing a ToolCall, returned to the model.
Tracing: Provider produced by TracingLayer.
TracingLayer: Emits a tracing span around each call with latency and usage.
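RateLimitLayer in the list above is described as a token bucket with a capacity that refills over a window. For readers unfamiliar with the technique, here is a minimal std-only sketch. It assumes continuous refill at capacity / window tokens per second and one token per request; the `TokenBucket` type and its methods are hypothetical and say nothing about how llmkit implements the layer. The clock is passed in explicitly so the behaviour is deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket, for illustration only.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    /// Starts full. `window` must be non-zero.
    fn new(capacity: u32, window: Duration, now: Instant) -> Self {
        let capacity = f64::from(capacity);
        Self {
            capacity,
            tokens: capacity,
            refill_per_sec: capacity / window.as_secs_f64(),
            last: now,
        }
    }

    /// Refills for the time elapsed since the last call, then tries to
    /// take one token. Returns false if the bucket is empty.
    fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A bucket with capacity 2 and a one-second window allows two immediate requests, rejects a third, and admits another after half a second. A real async layer would more likely wait for the next token than reject the call.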

Enums§

ContentPart: A typed part of multi-part message content.
FinishReason: Why the model stopped generating.
LlmError: The single error type returned by every llmkit operation.
MessageContent: The content of a Message.
Role: Role of a Message.
StreamDelta: One incremental event in a streaming chat response.
ToolChoice: How the model should choose among available tools.
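FallbackProvider is described as advancing to the next provider on a retryable failure, which implies that errors can be classified as retryable or terminal. The sketch below illustrates that control flow with the standard library only. The `SketchError` variants, the `is_retryable` classification, and the stand-in backends are assumptions made for the example; LlmError's real variants are not listed on this page.

```rust
/// Hypothetical error type; not llmkit's LlmError.
#[derive(Debug, Clone, PartialEq)]
enum SketchError {
    RateLimited,            // assumed retryable
    Timeout,                // assumed retryable
    InvalidRequest(String), // assumed terminal
    NoProviders,            // nothing was configured
}

impl SketchError {
    fn is_retryable(&self) -> bool {
        matches!(self, SketchError::RateLimited | SketchError::Timeout)
    }
}

/// A backend is modelled as a plain function pointer for simplicity.
type Backend = fn(&str) -> Result<String, SketchError>;

/// Tries each backend in order. A retryable error moves on to the next
/// backend; a terminal error is returned immediately.
fn chat_with_fallback(backends: &[Backend], prompt: &str) -> Result<String, SketchError> {
    let mut last = SketchError::NoProviders;
    for backend in backends {
        match backend(prompt) {
            Ok(text) => return Ok(text),
            Err(e) if e.is_retryable() => last = e,
            Err(e) => return Err(e),
        }
    }
    // Every backend failed with a retryable error (or none were given).
    Err(last)
}

// Stand-in backends used to exercise the function.
fn flaky(_prompt: &str) -> Result<String, SketchError> {
    Err(SketchError::Timeout)
}

fn rejecting(_prompt: &str) -> Result<String, SketchError> {
    Err(SketchError::InvalidRequest("bad".to_string()))
}

fn healthy(prompt: &str) -> Result<String, SketchError> {
    Ok(format!("healthy: {prompt}"))
}
```

In this sketch a timeout from the first backend falls through to the second, whereas an invalid request stops the chain, since retrying the same malformed request elsewhere would not be expected to help.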

Traits§

LlmLayer: Wraps an inner LlmProvider in a new one, adding cross-cutting behaviour.
LlmProvider: A unified, async LLM backend.
ToolSchema: Implemented by tool input structs to expose name, description, and schema.
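LlmLayer is described as wrapping an inner LlmProvider in a new one, and the struct list pairs each layer with the provider it produces (for example TracingLayer and Tracing). The following sketch shows that wrapping pattern in a simplified, synchronous form. The `Provider` and `Layer` traits, their method signatures, and the `Echo` and `Counting` types are hypothetical stand-ins; the real traits are async and their signatures are not shown on this page.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Simplified stand-in for a provider trait (synchronous for brevity).
trait Provider {
    fn chat(&self, prompt: &str) -> Result<String, String>;
}

/// Simplified stand-in for a layer trait: consumes an inner provider
/// and returns a new provider that wraps it.
trait Layer<P: Provider> {
    type Wrapped: Provider;
    fn layer(&self, inner: P) -> Self::Wrapped;
}

/// A trivial backend that echoes the prompt.
struct Echo;

impl Provider for Echo {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

/// Layer that counts calls; the counter is shared with wrapped providers.
struct CountingLayer {
    calls: Arc<AtomicUsize>,
}

impl CountingLayer {
    fn new() -> Self {
        Self { calls: Arc::new(AtomicUsize::new(0)) }
    }

    fn count(&self) -> usize {
        self.calls.load(Ordering::Relaxed)
    }
}

/// Provider produced by CountingLayer.
struct Counting<P> {
    inner: P,
    calls: Arc<AtomicUsize>,
}

impl<P: Provider> Provider for Counting<P> {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        // Cross-cutting behaviour runs first, then the call is delegated.
        self.calls.fetch_add(1, Ordering::Relaxed);
        self.inner.chat(prompt)
    }
}

impl<P: Provider> Layer<P> for CountingLayer {
    type Wrapped = Counting<P>;

    fn layer(&self, inner: P) -> Counting<P> {
        Counting { inner, calls: Arc::clone(&self.calls) }
    }
}
```

Because the wrapped value is itself a provider, layers of this shape can be stacked in any order, which is the property that lets a builder accept a list of them.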

Type Aliases§

ChatStream: A boxed async stream of StreamDelta items, the unified streaming output across every provider.
LlmResult: Result alias used throughout llmkit.

Derive Macros§

ToolSchema: #[derive(ToolSchema)] for typed tool inputs; implements llmkit_core::ToolSchema for a struct.
