# openheim

A fast, multi-provider LLM agent runtime built in Rust.
Openheim runs an iterative agent loop — it calls your LLM, executes tools on its behalf, feeds results back, and repeats until the task is done. It works as an interactive REPL, a headless CLI, an ACP stdio agent (for Zed, Claude Code, and other ACP clients), or a self-hosted ACP-over-WebSocket server.
## Why Rust?
Openheim is built in Rust from the ground up:
- Low memory — runs in a fraction of the RAM a Python agent would need
- Fast startup — no interpreter warmup
- True concurrency — async Tokio runtime, multiple agents without threading headaches
- Safe by default — Rust's ownership model means fewer footguns in long-running agent processes
## Features
- Multi-provider — OpenAI, Anthropic Claude, Google Gemini, and any OpenAI-compatible endpoint (Ollama, vLLM, LM Studio, etc.)
- Tool execution — built-in shell, file read, and file write tools. Trait-based, so you can add your own (see the sketch after this list).
- MCP (Model Context Protocol) — connect external MCP servers (stdio or Streamable HTTP); their tools are automatically exposed to the LLM as `{server_name}__{tool_name}`
- Conversation memory — conversations (including full tool call history) persist to disk and resume across sessions
- Skills — drop a markdown file into `~/.openheim/skills/` and it's prepended to the system prompt. ACP clients can also pass skills per-session via `_meta`.
- ACP transport — implements the Agent Client Protocol over stdio (for editor integrations) and WebSocket (for remote clients), with real-time streaming of message chunks and tool calls
- Unified WebSocket — a single multiplexed `WS /ws` connection carries both ACP agent traffic (sessions, streaming, tool calls) and filesystem operations (file CRUD, live watching) via channel envelopes
- Retry with backoff — transient failures (429s, 5xx, network errors) are retried automatically with exponential backoff
- Docker ready — multi-stage Dockerfile and docker-compose included
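Custom tools implement the project's tool trait (`ToolHandler`, per the project structure below). The following is only a hypothetical sketch of what such a tool could look like — the trait shape and every signature here are assumptions for illustration, not Openheim's actual API:

```rust
use std::future::Future;
use std::pin::Pin;

/// Hypothetical stand-in for Openheim's ToolHandler trait (assumed shape;
/// the real definition lives in src/tools/).
trait ToolHandler {
    /// Name the LLM uses to invoke the tool.
    fn name(&self) -> &str;
    /// Execute with the model-supplied JSON arguments.
    fn execute<'a>(&'a self, args: &'a str)
        -> Pin<Box<dyn Future<Output = Result<String, String>> + Send + 'a>>;
}

/// Toy tool: counts whitespace-separated words in its argument string.
struct WordCount;

impl ToolHandler for WordCount {
    fn name(&self) -> &str {
        "word_count"
    }
    fn execute<'a>(&'a self, args: &'a str)
        -> Pin<Box<dyn Future<Output = Result<String, String>> + Send + 'a>> {
        Box::pin(async move { Ok(args.split_whitespace().count().to_string()) })
    }
}
```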
## Quickstart
### Prerequisites
- Rust 1.85+
- An API key for at least one supported provider
### Install
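Assuming a local checkout of the repository, a standard Cargo build works:

```bash
# Build the release binary (binary name assumed to match the crate)
cargo build --release
./target/release/openheim --help
```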
### Configure
```bash
# Generate the default config (subcommand name assumed; the CLI has an init step)
cargo run -- init

# Edit it (config path assumed under ~/.openheim/)
$EDITOR ~/.openheim/config.toml
```
Example config:
= "openai"
= 10
[]
= "https://api.openai.com/v1"
= "gpt-4"
= ["gpt-4", "gpt-4-turbo", "gpt-3.5-turbo"]
= "OPENAI_API_KEY"
# timeout_secs = 120 # Request timeout in seconds (default: 120)
# max_tokens = 4096 # Maximum output tokens for LLM responses
[]
= "https://api.anthropic.com/v1"
= "claude-sonnet-4-5-20250929"
= ["claude-sonnet-4-5-20250929", "claude-3-5-sonnet-20241022", "claude-3-opus-20240229"]
= "ANTHROPIC_API_KEY"
[]
= "https://generativelanguage.googleapis.com/v1beta"
= "gemini-2.5-flash"
= ["gemini-2.5-flash", "gemini-2.5-pro"]
= "GEMINI_API_KEY"
# Local Ollama (no API key needed)
[]
= "http://localhost:11434/v1"
= "llama2"
= ["llama2", "mistral", "codellama"]
# MCP servers — tools are exposed as "{server_name}__{tool_name}"
# [mcp_servers.filesystem]
# command = "npx"
# args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
#
# [mcp_servers.remote-tools]
# url = "http://localhost:8080/mcp"
### Run

```bash
# Subcommand and flag names below are assumed (except `serve`); check `--help`.

# Interactive REPL (default — no subcommand)
cargo run

# Load skills in the REPL
cargo run -- --skills rust-expert

# Single headless prompt, streams to stdout
cargo run -- run "Refactor src/main.rs"

# Single headless prompt with a model override
cargo run -- run --model gpt-4-turbo "Refactor src/main.rs"

# ACP stdio agent (for Zed, Claude Code, and other ACP clients)
cargo run -- acp

# ACP-over-WebSocket server
cargo run -- serve

# Initialize config
cargo run -- init
```
## How the agent loop works

```
User prompt
    │
    ▼
Send conversation + tools → LLM
    │
    ├─ Tool call requested? → Execute tool → feed result back → repeat
    │
    └─ Final response → done
```
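Stripped of streaming and error handling, the loop reduces to roughly this shape (a sketch with stand-in types — the real implementation lives in `core/agent.rs` and `core/models.rs`):

```rust
// Illustrative sketch only — all types and functions here are stand-ins,
// not Openheim's actual API.

enum LlmResponse {
    ToolCall { name: String, args: String },
    Final(String),
}

struct Message; // placeholder for the real chat message type

fn call_llm(_history: &[Message]) -> LlmResponse {
    // A real client POSTs the conversation plus tool schemas to the
    // provider and parses the (streamed) response.
    LlmResponse::Final("done".into())
}

fn run_tool(_name: &str, _args: &str) -> String {
    // Stand-in for the tool registry: shell, read_file, write_file, MCP tools.
    String::new()
}

fn agent_loop(mut history: Vec<Message>, max_iterations: usize) -> Option<String> {
    for _ in 0..max_iterations {
        match call_llm(&history) {
            LlmResponse::ToolCall { name, args } => {
                // Execute the requested tool and feed the result back, then repeat.
                let _result = run_tool(&name, &args);
                history.push(Message); // would carry the tool result in practice
            }
            LlmResponse::Final(text) => return Some(text), // done
        }
    }
    None // iteration cap reached without a final answer
}
```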
Conversations are saved to `~/.openheim/history/` as JSON after every run.
## Skills

Skills are markdown files in `~/.openheim/skills/`. When loaded, their content is injected into the system prompt before the conversation starts.
Use them to give the agent a persona, a set of coding standards, domain knowledge, or anything you'd otherwise paste into the system prompt every time.
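A skill is plain markdown; a minimal, hypothetical example:

```markdown
<!-- ~/.openheim/skills/rust-expert.md (hypothetical skill) -->
You are a senior Rust engineer. Prefer iterators over index loops,
avoid `unwrap()` outside tests, and explain borrow-checker errors plainly.
```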
```bash
# Run the REPL with specific skills loaded (flag name assumed)
cargo run -- --skills rust-expert,code-review
```
ACP clients (Zed, Claude Code, etc.) can pass skills per-session by including a `skills` array in the `_meta` field of the `NewSession` request — no flag needed on the server side.
## Server mode

Start with `cargo run -- serve` (defaults to `0.0.0.0:1217`).

The server speaks the Agent Client Protocol over WebSocket and exposes a multiplexed WS endpoint plus REST API routes:
### WebSocket

| Endpoint | Description |
|---|---|
| `WS /ws` | Single multiplexed connection carrying two channels via JSON envelopes: `agent` (ACP sessions, streaming, tool calls) and `fs` (file CRUD, live watching) |

Every message is wrapped in `{ "channel": "<agent|fs>", "data": <payload> }`.
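Concretely, two frames on the same socket might look like the following (the envelope shape matches the table above; the inner payloads are illustrative, not the actual protocol):

```json
{ "channel": "agent", "data": { "jsonrpc": "2.0", "method": "session/prompt" } }
{ "channel": "fs", "data": { "op": "read", "path": "src/main.rs" } }
```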
### REST API

| Endpoint | Description |
|---|---|
| `GET /api/config` | Public config (providers, models — API keys stripped) |
| `GET /api/models` | Available models per provider |
| `GET /api/skills` | List of installed skills |
| `GET /api/tools` | All registered tool definitions (built-in + MCP) |
| `GET /api/mcp-servers` | MCP server connection statuses |
| `GET /api/sessions` | All persisted sessions (metadata only, newest first) |
| `GET /api/sessions/{id}` | Full conversation — messages, tool calls, and metadata |
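For example, against a local server:

```bash
# List persisted sessions, then pull one full conversation
curl http://localhost:1217/api/sessions
curl http://localhost:1217/api/sessions/<id>
```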
Frontend / WebSocket implementors: see `OPENHEIM_SPEC.md` for the complete protocol reference, TypeScript interfaces, and sequence diagrams.
## Use as a library

Openheim can be embedded directly in your Rust application via the `openheim` crate. The library exposes the full agent runtime — sessions, streaming, conversation history, skills, and MCP servers — through a single `OpenheimClient` facade.
```toml
# Cargo.toml (dependency names reconstructed from context)
[dependencies]
openheim = { path = "../openheim-core" }
tokio = { version = "1", features = ["full"] }
```
See `docs/library.md` for the full API reference, session management, multi-turn conversations, and MCP integration.
## Docker

```bash
# Build and start with docker-compose
docker compose up --build

# Or run manually (image name and flags illustrative)
docker build -t openheim .
docker run --rm -p 1217:1217 -e OPENAI_API_KEY openheim
```
## Project structure

```
src/
  main.rs                  Entry point and subcommand dispatch
  lib.rs                   Public API surface
  error.rs                 Error types (with retryable classification for backoff)
  config/                  Config loading, provider/model resolution, defaults
  core/
    agent.rs               Agent loop (streaming variant)
    models.rs              Message, Tool, Choice, and WebSocket envelope types
  llm/                     LLM client trait and provider implementations
    anthropic.rs           Anthropic Messages API client
    gemini.rs              Google Gemini API client
    openai.rs              OpenAI API client
    openai_compatible.rs   Generic OpenAI-compatible client (Ollama, etc.)
    retry.rs               Automatic retry with exponential backoff
  tools/                   Tool trait, registry, and built-in tools
    execute_command.rs / read_file.rs / write_file.rs
  mcp/                     MCP (Model Context Protocol) client integration
    client.rs              MCP server connection (stdio + Streamable HTTP)
    tool_handler.rs        Adapts MCP tools to the ToolHandler trait
  rag/                     Conversation history, prompt builder, and skills manager
  acp/                     ACP agent core — session state and protocol handling
  transport/
    stdio.rs               ACP-over-stdio transport (for editor integrations)
    ws.rs                  Multiplexed WebSocket server (axum) + REST API + filesystem channel
    run.rs                 Headless single-prompt transport
    WS_SPEC.md             Full WebSocket protocol reference for frontend implementors
  tui/                     Interactive rustyline REPL
```
## Development

```bash
# Run with verbose logging
RUST_LOG=debug cargo run
```
## Contributing
Contributions are welcome.
## License
See LICENSE for details.