Embacle — LLM Runners
Standalone Rust library that wraps 12 AI CLI tools and SDKs as pluggable LLM providers, with vision/image support.
Instead of integrating with LLM APIs directly (which require API keys, SDKs, and managing auth), Embacle delegates to CLI tools that users already have installed and authenticated — getting model upgrades, auth management, and protocol handling for free. For GitHub Copilot, an optional headless mode communicates via the ACP (Agent Client Protocol) for SDK-managed tool calling.
Install
Homebrew (macOS / Linux) — recommended
This installs two binaries:
- embacle-server: OpenAI-compatible REST API + MCP server (start with embacle-server --provider copilot)
- embacle-mcp: standalone MCP server for editor integration
Once installed, start the server and send requests with any OpenAI-compatible client:
Docker
Cargo (library)
[dependencies]
embacle = "0.12"
Supported Runners
CLI Runners (subprocess-based)
| Runner | Binary | Features |
|---|---|---|
| Claude Code | claude | JSON output, streaming, system prompts, session resume |
| GitHub Copilot | copilot | Text parsing, streaming |
| Cursor Agent | cursor-agent | JSON output, streaming, MCP approval |
| OpenCode | opencode | JSON events, session management |
| Gemini CLI | gemini | JSON/stream-JSON output, streaming, session resume |
| Codex CLI | codex | JSONL output, streaming, sandboxed exec mode |
| Goose CLI | goose | JSON/stream-JSON output, streaming, no-session mode |
| Cline CLI | cline | NDJSON output, streaming, session resume via task IDs |
| Continue CLI | cn | JSON output, single-shot completions |
| Warp | oz | NDJSON output, conversation resume |
| Kiro CLI | kiro-cli | ANSI-stripped text output, auto model selection |
| Kilo Code | kilo | NDJSON output, streaming, token tracking, 500+ models via Kilo Gateway |
HTTP API Runners (feature-flagged)
| Runner | Feature Flag | Features |
|---|---|---|
| OpenAI API | openai-api | Any OpenAI-compatible endpoint (OpenAI, Groq, Gemini, Ollama, vLLM), streaming, tool calling, model discovery |
ACP Runners (persistent connection)
| Runner | Feature Flag | Features |
|---|---|---|
| GitHub Copilot Headless | copilot-headless | NDJSON/JSON-RPC via copilot --acp, SDK-managed tool calling, streaming |
Quick Start
Use a CLI runner:
use PathBuf;
use ;
use ;
async
OpenAI API (feature flag)
Enable the openai-api feature for HTTP-based communication with any OpenAI-compatible endpoint:
[dependencies]
embacle = { version = "0.12", features = ["openai-api"] }
use ;
use ;
async
Works with any OpenAI-compatible endpoint — OpenAI, Groq, Google Gemini, Ollama, vLLM, and more. To inject a shared HTTP client (e.g. from a connection pool), use OpenAiApiRunner::with_client(config, client).
| Variable | Default | Description |
|---|---|---|
| OPENAI_API_BASE_URL | https://api.openai.com/v1 | API base URL |
| OPENAI_API_KEY | (none) | Bearer token for authentication |
| OPENAI_API_MODEL | gpt-5.4 | Default model for completions |
| OPENAI_API_TIMEOUT_SECS | 300 | HTTP request timeout (seconds) |
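As an illustration of how the defaults in the table could be applied, here is a minimal std-only sketch. The `OpenAiEnvSettings` struct and its methods are hypothetical names introduced for this example and are not the crate's actual configuration type; treating an empty `OPENAI_API_KEY` as unset is also an assumption.

```rust
use std::time::Duration;

/// Hypothetical settings struct; the crate's real configuration type may differ.
#[derive(Debug, PartialEq)]
pub struct OpenAiEnvSettings {
    pub base_url: String,
    pub api_key: Option<String>,
    pub model: String,
    pub timeout: Duration,
}

impl OpenAiEnvSettings {
    /// Build settings from any variable lookup, falling back to the documented defaults.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        // An unparsable timeout falls back to the default of 300 seconds.
        let timeout_secs = lookup("OPENAI_API_TIMEOUT_SECS")
            .and_then(|raw| raw.trim().parse::<u64>().ok())
            .unwrap_or(300);
        Self {
            base_url: lookup("OPENAI_API_BASE_URL")
                .unwrap_or_else(|| "https://api.openai.com/v1".to_string()),
            // Assumption: an empty key is treated the same as no key.
            api_key: lookup("OPENAI_API_KEY").filter(|key| !key.is_empty()),
            model: lookup("OPENAI_API_MODEL").unwrap_or_else(|| "gpt-5.4".to_string()),
            timeout: Duration::from_secs(timeout_secs),
        }
    }

    /// Convenience wrapper that reads the process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|name| std::env::var(name).ok())
    }
}
```

Passing the lookup as a closure keeps the defaulting logic testable without mutating the process environment.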
Copilot Headless (feature flag)
Enable the copilot-headless feature for ACP-based communication with SDK-managed tool calling:
[dependencies]
embacle = { version = "0.12", features = ["copilot-headless"] }
use ;
use ;
async
The headless runner spawns copilot --acp per request and communicates via NDJSON-framed JSON-RPC. Configuration via environment variables:
| Variable | Default | Description |
|---|---|---|
| COPILOT_CLI_PATH | auto-detect | Override path to copilot binary |
| COPILOT_HEADLESS_MODEL | claude-opus-4.6-fast | Default model for completions |
| COPILOT_GITHUB_TOKEN | stored OAuth | GitHub auth token (falls back to GH_TOKEN, GITHUB_TOKEN) |
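The token fallback order described in the table can be sketched as a first-match lookup. The function name `resolve_github_token` is hypothetical, and skipping empty values is an assumption of this sketch rather than documented behavior.

```rust
/// Variables checked in order, following the fallback chain in the table.
const TOKEN_VARS: [&str; 3] = ["COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"];

/// Returns the first non-empty token together with the variable it came from.
/// `None` means no variable is set, in which case the CLI's stored OAuth login applies.
pub fn resolve_github_token(
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<(&'static str, String)> {
    TOKEN_VARS.iter().find_map(|&name| {
        lookup(name)
            // Assumption: blank values are ignored rather than used as tokens.
            .filter(|value| !value.trim().is_empty())
            .map(|value| (name, value))
    })
}
```

In a real program the lookup would typically be `|name| std::env::var(name).ok()`.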
Vision / Image Support
Embacle supports sending images alongside text prompts via the ImagePart type. Images are base64-encoded and tagged with a MIME type (PNG, JPEG, WebP, GIF).
Which providers support vision?
| Provider | Vision | Notes |
|---|---|---|
| Copilot Headless (ACP) | Yes | Images sent as ACP image content blocks |
| OpenAI API | Yes | Images sent as image_url parts with data: URIs |
| All CLI runners | No | CLI tools build text-only prompts via string concatenation |
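To make the data: URI form used for image_url parts concrete, the following std-only sketch base64-encodes raw bytes and prefixes the MIME type. Both function names are hypothetical helpers for illustration; the crate's ImagePart type presumably handles this internally, and its API is not shown here.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard (RFC 4648) base64 with `=` padding, written out to keep the sketch std-only.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group; missing bytes count as zero.
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let triple = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(triple >> 18) as usize & 63] as char);
        out.push(ALPHABET[(triple >> 12) as usize & 63] as char);
        // Pad with '=' where the input chunk was shorter than three bytes.
        out.push(if chunk.len() > 1 {
            ALPHABET[(triple >> 6) as usize & 63] as char
        } else {
            '='
        });
        out.push(if chunk.len() > 2 {
            ALPHABET[triple as usize & 63] as char
        } else {
            '='
        });
    }
    out
}

/// Hypothetical helper: builds a `data:` URI such as `data:image/png;base64,...`.
pub fn image_data_uri(mime_type: &str, bytes: &[u8]) -> String {
    format!("data:{mime_type};base64,{}", base64_encode(bytes))
}
```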
Library usage
use ;
let image = new?;
let request = new;
Server usage (OpenAI multipart content)
Send images via the standard OpenAI multipart content format:
Plain string messages continue to work unchanged. Providers without vision capability will ignore image content (or reject it in strict mode via the capability guard).
MCP Server (embacle-mcp)
A library and standalone binary that exposes embacle runners via the Model Context Protocol. Connect any MCP-compatible client (Claude Desktop, editors, custom agents) to use all embacle providers.
Usage
# Stdio transport (default — for editor/client integration)
# HTTP transport (for network-accessible deployments)
MCP Tools
| Tool | Description |
|---|---|
| get_provider | Get active LLM provider and list available providers |
| set_provider | Switch the active provider (claude_code, copilot, copilot_headless, cursor_agent, opencode, gemini_cli, codex_cli, goose_cli, cline_cli, continue_cli, warp_cli, kiro_cli, kilo_cli) |
| get_model | Get current model and list available models for the active provider |
| set_model | Set the model for subsequent requests (pass null to reset to default) |
| get_multiplex_provider | Get providers configured for multiplex dispatch |
| set_multiplex_provider | Configure providers for fan-out mode |
| prompt | Send chat messages to the active provider, or multiplex to all configured providers |
Client Configuration
Add to your MCP client config (e.g. Claude Desktop claude_desktop_config.json):
REST API Server (embacle-server)
A unified OpenAI-compatible HTTP server with built-in MCP support that proxies requests to embacle runners. Any client that speaks the OpenAI chat completions API or MCP protocol can use it without modification. Supports --transport stdio for MCP-only mode (editor integration).
Usage
# Start with default provider (copilot) on localhost:3000
# Specify provider and port
# MCP-only mode via stdio (for editor/client integration)
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat completion (streaming and non-streaming) |
| GET | /v1/models | List available providers and models |
| GET | /health | Per-provider readiness check |
| POST | /mcp | MCP Streamable HTTP (JSON-RPC 2.0) |
MCP Streamable HTTP
The server also speaks MCP at POST /mcp, accepting JSON-RPC 2.0 requests. Any MCP-compatible client can connect over HTTP instead of stdio.
| Tool | Description |
|---|---|
| prompt | Send chat messages to an LLM provider, with optional model routing (e.g. copilot:gpt-4o) |
| list_models | List available providers and the server's default |
# MCP initialize handshake
# Call the prompt tool
Add Accept: text/event-stream to receive SSE-wrapped responses instead of plain JSON.
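For readers unfamiliar with the wire format, this sketch builds the JSON-RPC 2.0 envelope that a client would POST to /mcp. The helper name is hypothetical, and whether list_models accepts an empty arguments object is an assumption; the `tools/call` method and its `name`/`arguments` parameters follow the MCP specification.

```rust
/// Hypothetical helper: formats a JSON-RPC 2.0 request body.
/// `params_json` must already be valid JSON; `method` must not need JSON escaping.
pub fn jsonrpc_request(id: u64, method: &str, params_json: Option<&str>) -> String {
    match params_json {
        Some(params) => format!(
            r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}","params":{params}}}"#
        ),
        None => format!(r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}"}}"#),
    }
}

/// Example body for calling the `list_models` tool (empty arguments assumed).
pub fn list_models_call(id: u64) -> String {
    jsonrpc_request(
        id,
        "tools/call",
        Some(r#"{"name":"list_models","arguments":{}}"#),
    )
}
```

A production client would build these bodies with a JSON library rather than string formatting; the sketch only shows the envelope shape.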
Model Routing
The model field determines which provider handles the request. Use a provider:model prefix to target a specific runner, or pass a bare model name to use the server's default provider.
# Explicit provider
# Default provider
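The routing rule can be sketched as a prefix split. This is an illustrative guess at the behavior, not the server's code: the `Route` type and function name are hypothetical, and the sketch assumes that routing prefixes match the provider identifiers listed for set_provider and that an unrecognized prefix is treated as part of a bare model name.

```rust
/// Provider identifiers taken from the `set_provider` tool description.
/// Assumption: the same identifiers are valid routing prefixes.
const KNOWN_PROVIDERS: &[&str] = &[
    "claude_code", "copilot", "copilot_headless", "cursor_agent", "opencode",
    "gemini_cli", "codex_cli", "goose_cli", "cline_cli", "continue_cli",
    "warp_cli", "kiro_cli", "kilo_cli",
];

/// Hypothetical result type: `provider` is `None` when the default provider applies.
#[derive(Debug, PartialEq)]
pub struct Route<'a> {
    pub provider: Option<&'a str>,
    pub model: &'a str,
}

/// Splits `provider:model`; anything without a recognized prefix is a bare model name.
pub fn parse_model_field(model: &str) -> Route<'_> {
    match model.split_once(':') {
        Some((prefix, rest)) if KNOWN_PROVIDERS.contains(&prefix) && !rest.is_empty() => Route {
            provider: Some(prefix),
            model: rest,
        },
        _ => Route { provider: None, model },
    }
}
```

Checking the prefix against a known list keeps model names that themselves contain a colon from being misrouted.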
Multiplex
Pass an array of models to fan out the same prompt to multiple providers concurrently. Each provider runs in its own task; failures in one don't affect others.
The response uses object: "chat.completion.multiplex" with per-provider results and timing.
Streaming is not supported for multiplex requests.
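The fan-out behavior can be pictured with the following sketch, which uses scoped OS threads as a stand-in for the server's tasks. All names here are hypothetical, and the `call` closure stands in for whatever actually invokes a provider; the point is that each provider's result and timing are collected independently, so one failure does not abort the rest.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical per-provider result with timing.
#[derive(Debug)]
pub struct ProviderResult {
    pub provider: String,
    pub outcome: Result<String, String>,
    pub elapsed: Duration,
}

/// Runs `call(provider, prompt)` for every provider concurrently, one thread each.
pub fn multiplex<F>(providers: &[&str], prompt: &str, call: F) -> Vec<ProviderResult>
where
    F: Fn(&str, &str) -> Result<String, String> + Send + Sync,
{
    let call = &call;
    thread::scope(|scope| {
        // Spawn everything first so the calls overlap in time.
        let handles: Vec<_> = providers
            .iter()
            .map(|&provider| {
                let handle = scope.spawn(move || {
                    let started = Instant::now();
                    let outcome = call(provider, prompt);
                    (outcome, started.elapsed())
                });
                (provider, handle)
            })
            .collect();

        // Join in input order; a panicking worker becomes an error entry.
        handles
            .into_iter()
            .map(|(provider, handle)| {
                let (outcome, elapsed) = handle.join().unwrap_or_else(|_| {
                    (Err("provider task panicked".to_string()), Duration::ZERO)
                });
                ProviderResult { provider: provider.to_string(), outcome, elapsed }
            })
            .collect()
    })
}
```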
SSE Streaming
Set "stream": true for Server-Sent Events output in OpenAI streaming format (data: {json}\n\n with data: [DONE] terminator).
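A small sketch of the framing described above, seen from both sides: writing one `data:` frame, and reading a buffered SSE body until the `[DONE]` terminator. The function names are hypothetical, and the payloads are treated as opaque JSON strings rather than parsed chunks.

```rust
/// Formats one SSE frame in the `data: {json}\n\n` shape.
pub fn sse_frame(json: &str) -> String {
    format!("data: {json}\n\n")
}

/// Collects the payloads from an SSE body, stopping at `data: [DONE]`.
/// Returns the payloads and whether the terminator was seen.
pub fn parse_sse_body(body: &str) -> (Vec<&str>, bool) {
    let mut payloads = Vec::new();
    for line in body.lines() {
        // Lines without a `data:` prefix (blank separators, comments) are skipped.
        if let Some(data) = line.strip_prefix("data:") {
            let data = data.trim();
            if data == "[DONE]" {
                return (payloads, true);
            }
            if !data.is_empty() {
                payloads.push(data);
            }
        }
    }
    (payloads, false)
}
```

A real client would parse incrementally as bytes arrive; this version assumes the whole body is already in memory.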
Authentication
Optional. Set EMBACLE_API_KEY to require bearer token auth on all endpoints. When unset, all requests are allowed through (localhost development mode). The env var is read per-request, so key rotation doesn't require a restart.
EMBACLE_API_KEY=my-secret
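The per-request check might look roughly like the sketch below. It is not the server's implementation: the function names are hypothetical, treating an empty key as unset is an assumption, and the plain string comparison is not constant-time.

```rust
/// Decides whether a request passes, given its `Authorization` header value
/// and the currently configured key (if any).
pub fn is_authorized(authorization_header: Option<&str>, expected_key: Option<&str>) -> bool {
    let expected = match expected_key {
        Some(key) if !key.is_empty() => key,
        // No key configured: every request is allowed (localhost development mode).
        _ => return true,
    };
    authorization_header
        .and_then(|value| value.strip_prefix("Bearer "))
        .map_or(false, |token| token == expected)
}

/// Reads EMBACLE_API_KEY on every call, so a rotated key takes effect without a restart.
pub fn is_authorized_from_env(authorization_header: Option<&str>) -> bool {
    let key = std::env::var("EMBACLE_API_KEY").ok();
    is_authorized(authorization_header, key.as_deref())
}
```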
Docker
Pull the image from GitHub Container Registry:
The image includes embacle-server and embacle-mcp with Node.js pre-installed for adding CLI backends.
Adding a CLI Backend
The base image doesn't include CLI tools. Install them in a derived image:
FROM ghcr.io/dravr-ai/embacle
USER root
RUN npm install -g @anthropic-ai/claude-code
USER embacle
Build and run:
Auth and Configuration
CLI tools store auth tokens in their config directories. Mount them from the host, or set provider-specific env vars:
# Mount Claude Code auth from host
# Or pass env vars if the CLI supports them
Running embacle-mcp
Override the entrypoint to run the MCP server instead:
C FFI Static Library (Swift / C Integration)
Embacle provides a C FFI static library (libembacle.a) that exposes copilot chat completion to Swift and C programs. The FFI surface is 4 functions: init, chat completion, free string, and shutdown.
Install via Homebrew
This builds from source (requires Rust) and installs libembacle.a and embacle.h to Homebrew's prefix.
Install via script
Build manually
# Output: target/release/libembacle.a
# Header: include/embacle.h
Swift / SPM usage
Add a systemLibrary target in your Package.swift with a modulemap that links embacle:
.systemLibrary(name: "CEmbacle")
With a module.modulemap:
module CEmbacle {
header "embacle.h"
link "embacle"
export *
}
The FFI accepts OpenAI-compatible JSON — the same format as the REST API:
;
char* response = ;
/* use response JSON... */
;
;
Vision payloads work via multipart content with image_url data URIs.
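The reason a "free string" function is part of the FFI surface is ownership: a string allocated on the Rust side has to be released by Rust. The sketch below shows that pattern with two hypothetical stand-in functions; they are not the symbols declared in embacle.h, and a real export would also carry a no-mangle attribute so C and Swift can link against it.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a chat-completion entry point: takes request JSON and
/// returns response JSON in a heap-allocated string that the caller must release.
pub extern "C" fn demo_chat_completion(request_json: *const c_char) -> *mut c_char {
    if request_json.is_null() {
        return std::ptr::null_mut();
    }
    // SAFETY: the caller promises a valid NUL-terminated string.
    let request = unsafe { CStr::from_ptr(request_json) }.to_string_lossy();
    // Placeholder response; a real implementation would run the completion here.
    let response = format!("{{\"echo_bytes\":{}}}", request.len());
    match CString::new(response) {
        // Ownership of the allocation passes to the caller.
        Ok(owned) => owned.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Hypothetical stand-in for the free-string function: reclaims a string
/// previously returned by `demo_chat_completion`.
pub extern "C" fn demo_free_string(ptr: *mut c_char) {
    if !ptr.is_null() {
        // SAFETY: `ptr` came from `CString::into_raw` and is released exactly once.
        drop(unsafe { CString::from_raw(ptr) });
    }
}
```

Calling the C library's `free` on such a pointer instead would be undefined behavior, which is why the release function is exported alongside the completion call.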
Architecture
Your Application
└── embacle (this library)
│
├── CLI Runners (subprocess per request)
│ ├── ClaudeCodeRunner → spawns `claude -p "prompt" --output-format json`
│ ├── CopilotRunner → spawns `copilot -p "prompt"`
│ ├── CursorAgentRunner → spawns `cursor-agent -p "prompt" --output-format json`
│ ├── OpenCodeRunner → spawns `opencode run "prompt" --format json`
│ ├── GeminiCliRunner → spawns `gemini -p "prompt" -o json -y`
│ ├── CodexCliRunner → spawns `codex exec "prompt" --json --full-auto`
│ ├── GooseCliRunner → spawns `goose run --quiet --no-session`
│ ├── ClineCliRunner → spawns `cline task --json --act --yolo`
│ ├── ContinueCliRunner → spawns `cn -p --format json`
│ ├── WarpCliRunner → spawns `oz agent run --prompt "..." --output-format json`
│ ├── KiroCliRunner → spawns `kiro-cli send "prompt"`
│ └── KiloCliRunner → spawns `kilo run --auto --format json`
│
├── HTTP API Runners (behind feature flag)
│ └── OpenAiApiRunner → reqwest to any OpenAI-compatible endpoint
│
├── ACP Runners (persistent connection, behind feature flag)
│ └── CopilotHeadlessRunner → NDJSON/JSON-RPC to `copilot --acp`
│
├── Provider Decorators (composable wrappers)
│ ├── FallbackProvider → ordered chain with retry and exponential backoff
│ ├── MetricsProvider → latency, token, and cost tracking
│ ├── QualityGateProvider → response validation with retry
│ ├── GuardrailProvider → pluggable pre/post request validation
│ └── CacheProvider → response caching with TTL and capacity
│
├── Agent Loop
│ └── AgentExecutor → multi-turn tool calling with configurable max turns
│
├── Structured Output
│ └── request_structured_output() → schema-validated JSON extraction with retry
│
├── MCP Tool Bridge
│ └── McpToolBridge → MCP tool definitions ↔ text-based tool loop
│
├── MCP Server (library + binary crate)
│ └── embacle-mcp → JSON-RPC 2.0 over stdio or HTTP/SSE
│
├── Unified REST API + MCP Server (binary crate)
│ └── embacle-server → OpenAI-compatible HTTP, MCP Streamable HTTP, SSE streaming, multiplex
│
└── Tool Simulation (text-based tool calling for CLI runners)
└── execute_with_text_tools() → catalog injection, XML parsing, tool loop
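To illustrate the "subprocess per request" model in the diagram, here is a std-only sketch that builds and runs the Claude Code invocation shown above. The function names are hypothetical, the call is blocking (the library itself is async), and response parsing is left out.

```rust
use std::io;
use std::process::Command;

/// Builds `claude -p "<prompt>" --output-format json`, as shown in the diagram.
pub fn claude_command(prompt: &str) -> Command {
    let mut command = Command::new("claude");
    // The prompt is passed as a single argument, so no shell quoting is involved.
    command.arg("-p").arg(prompt).args(["--output-format", "json"]);
    command
}

/// Runs the command and returns raw stdout; requires `claude` on PATH and authenticated.
pub fn run_claude(prompt: &str) -> io::Result<String> {
    let output = claude_command(prompt).output()?;
    if output.status.success() {
        Ok(String::from_utf8_lossy(&output.stdout).into_owned())
    } else {
        // Surface stderr so authentication or usage errors stay visible.
        Err(io::Error::new(
            io::ErrorKind::Other,
            String::from_utf8_lossy(&output.stderr).into_owned(),
        ))
    }
}
```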
All runners implement the same LlmProvider trait:
- complete(): single-shot completion
- complete_stream(): streaming completion
- health_check(): verify the runner is available and authenticated
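Because every runner shares one trait, decorators can wrap any runner uniformly. The sketch below shows the idea with a simplified, synchronous `Provider` trait and an ordered fallback chain. These are hypothetical types written for this example: the real LlmProvider trait is async with richer request and response types, and the crate's FallbackProvider also performs retry with exponential backoff, which is omitted here.

```rust
/// Simplified, synchronous stand-in for the shared provider trait.
pub trait Provider {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Tries each wrapped provider in order and returns the first success.
pub struct Fallback {
    providers: Vec<Box<dyn Provider>>,
}

impl Fallback {
    pub fn new(providers: Vec<Box<dyn Provider>>) -> Self {
        Self { providers }
    }
}

impl Provider for Fallback {
    fn name(&self) -> &str {
        "fallback"
    }

    fn complete(&self, prompt: &str) -> Result<String, String> {
        let mut errors = Vec::new();
        for provider in &self.providers {
            match provider.complete(prompt) {
                Ok(text) => return Ok(text),
                // Remember why this provider failed, then move on to the next one.
                Err(err) => errors.push(format!("{}: {err}", provider.name())),
            }
        }
        Err(format!("all providers failed: {}", errors.join("; ")))
    }
}

/// Test double that always returns the same canned reply.
pub struct Canned {
    pub name: &'static str,
    pub reply: Result<&'static str, &'static str>,
}

impl Provider for Canned {
    fn name(&self) -> &str {
        self.name
    }

    fn complete(&self, _prompt: &str) -> Result<String, String> {
        self.reply.map(|s| s.to_string()).map_err(|e| e.to_string())
    }
}
```

Since `Fallback` itself implements `Provider`, it can be nested inside other wrappers, which is what makes the decorators composable.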
For detailed API docs — fallback chains, structured output, agent loop, metrics, quality gates, tool simulation, and more — see docs.rs/embacle.
Tested With
Embacle has been tested with mirroir.dev, an MCP server for AI-powered iPhone automation.
License
Licensed under the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0).