Cersei
The complete Rust SDK for building coding agents.
Cersei gives you every building block of a production coding agent — tool execution, LLM streaming, sub-agent orchestration, persistent memory, skills, MCP integration — as composable library functions. Build a Claude Code replacement, embed an agent in your app, or create something entirely new.
use *;
async
MIT License | Built by Adib Mohsin | Docs | GitHub
Why Cersei
| Claude Code | OpenCode | Cersei SDK | Abstract CLI | |
|---|---|---|---|---|
| Form factor | CLI app | CLI app | Library | CLI app |
| Embeddable | No | No | Yes | No (uses SDK) |
| Provider | Anthropic only | Multi-provider | Multi-provider | Multi-provider |
| Language | TypeScript | TypeScript | Rust | Rust |
| Custom tools | Plugins | Plugins | impl Tool / #[derive(Tool)] |
Via SDK |
| Startup | ~269ms | ~300ms | N/A (library) | ~34ms |
| Binary / RSS | 174MB / 330MB | — | N/A | 5.8MB / 4.9MB |
| Memory | File-based | SQLite | File + Graph | File + Graph |
| Skills | .claude/commands/ |
.claude/skills/ |
Both formats | Both formats |
Cersei is built from the architecture of Claude Code (reverse-engineered Rust port) and designed so that anyone can build a complete, drop-in replacement for Claude Code, OpenCode, or any coding agent — as a library call.
Abstract — The CLI
Abstract is a complete CLI coding agent built on Cersei. One binary, zero runtime dependencies, graph memory by default.
# Install
# Use
Abstract vs Claude Code
All numbers from run_tool_bench.sh --full.
| Metric | Abstract | Claude Code | Winner |
|---|---|---|---|
| Startup (warm) | 32ms | 266ms | Abstract (8.2x) |
| Binary size | 6.0 MB | 174 MB | Abstract (29x) |
| Memory (RSS) | 4.9 MB | 333 MB | Abstract (68x) |
| Tool dispatch | 0.02-17ms | 5-265ms+ | Abstract |
| Memory recall | 98us (graph) | 7,545ms (LLM) | Abstract (77,000x) |
| Memory write | 30us (graph) | 20,687ms (agent) | Abstract (689,000x) |
| MEMORY.md load | 9.6us | 17.1ms | Abstract (1,781x) |
| Sequential throughput | 906ms/req | 12,079ms/req | Abstract (13.3x) |
| System prompt tokens | ~2,200 | ~8,000+ | Abstract (3.6x fewer) |
| LLM call for recall | Not needed | Required (Sonnet) | Abstract |
Claude Code's memory recall calls Sonnet every turn to rank the top 5 files by relevance (7.5s measured). Abstract's graph does indexed lookups in 98 microseconds — same capability, no LLM call, no API cost.
Full benchmark: crates/abstract-cli/benchmarks/REPORT.md
Features
- 34 built-in tools (file, shell, web, planning, orchestration, scheduling)
- Multi-provider: Anthropic + OpenAI (+ Ollama, Azure, vLLM)
- Graph memory (Grafeo) on by default
- Auto-compact, auto-dream, effort levels (Low/Medium/High/Max)
- MCP server support
- Session persistence (Claude Code-compatible JSONL)
- Interactive permissions with session caching
- 12 slash commands (
/help,/commit,/review,/memory,/model,/diff, etc.) - Streaming markdown rendering with syntax highlighting
- TOML config:
~/.abstract/config.toml+.abstract/config.toml - JSON output mode for piping (
--json)
Install
[]
= { = "https://github.com/pacifio/cersei" }
= { = "1", = ["full"] }
= "1"
For graph-backed memory (optional):
= { = "https://github.com/pacifio/cersei", = ["graph"] }
Architecture
cersei Facade crate — use cersei::prelude::*;
cersei-types Provider-agnostic messages, errors, stream events
cersei-provider Provider trait + Anthropic/OpenAI implementations
cersei-tools 30+ tools, permissions, bash classifier, skills, git utils
cersei-tools-derive #[derive(Tool)] proc macro
cersei-agent Agent builder, agentic loop, compact, coordinator, effort
cersei-memory Memory trait, memdir, CLAUDE.md, sessions, Grafeo graph
cersei-hooks Hook/middleware system
cersei-mcp MCP client (JSON-RPC 2.0, stdio transport)
abstract-cli CLI coding agent ("abstract") — REPL, commands, config, permissions
Core Concepts
Provider
Any LLM backend. Built-in: Anthropic (with OAuth), OpenAI (compatible with Ollama, Azure, vLLM).
builder.provider // Anthropic API key
builder.provider
builder.provider // impl Provider
Tools (30+)
Every tool a coding agent needs, organized into sets:
all // 30+ tools
coding // filesystem + shell + web
filesystem // Read, Write, Edit, Glob, Grep, NotebookEdit
shell // Bash, PowerShell
web // WebFetch, WebSearch
planning // EnterPlanMode, ExitPlanMode, TodoWrite
scheduling // CronCreate/List/Delete, Sleep, RemoteTrigger
orchestration // SendMessage, Tasks (6 tools), Worktree
Custom tools in 10 lines:
The
#[derive(Tool)]macro generates code with#[async_trait::async_trait]andcercei-tools, to make it work add both of it to depending on your project.= "0.1" = { = "path/to/cersei" } # or git = { = "path/to/cersei/crates/cersei-tools" }or write
use cersei::tools as cersei_tools;when usingderive(Tool);
;
Sub-Agent Orchestration
Spawn parallel workers, coordinate tasks, pass messages between agents:
// AgentTool — model spawns sub-agents autonomously
builder
.tool
// Coordinator mode — orchestrate parallel workers
builder
.tools // includes Agent, Tasks, SendMessage
// Workers get filtered tools (no Agent — prevents recursion)
// Task system
// TaskCreate → TaskUpdate → TaskGet → TaskList → TaskStop → TaskOutput
Memory (Three-Tier)
use MemoryManager;
let mm = new
.with_graph?; // optional graph layer
// Tier 1: Flat files (~/.claude/projects/<root>/memory/)
let metas = mm.scan; // scan .md files with frontmatter
let content = mm.build_context; // build system prompt injection
// Tier 2: CLAUDE.md hierarchy (managed > user > project > local)
// Automatically merged into build_context()
// Tier 3: Graph memory (Grafeo, optional)
let id = mm.store_memory?;
mm.tag_memory;
let results = mm.recall; // graph query with fallback to text match
// Session persistence (JSONL, append-only, tombstone soft-delete)
mm.write_user_message?;
let messages = mm.load_session_messages?;
Skills (Claude Code + OpenCode Compatible)
// Auto-discovers skills from:
// .claude/commands/*.md (Claude Code format)
// .claude/skills/*/SKILL.md (OpenCode format)
// ~/.claude/commands/*.md (user-level)
// Bundled skills (simplify, debug, commit, verify, stuck, remember, loop)
let skill_tool = new.with_project_root;
// skill="list" → lists all available skills
// skill="debug" args="tests are flaky" → expands $ARGUMENTS template
Realtime Events
Three observation mechanisms:
// 1. Callback
builder.on_event
// 2. Broadcast (multi-consumer)
let agent = builder.enable_broadcast.build?;
let mut rx = agent.subscribe.unwrap;
spawn;
// 3. Stream (bidirectional control)
let mut stream = agent.run_stream;
while let Some = stream.next.await
Context Management
builder
.auto_compact // summarize old messages at 90% context usage
.compact_threshold // trigger threshold
.tool_result_budget // truncate oldest tool results above 50K chars
.thinking_budget // extended thinking tokens
.effort // Low/Medium/High/Max
MCP (Model Context Protocol)
let mcp = connect.await?;
builder.tools
OAuth (Anthropic Native)
// Opens browser, PKCE flow, token storage, refresh
cargo run --example oauth_login
Agent Builder — Complete API
builder
// Provider (required)
.provider
// Tools
.tool
.tools
// Model & generation
.model
.max_turns
.max_tokens
.temperature
.thinking_budget
// Prompt
.system_prompt
.append_system_prompt
// Environment
.working_dir
.permission_policy // or AllowReadOnly, DenyAll, RuleBased, Interactive
// Memory
.memory
.session_id
// Hooks & events
.hook
.on_event
.enable_broadcast
.reporter
// Context management
.auto_compact
.compact_threshold
.tool_result_budget
// Execute
.build? // -> Agent
.run_with // -> AgentOutput (shorthand)
Benchmarks
Measured on Apple Silicon, release build, 100 iterations with 3 warmup runs.
Tool I/O
| Tool | Avg | Min | Max |
|---|---|---|---|
| Edit | 0.04ms | 0.02ms | 0.05ms |
| Glob | 0.05ms | 0.05ms | 0.07ms |
| Write | 0.09ms | 0.07ms | 0.11ms |
| Read | 0.09ms | 0.08ms | 0.11ms |
| Grep | 5.85ms | 5.34ms | 8.51ms |
| Bash | 15.64ms | 14.50ms | 16.19ms |
vs Claude Code CLI
Note: Cersei is a library — tool dispatch happens in-process. Claude Code is a CLI where each sub-agent fork pays full startup cost. These are different layers; the comparison below shows the gap between in-process dispatch and CLI process overhead.
| Metric | Cersei (SDK) | Claude Code (CLI) | Notes |
|---|---|---|---|
| Tool dispatch (Read) | 0.09ms | ~5-15ms (est.) | In-process vs Node.js fs |
| CLI startup | N/A (library) | 269ms | Claude --version warm avg |
| Sub-agent spawn | ~1ms (in-process) | ~300ms (fork) | Agent tool overhead |
For an apples-to-apples CLI comparison, see Abstract CLI benchmarks.
Memory I/O
| Operation | Abstract (Cersei) | Claude Code (measured) | Ratio |
|---|---|---|---|
| Scan 100 files | 1.2ms | 26.6ms (find) |
22x |
| Load MEMORY.md | 9.6μs | 17.1ms | 1,781x |
| Memory recall (graph) | 98μs | 7,545ms (LLM call) | 77,000x |
| Memory recall (text) | 1.3ms | 17.5ms (grep) |
13x |
| Session write | 27μs/entry | N/A | — |
| Session load (100) | 268μs | N/A | — |
| Graph store | 30μs/node | N/A (no graph) | — |
| Topic query | 77μs | N/A (no graph) | — |
Benchmark suites
Each bench lives in its own self-contained directory with its own runner and result schema. Add new benches as siblings.
| Suite | Path | What it measures | Runner |
|---|---|---|---|
| General-agent frameworks | bench/general-agents/ |
Per-agent memory, instantiation time, max concurrent agents — Cersei vs Agno / PydanticAI / LangGraph / CrewAI. | ./bench/general-agents/run.sh |
| Terminal Bench 2.0 | bench/term-bench/ |
End-to-end coding tasks inside Daytona sandboxes using the full abstract CLI (Linux x86_64 / arm64 binaries shipped in-tree). |
./bench/term-bench/run.sh |
| LongMemEval (long-term memory) | bench/long-mem/ |
Recall accuracy on the ICLR-25 LongMemEval 500-question benchmark — head-to-head vs Mastra / Zep / Supermemory with identical prompts and LLM-as-judge rubric. Four Cersei configs: full-context baseline, usearch-HNSW semantic, grafeo-graph substring, hybrid w/ LLM fact extraction + RRF fusion. | cargo run --release -p longmem-bench -- --dataset s --config all |
| Compression (real LLMs) | crates/cersei-agent/tests/e2e_openai_compression.rs |
Input-token savings from cersei-compression on OpenAI (gpt-4o-mini) and Gemini (gemini-2.5-flash). #[ignore], runs with real API keys. |
cargo test -p cersei-agent --test e2e_openai_compression -- --ignored --nocapture |
| SDK Tool I/O | examples/benchmark_io.rs |
In-process tool dispatch latency for Read / Write / Edit / Grep / Bash / Glob. | cargo run --example benchmark_io --release |
| SDK Memory I/O | crates/abstract-cli/examples/memory_bench.rs |
Graph-memory vs filesystem vs Claude Code-style paths. | cargo run -p abstract-cli --example memory_bench --release |
| vs Claude Code CLI | run_tool_bench_claude.sh · run_tool_bench_codex.sh |
CLI-vs-CLI startup, memory, and dispatch overhead. | ./run_tool_bench.sh --iterations 20 --full |
Run benchmarks
# Rust-side SDK benches (no external services)
# vs Claude Code / Codex CLIs
# Python-harness benches (uv-managed; each dir self-contained)
# LongMemEval memory benchmark (head-to-head vs Mastra / Zep / Supermemory)
OPENAI_API_KEY=sk-…
# Real-LLM compression savings (requires API keys)
OPENAI_API_KEY=sk-…
Stress Tests
Examples
| Example | Description |
|---|---|
simple_agent |
Minimal agent in 3 lines |
custom_tools |
Define and register custom tools |
streaming_events |
Real-time run_stream() with colored output |
multi_listener |
Broadcast channel with multiple consumers |
resumable_session |
Persist and resume with JsonlMemory |
custom_provider |
Echo provider + OpenAI-compatible endpoints |
hooks_middleware |
Cost guard + audit logger + tool blocker |
benchmark_io |
Full I/O benchmark suite |
usage_report |
Token/cost tracking and billing estimates |
coding_agent |
Build a Python todo CLI (end-to-end) |
oauth_login |
Anthropic OAuth PKCE login flow |
Test Suite
# Run all 160 unit tests
# Run with graph memory (requires grafeo)
# Run specific crate
160 unit tests | 262 stress checks | 0 failures | Zero I/O regression
Extension Points
| What | How | Example |
|---|---|---|
| Custom provider | impl Provider |
Local LLM, Azure, Bedrock |
| Custom tool | #[derive(Tool)] or impl Tool |
DB query, deploy, search |
| Custom permissions | impl PermissionPolicy |
RBAC, OAuth-scoped |
| Custom memory | impl Memory |
PostgreSQL, Redis, S3 |
| Custom hooks | impl Hook |
Cost gating, audit logging |
| Custom reporters | impl Reporter |
Dashboards, WebSocket relay |
| MCP servers | McpServerConfig via builder |
Any MCP-compatible server |
| Skills | .claude/commands/*.md |
Custom prompt templates |
| Graph memory | features = ["graph"] |
Grafeo relationship tracking |
Documentation
cersei.pacifio.dev/docs — full docs with API reference, architecture, cookbooks, benchmarks, and llms.txt support.
| Section | Content |
|---|---|
| Quick Start | First agent in 10 lines |
| API Reference | Agent, Provider, Tools, Memory, Hooks, MCP |
| Architecture | Crate map, data flow, design principles |
| Cookbooks | Custom tools, deployment, embedding |
| Abstract CLI | Reference CLI built on Cersei |
| Benchmarks | vs Claude Code vs Codex |
License
MIT License
Copyright (c) 2025 Adib Mohsin
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.