zeph 0.14.1 - Docs.rs

zeph-0.14.1 is not a library.

The AI agent that respects your resources.

Single binary. Minimal hardware. Maximum context efficiency.

Zeph is a Rust AI agent built around one principle: every token in the context window must earn its place. Skills are retrieved semantically, tool output is filtered before injection, and the context compacts automatically under pressure — keeping costs low and responses fast on hardware you already own.

curl -fsSL https://github.com/bug-ops/zeph/releases/latest/download/install.sh | sh
zeph init   # interactive setup wizard
zeph        # start the agent

[!TIP] cargo install zeph also works. Pre-built binaries and Docker images are on the releases page.

What's inside

Feature	Description
Hybrid inference	Ollama, Claude, OpenAI, any OpenAI-compatible API, or fully local via Candle (GGUF). Multi-model orchestrator with fallback chains, EMA latency routing, and adaptive Thompson Sampling for exploration/exploitation-balanced model selection. → Providers
Skills-first architecture	YAML+Markdown skill files with BM25+cosine hybrid retrieval. Bayesian re-ranking, 4-tier trust model, and self-learning evolution — skills improve from real usage. Agent-as-a-Judge feedback detection with adaptive regex/LLM hybrid analysis. The `load_skill` tool lets the LLM fetch the full body of any skill outside the active TOP-N set on demand. → Skills · → Self-learning
Context engineering	Semantic skill selection, command-aware output filters, tool-pair summarization with deferred application (pre-computed eagerly, applied lazily to stabilize the Claude API prompt cache prefix), proactive context compression (reactive + proactive strategies), and reactive middle-out compaction keep the window efficient under any load. Three-tier compaction pipeline: deferred summary application at 70% context usage → pruning at 80% → LLM compaction on overflow. `--debug-dump [PATH]` writes every LLM request, response, and raw tool output to numbered files for context debugging. → Context · → Debug Dump
Semantic memory	SQLite + Qdrant with MMR re-ranking, temporal decay, query-aware memory routing (keyword/semantic/hybrid), cross-session recall, implicit correction detection, and credential scrubbing. Optional graph memory adds entity-relationship tracking with FTS5-accelerated entity search, BFS traversal for multi-hop reasoning, temporal fact tracking, and embedding-based entity resolution for semantic deduplication. Background LLM extraction runs fire-and-forget on each turn; graph facts are injected into the context window alongside semantic recall. → Memory · → Graph Memory
IDE integration (ACP)	Stdio, HTTP+SSE, or WebSocket transport. Session modes, live tool streaming, LSP diagnostics injection, file following, usage reporting. Works in Zed, Helix, VS Code. → ACP
Multi-channel I/O	CLI, Telegram, TUI dashboard — all with streaming. Voice and vision input supported. → Channels
MCP & A2A	MCP client with full tool exposure to the model. Configure mcpls as an MCP server for compiler-level code intelligence: hover, definition, references, diagnostics, call hierarchy, and safe rename via rust-analyzer, pyright, gopls, and 30+ other LSP servers. A2A agent-to-agent protocol for multi-agent orchestration. → MCP · → LSP · → A2A
Sub-agents	Spawn isolated agents with scoped tools, skills, and zero-trust secret delegation — defined as Markdown files. 4-level resolution priority (CLI > project > user > config), `permission_mode` (`default`/`accept_edits`/`dont_ask`/`bypass_permissions`/`plan`), fine-grained `tools.except` denylists, `background` fire-and-forget execution, `max_turns` limits, persistent memory scopes (`user`/`project`/`local`) with MEMORY.md injection, persistent JSONL transcript storage with `/agent resume` for continuing completed sessions, and lifecycle hooks (`SubagentStart`/`SubagentStop` at config level, `PreToolUse`/`PostToolUse` per agent with pipe-separated matchers). Manage definitions with `zeph agents list
Instruction files	Drop `zeph.md` (or `CLAUDE.md` / `AGENTS.md`) in your project root. Zeph auto-detects and injects them into every system prompt — project rules, conventions, and domain knowledge applied automatically. Changes are picked up live via filesystem watching (500 ms debounce) — no restart required. → Instruction Files
Defense-in-depth	Shell sandbox, SSRF protection, skill trust quarantine, secret zeroization, audit logging, `unsafe_code = "deny"` workspace-wide. Untrusted content isolation: all tool results, web scrape output, MCP responses, A2A messages, and memory retrieval pass through a `ContentSanitizer` pipeline that truncates, strips control characters, detects 17 injection patterns, and wraps content in spotlighting XML delimiters before it enters the LLM context. Optional quarantined summarizer (Dual LLM pattern) routes high-risk sources through an isolated, tool-less LLM extraction call for defense-in-depth against indirect prompt injection. Exfiltration guards block markdown image pixel-tracking, validate tool call URLs against flagged untrusted sources, and suppress memory writes for injection-flagged content. TUI security panel with real-time event feed, SEC status bar indicator, and `security:events` command palette entry. → Security · → Untrusted Content Isolation
Document RAG	`zeph ingest <path>` indexes `.txt`, `.md`, `.pdf` into Qdrant. Relevant chunks surface automatically on each turn. → Document loaders
Task orchestration	DAG-based task graphs with dependency tracking, parallel execution, configurable failure strategies (abort/retry/skip/ask), timeout enforcement, and SQLite persistence. LLM-powered goal decomposition via `Planner` trait with structured output. Tick-based `DagScheduler` execution engine with command pattern, `AgentRouter` trait for task-to-agent routing, cross-task context injection with `ContentSanitizer` integration, and stale event guards. LLM-backed result aggregation (`Aggregator` trait, `LlmAggregator`) synthesizes completed task outputs into a single coherent response with per-task token budgeting and raw-concatenation fallback. `/plan` CLI commands: `/plan <goal>` (decompose + confirm + execute), `/plan status`, `/plan list`, `/plan cancel`, `/plan confirm`, `/plan resume [id]` (resume a paused Ask-strategy graph), `/plan retry [id]` (re-run failed tasks). TUI Plan View side panel (press `p`) shows live task status with per-row spinners and status colors; five `plan:*` command palette entries available. `OrchestrationMetrics` tracked in `MetricsSnapshot`. Enabled by the `orchestration` feature flag (included in `full`). → Orchestration
Daemon & scheduler	HTTP webhook gateway with bearer auth. Cron-based periodic tasks and one-shot deferred tasks with SQLite persistence — add, update, or cancel tasks at runtime via natural language using the built-in `scheduler` skill. Background mode. → Daemon
Single binary	~15 MB, no runtime dependencies, ~50 ms startup, ~20 MB idle memory.

┌─ Skills (3/12) ────────────────────┐┌─ MCP Tools ─────────────────────────┐
│  web-search  [████████░░] 82% (117)││  - filesystem/read_file             │
│  git-commit  [███████░░░] 73%  (42)││  - filesystem/write_file            │
│  code-review [████░░░░░░] 41%   (8)││  - github/create_pr                 │
└────────────────────────────────────┘└─────────────────────────────────────┘

Documentation

Full documentation — installation, configuration, guides, and architecture reference — at bug-ops.github.io/zeph.

Contributing

See CONTRIBUTING.md. Found a vulnerability? Use GitHub Security Advisories.

License

MIT