The idea
Your agent writes code. Nobody checks its homework.
Decapod is the invisible layer between your agents and your code—humans never see it. It forces agents to answer three questions before touching a single line: What did the human actually ask for? What boundaries apply? How will we prove it's done? The answers become cryptographically verifiable artifacts, so the code that lands in your repo is provably what was intended.
It ships with an embedded constitution: governance docs agents receive as just-in-time context so they query the rules on demand instead of guessing them.
Decapod doesn't replace your agent. It doesn't replace your workflow. Humans never interact with it—agents call it on demand, it enforces the rules, and exits. Two commands to adopt. Zero config to maintain.
When it clicks
"The spec was vibes." Your agent asks Decapod what the user actually meant. Decapod forces intent to crystallize — constraints, boundaries, acceptance criteria — before a single line is generated. The agent stops hallucinating requirements.
"Three agents, one repo, total chaos." Decapod coordinates shared state across parallel runs. No silent overwrites. No drift. Each agent gets an isolated workspace with a provenance trail.
"It passes CI but is it done?" Decapod gates completion on proof artifacts, not narrative claims. VERIFIED means every gate in the proof plan actually passed — not "the agent said it looks good."
Related: Evaluating AGENTS.md (ETH SRI, 2026) on context-file quality and agent cost/performance.
Get running
Faster install (recommended)
For significantly faster installation (~30 seconds vs ~5 minutes), use cargo-binstall:
That's it. Keep using Claude Code, Codex, Gemini CLI, Cursor — whatever you already use. Decapod gets called by your agent automatically when control-plane decisions are needed. Your workflow doesn't change; the agent just gets smarter about when to stop and think.
What lands in your repo
.decapod/
  config.toml                        # project configuration
  data/                              # durable state (governance, memory, traces)
  generated/
    specs/                           # intent, architecture, validation specs
    artifacts/                       # proof artifacts, internalizations, provenance
    sessions/                        # per-session provenance logs
AGENTS.md                            # universal agent contract
CLAUDE.md / CODEX.md / GEMINI.md     # tool-specific entrypoints
Every artifact lives as plain text in the repository. No external databases, no dashboards—the filesystem is the system of record.
How to know it's working
- Ask your agent to make a real change. Watch .decapod/generated/ populate with new specs and proof artifacts.
- Ask your agent to validate the work. It will report typed pass/fail gates, not "looks good to me."
- Ask the agent "what did Decapod change about your plan?" — it should cite spec and proof steps, not vibes.
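If you would rather check the first item by script than by eye, one option is to snapshot the file list under .decapod/generated/ before and after a change and compare the two. The helper below is a minimal sketch using only the Rust standard library; `list_files` is a hypothetical name, not part of Decapod, and it reads the directory directly purely for observation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively collect every file under `root`, as sorted paths relative to `root`.
/// Hypothetical helper for inspection only; Decapod does not ship this function.
pub fn list_files(root: &Path) -> io::Result<Vec<String>> {
    let mut out = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                // Descend into subdirectories such as specs/ or artifacts/.
                stack.push(path);
            } else if let Ok(rel) = path.strip_prefix(root) {
                // Normalize separators so output is comparable across platforms.
                out.push(rel.to_string_lossy().replace('\\', "/"));
            }
        }
    }
    out.sort();
    Ok(out)
}
```

Call it once before asking the agent for a change and once after; any path present only in the second list is something the run produced.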
Agent integration: AGENTS.md and tool-specific entrypoints (CLAUDE.md, CODEX.md, GEMINI.md) define the full operational contract your agent follows.
Override any constitution default with plain English in .decapod/OVERRIDE.md. Learn more about the embedded constitution.
Why this exists
Coding agents suck. But it's not their fault.
You can't solve the world inside the agent. Like any serious technology, agents need infrastructure — a way to interface with the host machine (files, repos, terminals, policies) in a way that's intelligent, bounded, and provable.
The Unix philosophy ("do one thing well") breaks down the moment the "one thing" becomes: reason over ambiguous intent, plan work, write code, validate it, manage state, coordinate tools, and ship safely. We expect agents to generate great code. They mostly can. But the gaps aren't something you patch by making the agent fatter. The gaps exist because the agent isn't the right place for control-plane responsibilities.
Right now, agent makers keep stuffing more into the agent: task management, memory, rules, planning, codegen, toolchains, browsers — until it's mediocre at everything. Agents shouldn't be responsible for control-plane work. They shouldn't be your TODO database. They shouldn't be the place you encode a team's behavioral expectations. They shouldn't be the system of record for "what got done" or "what's allowed." That belongs in infrastructure.
Decapod is a repo-native governance kernel that agents call into — like a device driver for agent work. It makes intent explicit, boundaries explicit, and completion provable. The agent stays the brain. Decapod becomes the control plane that turns agent output into something shippable.
State is local and durable in .decapod/. Context, decisions, and traces persist across sessions and stay retrievable over time. Nothing hides. Nothing phones home.
How it works
Every Decapod operation returns one of three things:
| Signal | What it does | Think of it as |
|---|---|---|
| Advisory | Tightens intent, reduces wasted loops | Guardrails |
| Interlock | Hard policy boundary — blocks unsafe flow | Circuit breaker |
| Attestation | Structured proof that criteria actually passed | Receipt |
Human Intent
|
v
AI Agent(s) <----> Decapod <----> Repository + Policy
| | |
| | +-- Interlock (enforced boundaries)
| +----- Advisory (guided execution)
+-------- Attestation (verifiable outcomes)
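One way to picture the three signals is as a sum type that the calling agent matches on. This is an illustrative sketch only: the `Signal` enum, its fields, and its methods are assumptions made for this example and are not taken from Decapod's source.

```rust
/// Hypothetical model of the three signal kinds described above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Signal {
    /// Non-blocking guidance that tightens intent.
    Advisory { hint: String },
    /// Hard policy boundary: the operation must not proceed.
    Interlock { policy: String, reason: String },
    /// Structured proof that the named criteria passed.
    Attestation { criteria: Vec<String> },
}

impl Signal {
    /// Only an interlock stops the agent; the other two let it continue.
    pub fn blocks(&self) -> bool {
        matches!(self, Signal::Interlock { .. })
    }

    /// One-line summary an agent could log or show to a reviewer.
    pub fn summary(&self) -> String {
        match self {
            Signal::Advisory { hint } => format!("advisory: {hint}"),
            Signal::Interlock { policy, reason } => {
                format!("interlock [{policy}]: {reason}")
            }
            Signal::Attestation { criteria } => {
                format!("attestation: {} criteria passed", criteria.len())
            }
        }
    }
}
```

Modeling the signals as an enum means the compiler rejects any caller that handles advisories but forgets the interlock case.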
What you get
- Daemonless. No background process. The binary starts, does its job, exits.
- Two-command install. Install and init. Done.
- Agent-agnostic. Works with Claude, Codex, Gemini, Cursor, and anything else that can shell out.
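- Parallel-safe. Each agent works in an isolated workspace, so multiple agents can share one repo without overwriting each other. Broader multi-agent coordination is still only partially enforced.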
- Proof-gated completion. VERIFIED requires passing proof-plan results, not narrative.
- Fully auditable. Every decision, trace, and proof artifact lives in .decapod/ as plain files.
- Context internalization. Turn long documents into mountable, verifiable context adapters with explicit source hashes, determinism labels, session-scoped attach leases, and explicit detach so agents stop re-ingesting the same 50-page spec every session.
The deep surface area — interfaces, capsules, eval kernel, knowledge promotions, obligation graphs — lives in the embedded constitution. Ask your agent to explore it.
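To make the context internalization item more concrete, here is a rough sketch of the idea: record a hash of the source document, check it before reuse, and let only one session hold the attachment until it detaches. Everything here is assumed for illustration. `ContextAdapter` and its methods are hypothetical names, and FNV-1a is a non-cryptographic stand-in chosen so the example needs no external crates; a real tool would presumably use a cryptographic hash.

```rust
/// FNV-1a, 64-bit. Stand-in hash for this sketch only (not cryptographic).
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h
}

/// Hypothetical adapter: a digest of a long document tied to its source hash.
pub struct ContextAdapter {
    pub source_hash: u64,
    pub summary: String,
    attached_to: Option<String>, // session id currently holding the lease
}

impl ContextAdapter {
    pub fn internalize(source: &str, summary: &str) -> Self {
        ContextAdapter {
            source_hash: fnv1a64(source.as_bytes()),
            summary: summary.to_string(),
            attached_to: None,
        }
    }

    /// True if `source` still matches the document this adapter was built from.
    pub fn is_fresh(&self, source: &str) -> bool {
        fnv1a64(source.as_bytes()) == self.source_hash
    }

    /// Grant the lease to `session` unless another session already holds it.
    pub fn attach(&mut self, session: &str) -> Result<(), String> {
        if let Some(holder) = &self.attached_to {
            if holder.as_str() != session {
                return Err(format!("already attached to session {holder}"));
            }
        }
        self.attached_to = Some(session.to_string());
        Ok(())
    }

    /// Release the lease; returns false if `session` was not the holder.
    pub fn detach(&mut self, session: &str) -> bool {
        if self.attached_to.as_deref() == Some(session) {
            self.attached_to = None;
            true
        } else {
            false
        }
    }
}
```

The point of the hash check is that a stale adapter is detected as soon as the source document changes, so the agent re-ingests only when it has to.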
What Decapod Guarantees
These are the things Decapod actually enforces — break any of these and decapod validate will fail:
| Guarantee | Description | Enforcement |
|---|---|---|
| Daemonless | No background process. Invoked on-demand, exits when done. | tests/daemonless_lifecycle.rs |
| Repo-native | All state lives in .decapod/ as plain files. Nothing phones home. | File-based storage in src/core/store.rs |
| Proof-gated completion | VERIFIED requires passing proof-plan gates, not narrative claims. | WorkUnit status machine + tests/workunit_publish_gate.rs |
| Workspace isolation | Agents cannot mutate protected branches directly. Must use isolated worktrees. | Git worktree enforcement + tests/workspace_interlock.rs |
| Bounded validation | decapod validate terminates in bounded time, never hangs. | tests/validate_termination.rs + timeout enforcement |
| Store boundary | Agents must use CLI, not direct file access to .decapod/*. | Validation gates + broker |
| Session required | Mutations require active session with credentials. | Session auth on mutation commands |
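The proof-gated completion row is the easiest of these to picture in code. Below is a small sketch of a status machine in which a work unit can be claimed done at any time but reaches Verified only when every gate in its plan has passed. The type and method names are invented for this example; the actual WorkUnit implementation may look quite different.

```rust
/// Hypothetical gate outcome; not Decapod's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateResult { Pass, Fail, NotRun }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { InProgress, Claimed, Verified }

pub struct WorkUnit {
    pub status: Status,
    pub gates: Vec<(String, GateResult)>,
}

impl WorkUnit {
    /// Start with every gate in the proof plan marked as not yet run.
    pub fn new(gate_names: &[&str]) -> Self {
        WorkUnit {
            status: Status::InProgress,
            gates: gate_names
                .iter()
                .map(|n| (n.to_string(), GateResult::NotRun))
                .collect(),
        }
    }

    /// A narrative claim ("looks done") changes the status but proves nothing.
    pub fn claim_done(&mut self) {
        self.status = Status::Claimed;
    }

    /// Record the outcome of one named gate; unknown names are ignored.
    pub fn record(&mut self, gate: &str, result: GateResult) {
        if let Some(entry) = self.gates.iter_mut().find(|(n, _)| n.as_str() == gate) {
            entry.1 = result;
        }
    }

    /// Promote to Verified only if the plan is non-empty and every gate passed.
    /// On refusal, return the names of the gates that block promotion.
    pub fn try_verify(&mut self) -> Result<(), Vec<String>> {
        let blocking: Vec<String> = self
            .gates
            .iter()
            .filter(|(_, r)| *r != GateResult::Pass)
            .map(|(n, _)| n.clone())
            .collect();
        if self.gates.is_empty() || !blocking.is_empty() {
            return Err(blocking);
        }
        self.status = Status::Verified;
        Ok(())
    }
}
```

Treating an empty plan as unverifiable is a deliberate choice in this sketch: with no gates there is no proof, so there is nothing to attest.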
These are aspirational (we're working on them):
- Parallel-safe multi-agent coordination (partially enforced via workspace isolation)
- Context capsule deterministic output (partially enforced)
See .decapod/contracts/README_CONTRACTS.json for the full contract map and enforcement links.
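For a sense of what bounded validation can mean in practice, the sketch below runs a single check on a worker thread and waits a fixed time for its verdict, using only the Rust standard library. This is one common pattern, offered as an assumption about the general approach; `run_bounded` and `Outcome` are hypothetical names and do not describe how decapod validate is actually implemented.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome { Pass, Fail, TimedOut }

/// Run `check` on a worker thread and wait at most `limit` for its verdict.
/// Hypothetical helper; the caller always gets an answer within `limit`.
pub fn run_bounded<F>(check: F, limit: Duration) -> Outcome
where
    F: FnOnce() -> bool + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(check());
    });
    match rx.recv_timeout(limit) {
        Ok(true) => Outcome::Pass,
        Ok(false) => Outcome::Fail,
        Err(RecvTimeoutError::Timeout) => Outcome::TimedOut,
        // The worker dropped the sender without answering (it panicked).
        Err(RecvTimeoutError::Disconnected) => Outcome::Fail,
    }
}
```

Note that the worker thread is not cancelled on timeout; it is simply abandoned. That is acceptable for a short-lived process that exits right after reporting, which fits the daemonless model described above.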
Contributing
Docs
- CONTRIBUTING.md — development guide
- SECURITY.md — security policy
- CHANGELOG.md — release history
Support
License
MIT. See LICENSE.