Demo video: View on GitHub (12MB)
What This Is
Decapod is agent infrastructure: a governance runtime for autonomous software agents.
Just as Docker is a runtime for containers, Decapod is a runtime for agents. It gives autonomy a place to live that isn’t your chat window: persistent state, binding methodology, proof gates, and coordination primitives, so agent work becomes shippable, not scary.
Agents can write code. But they can’t reliably ship because they:
- forget what they built yesterday (no persistence)
- treat best practices as vibes (no enforcement)
- say “done” without evidence (no proof gates)
- trip over each other in parallel (no coordination)
You set Decapod up once (decapod init), then agents operate inside the governed environment. You don’t touch the internals—just like you don’t touch individual neurons.
What This Is Not
Decapod is not:
- a prompt pack
- an agent framework/library
- a hosted SaaS “agent platform”
- a review bot that just comments on PRs
- a tool you manually operate (it’s agent infrastructure)
Decapod is an environment: the place where agent work becomes enforceable.
Getting Started
Run decapod init once in your project. From that point on, agents operate inside the governed environment. You observe outcomes, review summaries, and merge when proofs pass.
Who This Is For
✅ You’re shipping production code with AI agents
✅ You want discipline enforced by the environment
✅ You want parallel agents without turning the repo into lore
✅ You merge to main (you’re not just demoing prompts)
✅ You want an AI companion for building premium software
✅ You want “AI vibes” with guardrails and customizable enforceable workflows
The Difference
Without Decapod:
You: “Add OAuth to the login flow”
Agent: Writes 500 lines across 8 files
You: Review everything manually
You: Find broken tests, ignored conventions, missing error paths
Agent: Forgets context when you ask for fixes
With Decapod:
You: “Add OAuth to the login flow”
Agent: Checks recorded conventions and constraints
Agent: Produces tracked work, records decisions
Agent: Runs proof gates, fixes failures, re-validates
Agent: Marks work done with an auditable trail
You: Review summary and merge
Security
Decapod is designed with security at the foundation. See SECURITY.md for:
- Credential architecture and lifecycle management
- Agent identity and session security
- Supply chain integrity
- Incident response philosophy
TL;DR: Agents must handle credentials securely—never log, never commit, always rotate. Violations are constitutional breaches.
How It Works
1) Persistent State (Memory That Survives)
Agents persist work to .decapod/: todos, conventions, decisions, proof events—durable state that survives sessions and model switches.
You get continuity without re-explaining. Agents get a real memory substrate instead of fragile chat history.
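To make the idea of state that outlives a session concrete, here is a minimal sketch. It assumes a single JSON file as the store, which is a simplification for illustration only; the class name StateStore, the file name state.json, and the record layout are all hypothetical and are not Decapod's actual storage format or API.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class StateStore:
    """Hypothetical durable store: one JSON file holding named records."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load whatever a previous session left behind, if anything.
        self.records = json.loads(path.read_text()) if path.exists() else {}

    def put(self, kind: str, key: str, value: str) -> None:
        self.records.setdefault(kind, {})[key] = value
        # Write on every mutation so nothing lives only in process memory.
        self.path.write_text(json.dumps(self.records, indent=2))

    def get(self, kind: str, key: str) -> Optional[str]:
        return self.records.get(kind, {}).get(key)


# A throwaway directory stands in for the project's state directory.
state_file = Path(tempfile.mkdtemp()) / "state.json"

# Session 1: an agent records a convention, then the session ends.
session_one = StateStore(state_file)
session_one.put("conventions", "tagging", "use SemVer tags")
del session_one

# Session 2: a different agent (or model) starts cold and reads it back.
session_two = StateStore(state_file)
recalled = session_two.get("conventions", "tagging")
print(recalled)
```

The second session never saw the first one's conversation. It recovers the convention purely from what was written to disk, which is the property the section describes.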
2) Enforced Methodology (Constitution as Code)
Decapod ships an embedded constitution: binding contracts for how agents must operate (intent-first flow, authority chains, proof doctrine, store separation, etc.).
Generated entrypoints (CLAUDE.md, AGENTS.md, GEMINI.md) require agents to:
- read the constitution before acting
- use the control surface for state mutation (no internal access)
- follow Intent → Architecture → Implementation → Proof
- pass validation gates before claiming “done”
Projects override behavior via .decapod/OVERRIDE.md without forking the constitution.
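The Intent → Architecture → Implementation → Proof ordering can be pictured as a small state machine that refuses to skip steps. The sketch below is illustrative only: Phase, WorkItem, and OutOfOrder are hypothetical names, and how Decapod actually enforces the flow is not shown in this document.

```python
from enum import IntEnum


class Phase(IntEnum):
    INTENT = 1
    ARCHITECTURE = 2
    IMPLEMENTATION = 3
    PROOF = 4


class OutOfOrder(Exception):
    """Raised when a phase is attempted before its predecessor is complete."""


class WorkItem:
    def __init__(self, title):
        self.title = title
        self.completed = []  # phases finished so far, in order

    @property
    def done(self):
        return len(self.completed) == len(Phase)

    def complete(self, phase):
        if self.done:
            raise OutOfOrder("all phases already complete")
        # The only acceptable phase is the one right after the last completed.
        expected = Phase(len(self.completed) + 1)
        if phase != expected:
            raise OutOfOrder(f"expected {expected.name}, got {phase.name}")
        self.completed.append(phase)


item = WorkItem("Add OAuth to the login flow")
item.complete(Phase.INTENT)
try:
    item.complete(Phase.IMPLEMENTATION)  # skips ARCHITECTURE
except OutOfOrder as err:
    rejected = str(err)
print(rejected)
```

The skipped step fails loudly instead of being silently accepted, so the work item cannot reach a finished state without passing through every phase.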
3) Proof Gates (Validation Before Promotion)
Promotion isn’t a vibe. It’s a check that can fail.
Agents must satisfy proof gates before completion is credible. If validation fails, the work isn’t done—no matter how confident the summary sounds. Evidence required, not assertions.
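A rough sketch of what "a check that can fail" might look like follows. It assumes a gate is any callable returning a pass/fail flag plus a detail string; run_gates, can_promote, and the gate names are hypothetical, and real gates would invoke a test runner or linter rather than the stand-ins used here.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    name: str
    passed: bool
    detail: str


def run_gates(gates):
    """Run every gate and keep the evidence, pass or fail."""
    results = []
    for name, gate in gates.items():
        passed, detail = gate()
        results.append(GateResult(name, passed, detail))
    return results


def can_promote(results):
    # No evidence is not a pass: at least one gate must have run,
    # and every gate that ran must have passed.
    return bool(results) and all(r.passed for r in results)


# Stand-in for the state of the working tree (hypothetical).
status = {"tests_fixed": False}


def tests_gate():
    if status["tests_fixed"]:
        return True, "all tests passed"
    return False, "1 test failing"


gates = {"tests": tests_gate, "lint": lambda: (True, "clean")}

first = run_gates(gates)
print(can_promote(first))   # the failing test blocks promotion

status["tests_fixed"] = True  # the agent fixes the failure
second = run_gates(gates)
print(can_promote(second))  # re-validation now succeeds
```

The agent's own summary never enters the decision. Promotion depends only on recorded gate results, and the results of the failed first run remain available as evidence.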
4) Coordination Primitives (So Parallel Doesn’t Mean Chaos)
Decapod standardizes the surfaces agents use to collaborate:
- a shared backlog with audit trail
- shared conventions and preferences
- shared rationale (decisions, constraints, invariants)
- a proof ledger (what passed, what failed, when, and why)
- policy boundaries (trust tiers, risk zones)
- (planned) safe multi-writer state via a DB broker
Multiple agents can work in parallel without collisions, duplicate effort, or lost context.
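One way to picture collision-free claiming with an audit trail is a conditional update: an item can be claimed only while it has no owner, and every attempt is logged. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and the claim function are assumptions made for illustration; they are not Decapod's schema, and this single-connection example does not stand in for the planned DB broker.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE backlog (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        owner TEXT
    );
    CREATE TABLE audit (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        item_id INTEGER NOT NULL,
        agent TEXT NOT NULL,
        event TEXT NOT NULL
    );
    """
)
db.executemany(
    "INSERT INTO backlog (id, title) VALUES (?, ?)",
    [(1, "OAuth callback handler"), (2, "Token refresh")],
)


def claim(agent, item_id):
    """Claim an item only if nobody owns it; log the attempt either way."""
    with db:  # one transaction: the claim and its audit row commit together
        cur = db.execute(
            "UPDATE backlog SET owner = ? WHERE id = ? AND owner IS NULL",
            (agent, item_id),
        )
        won = cur.rowcount == 1
        db.execute(
            "INSERT INTO audit (item_id, agent, event) VALUES (?, ?, ?)",
            (item_id, agent, "claimed" if won else "claim_rejected"),
        )
    return won


first_claim = claim("agent-a", 1)   # succeeds: item 1 was unowned
second_claim = claim("agent-b", 1)  # rejected: agent-a already owns it
third_claim = claim("agent-b", 2)   # succeeds: agent-b takes other work

owners = dict(db.execute("SELECT id, owner FROM backlog ORDER BY id"))
events = [row[0] for row in db.execute("SELECT event FROM audit ORDER BY seq")]
print(owners)
print(events)
```

Because the ownership check and the write happen in one statement, two agents cannot both end up owning the same item, and the audit table records the rejected attempt alongside the successful ones.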
Architecture
┌──────────────────────────────────────────┐
│  Agent Entrypoints (CLAUDE.md, etc)      │  ← Generated by init
├──────────────────────────────────────────┤
│  Control Surface (stable interface)      │  ← Agents interact here
├──────────────────────────────────────────┤
│  Subsystems (plugin-grade surfaces)      │  ← Domain logic
├──────────────────────────────────────────┤
│  Governance Core (validate + doctrine)   │  ← Enforcement layer
├──────────────────────────────────────────┤
│  State Layer (SQLite + event logs)       │  ← Persistence + audit
├──────────────────────────────────────────┤
│  Embedded Constitution (methodology)     │  ← Contracts, not tips
└──────────────────────────────────────────┘
Storage:
<your-project>/
└── .decapod/
    ├── data/         # State (agents write via control surface) !! DO NOT TOUCH
    ├── generated/    # Entrypoints + derived files (auto-managed) !! DO NOT TOUCH
    └── OVERRIDE.md   # Edit this file to manually override any constitution contract layer.
You don’t touch .decapod/data/ directly. Agents use the control surface. Like neurons—they’re there, they work, you don’t manipulate them individually.
Real-World Scenarios
Scenario 1: Preference Memory
You tell an agent once: “Always use SEMVER for tagging git commits, and set that value in the Cargo.toml and Cargo.lock before pushing.”
That preference becomes durable state. Every future agent session in this project can check it and will use it. You never explain again.
Scenario 2: Multi-Agent Feature Work
Work is tracked. Agents claim separate items, operate in parallel, and must pass proof gates before marking work done. No duplicate effort. No coordination bugs. No lost context.
Scenario 3: Proof-Gated Promotion
An agent thinks it’s done. Proof gates fail. It can’t credibly claim completion until it fixes the failures and re-validates. That’s the difference between autonomy and theater.
Ecosystem Status
Real today (foundation):
- Local-first repo runtime (initialize once, agents use it)
- Constitution routing + discovery (agents read, projects override)
- Proof gates (validation must pass)
- Core subsystems operational
In progress:
- DB Broker (multi-agent safe writes)
- Handoff/context passing surfaces
Planned:
- Trust automation (earn autonomy through proof history)
- Policy DSL (risk zones with approvals)
- Pattern learning (conventions inferred from repo)