decapod 0.2.0

Decapod is a Rust-built governance runtime for AI agents: repo-native state, enforced workflow, proof gates, safe coordination.

Demo video: View on GitHub (12MB)


What This Is

Decapod is agent infrastructure: a governance runtime for autonomous software agents.

Just as Docker is a runtime for containers, Decapod is a runtime for agents. It gives autonomy a place to live that isn’t your chat window: persistent state, binding methodology, proof gates, and coordination primitives, so that agent work becomes shippable, not scary.

Agents can write code. But they can’t reliably ship because they:

  • forget what they built yesterday (no persistence)
  • treat best practices as vibes (no enforcement)
  • say “done” without evidence (no proof gates)
  • trip over each other in parallel (no coordination)

You set Decapod up once (decapod init), then agents operate inside the governed environment. You don’t touch the internals—just like you don’t touch individual neurons.

What This Is Not

Decapod is not:

  • a prompt pack
  • an agent framework/library
  • a hosted SaaS “agent platform”
  • a review bot that just comments on PRs
  • a tool you manually operate (it’s agent infrastructure)

Decapod is an environment: the place agent work becomes enforceable.


Getting Started

cargo install decapod
cd <your-project>
decapod init

From that point on, agents operate inside the governed environment. You observe outcomes, review summaries, and merge when proofs pass.


Who This Is For

  ✅ You’re shipping production code with AI agents
  ✅ You want discipline enforced by the environment
  ✅ You want parallel agents without turning the repo into lore
  ✅ You merge to main (not just demoing prompts)
  ✅ You want an AI companion for building premium software
  ✅ You want “AI vibes” with guardrails and customizable enforceable workflows


Security

Decapod is designed with security at the foundation. See SECURITY.md for:

  • Credential architecture and lifecycle management
  • Agent identity and session security
  • Supply chain integrity
  • Incident response philosophy

TL;DR: Agents must handle credentials securely—never log, never commit, always rotate. Violations are constitutional breaches.


How It Works

1) Persistent State (Memory That Survives)

Agents persist work to .decapod/: todos, conventions, decisions, proof events—durable state that survives sessions and model switches.

You get continuity without re-explaining. Agents get a real memory substrate instead of fragile chat history.
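To make the idea of durable state concrete, here is a minimal sketch of an append-only record log in Rust. It is not Decapod's actual storage code: the `Record` type, the tab-separated line format, and the `append`/`load` helpers are all assumptions made for illustration, and real agents go through the control surface rather than writing files themselves.

```rust
use std::io::{self, BufRead, Write};

/// One durable record. The kinds mirror the items listed above;
/// the schema itself is an assumption made for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Record {
    Todo(String),
    Convention(String),
    Decision(String),
    ProofEvent { gate: String, passed: bool },
}

impl Record {
    /// Encode as one tab-separated line (hypothetical format).
    /// Text containing tabs or newlines is not handled here.
    pub fn encode(&self) -> String {
        match self {
            Record::Todo(t) => format!("todo\t{t}"),
            Record::Convention(c) => format!("convention\t{c}"),
            Record::Decision(d) => format!("decision\t{d}"),
            Record::ProofEvent { gate, passed } => format!("proof\t{gate}\t{passed}"),
        }
    }

    /// Decode a line written by `encode`; unknown lines yield None.
    pub fn decode(line: &str) -> Option<Record> {
        let mut parts = line.split('\t');
        match (parts.next()?, parts.next()?) {
            ("todo", t) => Some(Record::Todo(t.to_string())),
            ("convention", c) => Some(Record::Convention(c.to_string())),
            ("decision", d) => Some(Record::Decision(d.to_string())),
            ("proof", gate) => {
                let passed = parts.next()?.parse::<bool>().ok()?;
                Some(Record::ProofEvent { gate: gate.to_string(), passed })
            }
            _ => None,
        }
    }
}

/// Append one record to any writer (a file opened in append mode, for example).
pub fn append<W: Write>(out: &mut W, record: &Record) -> io::Result<()> {
    writeln!(out, "{}", record.encode())
}

/// Replay every recognizable record, as a new session would on startup.
pub fn load<R: BufRead>(input: R) -> io::Result<Vec<Record>> {
    let mut records = Vec::new();
    for line in input.lines() {
        if let Some(record) = Record::decode(&line?) {
            records.push(record);
        }
    }
    Ok(records)
}
```

Because records are only appended and then replayed, a later session (or a different model) can rebuild the same picture of the project without any chat history.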

2) Enforced Methodology (Constitution as Code)

Decapod ships an embedded constitution: binding contracts for how agents must operate (intent-first flow, authority chains, proof doctrine, store separation, etc.).

Generated entrypoints (CLAUDE.md, AGENTS.md, GEMINI.md) require agents to:

  • read the constitution before acting
  • use the control surface for state mutation (no internal access)
  • follow Intent → Architecture → Implementation → Proof
  • pass validation gates before claiming “done”
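
The Intent → Architecture → Implementation → Proof ordering can be pictured as a small state machine. The sketch below is illustrative only: the `Phase` enum and the `advance` guard are hypothetical names, not part of Decapod's control surface.

```rust
/// The four workflow phases named above, in their required order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Intent,
    Architecture,
    Implementation,
    Proof,
}

impl Phase {
    /// The phase that must follow this one, or None once Proof is reached.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::Intent => Some(Phase::Architecture),
            Phase::Architecture => Some(Phase::Implementation),
            Phase::Implementation => Some(Phase::Proof),
            Phase::Proof => None,
        }
    }
}

/// Hypothetical guard: a transition is allowed only to the immediate next phase.
pub fn advance(current: Phase, requested: Phase) -> Result<Phase, String> {
    if current.next() == Some(requested) {
        Ok(requested)
    } else {
        Err(format!("cannot move from {current:?} to {requested:?}"))
    }
}
```

Under this model, an agent that tries to jump from Intent straight to Implementation gets an error instead of silently skipping the architecture step.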

Projects override behavior via .decapod/OVERRIDE.md without forking the constitution.
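
One way to think about overriding without forking is a layered lookup: project overrides are consulted first, and the embedded defaults are never modified. The sketch below assumes both layers have already been parsed into name-to-text maps; the `Contracts` alias and the `resolve` and `effective` functions are hypothetical, and the real format of OVERRIDE.md is not modeled here.

```rust
use std::collections::BTreeMap;

/// Contract name -> contract text. Both layers are assumed to be parsed already.
pub type Contracts = BTreeMap<String, String>;

/// Look up a contract, preferring the project override over the embedded default.
pub fn resolve<'a>(
    embedded: &'a Contracts,
    overrides: &'a Contracts,
    name: &str,
) -> Option<&'a str> {
    overrides
        .get(name)
        .or_else(|| embedded.get(name))
        .map(String::as_str)
}

/// Build the effective contract set without mutating the embedded defaults.
pub fn effective(embedded: &Contracts, overrides: &Contracts) -> Contracts {
    let mut merged = embedded.clone();
    merged.extend(overrides.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

Keeping the embedded layer immutable is what lets a project diverge on one contract while still picking up every other default unchanged.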

3) Proof Gates (Validation Before Promotion)

Promotion isn’t a vibe. It’s a check that can fail.

Agents must satisfy proof gates before completion is credible. If validation fails, the work isn’t done—no matter how confident the summary sounds. Evidence required, not assertions.
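
As a rough illustration of a gate that can fail, the sketch below runs a list of checks and only accepts a completion claim when every check passed. The `Check`, `Verdict`, `run_gate`, and `accept_done` names are assumptions for this example; the checks Decapod's `validate` subsystem actually runs are not shown here.

```rust
/// A single named check; `run` returns Err with a reason when it fails.
pub struct Check {
    pub name: &'static str,
    pub run: fn() -> Result<(), String>,
}

/// Outcome of a gate: either everything passed, or a list of failure reasons.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Passed,
    Failed(Vec<String>),
}

/// Run every check and collect all failures, so the agent sees the full list.
pub fn run_gate(checks: &[Check]) -> Verdict {
    let failures: Vec<String> = checks
        .iter()
        .filter_map(|c| (c.run)().err().map(|reason| format!("{}: {}", c.name, reason)))
        .collect();
    if failures.is_empty() {
        Verdict::Passed
    } else {
        Verdict::Failed(failures)
    }
}

/// An agent's "done" claim counts only if the gate also passed.
pub fn accept_done(agent_says_done: bool, verdict: &Verdict) -> bool {
    agent_says_done && *verdict == Verdict::Passed
}
```

The point of the shape is that the verdict comes from the checks, not from the agent: a confident summary cannot turn a `Failed` verdict into an accepted completion.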

4) Coordination Primitives (So Parallel Doesn’t Mean Chaos)

Decapod standardizes the surfaces agents use to collaborate:

  • a shared backlog with audit trail
  • shared conventions and preferences
  • shared rationale (decisions, constraints, invariants)
  • a proof ledger (what passed, what failed, when, and why)
  • policy boundaries (trust tiers, risk zones)
  • (planned) safe multi-writer state via a DB broker

Multiple agents can work in parallel without collisions, duplicate effort, or lost context.
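
A shared backlog with an audit trail can be sketched as a claim table plus a log of every attempt. This is an in-memory, single-process illustration under assumed names (`Backlog`, `claim`, `audit_trail`); it says nothing about how Decapod's `todo` subsystem is implemented.

```rust
use std::collections::BTreeMap;

/// Hypothetical backlog: which agent owns which item, plus a log of every attempt.
#[derive(Default)]
pub struct Backlog {
    claims: BTreeMap<u32, String>,
    audit: Vec<String>,
}

impl Backlog {
    /// Claim an item for an agent; a second claim on the same item is rejected.
    pub fn claim(&mut self, item: u32, agent: &str) -> Result<(), String> {
        if let Some(owner) = self.claims.get(&item) {
            let reason = format!("item {item} already claimed by {owner}");
            self.audit.push(format!("rejected {agent}: {reason}"));
            return Err(reason);
        }
        self.claims.insert(item, agent.to_string());
        self.audit.push(format!("{agent} claimed item {item}"));
        Ok(())
    }

    /// Every accepted and rejected claim, in the order it happened.
    pub fn audit_trail(&self) -> &[String] {
        &self.audit
    }
}
```

Rejected claims are logged as well as accepted ones, which is what makes the trail useful for explaining why an agent did not pick up a given item.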


The Difference

Without Decapod:

You: “Add OAuth to the login flow”
Agent: Writes 500 lines across 8 files
You: Review everything manually
You: Find broken tests, ignored conventions, missing error paths
Agent: Forgets context when you ask for fixes

With Decapod:

You: “Add OAuth to the login flow”
Agent: Checks recorded conventions and constraints
Agent: Produces tracked work, records decisions
Agent: Runs proof gates, fixes failures, re-validates
Agent: Marks work done with an auditable trail
You: Review summary and merge

Architecture

┌──────────────────────────────────────────┐
│  Agent Entrypoints (CLAUDE.md, etc)     │  ← Generated by init
├──────────────────────────────────────────┤
│  Control Surface (stable interface)     │  ← Agents interact here
├──────────────────────────────────────────┤
│  Subsystems (plugin-grade surfaces)     │  ← Domain logic
├──────────────────────────────────────────┤
│  Governance Core (validate + doctrine)  │  ← Enforcement layer
├──────────────────────────────────────────┤
│  State Layer (SQLite + event logs)      │  ← Persistence + audit
├──────────────────────────────────────────┤
│  Embedded Constitution (methodology)    │  ← Contracts, not tips
└──────────────────────────────────────────┘

Storage:
<your-project>/
└── .decapod/
    ├── data/         # State (agents write via control surface)     !! DO NOT TOUCH
    ├── generated/    # Entrypoints + derived files (auto-managed)   !! DO NOT TOUCH
    └── OVERRIDE.md   # Edit this file to manually override any constitution contract layer. 

You don’t touch .decapod/data/ directly. Agents use the control surface. Like neurons—they’re there, they work, you don’t manipulate them individually.


Subsystems

Decapod’s control surface is organized into subsystems. Agents interact with these; you communicate your desires to the agent and observe outcomes.

Status legend:

  • REAL = implemented and usable today
  • SPEC = designed/claimed, but not fully shipped yet

Subsystem   Purpose                                          Status
---------   -----------------------------------------------  ------
todo        Work tracking with audit trail                   REAL
validate    Proof gate before promotion                      REAL
cron        Scheduled automation                             REAL
reflex      Rule-driven triggers/actions                     REAL
docs        Constitution discovery                           REAL
teammate    User conventions + preferences                   REAL
knowledge   Project facts + rationale                        SPEC
health      Proof ledger + system state                      SPEC
policy      Risk zones + approvals                           SPEC
trust       Autonomy tiers based on proof history            SPEC
context     Token budget management                          SPEC
archive     Session history indexing                         SPEC
watcher     Integrity checks                                 SPEC
heartbeat   One-shot system summary                          SPEC
feedback    Preference refinement                            SPEC
db_broker   Multi-agent SQLite safety (write serialization)  SPEC
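
Since db_broker is still SPEC, its design is not described here. As a generic illustration of write serialization, the sketch below funnels writes from many agents through one writer thread using a standard-library channel. The `Broker` type is hypothetical, and a plain `Vec<String>` stands in for the SQLite store.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical broker: many senders, exactly one thread applying writes.
pub struct Broker {
    tx: Sender<String>,
    writer: JoinHandle<Vec<String>>,
}

impl Broker {
    /// Start the single writer thread that owns the store.
    pub fn start() -> Self {
        let (tx, rx) = mpsc::channel::<String>();
        let writer = thread::spawn(move || {
            let mut store = Vec::new();
            // Writes are applied one at a time, in arrival order.
            for write in rx {
                store.push(write);
            }
            store
        });
        Broker { tx, writer }
    }

    /// Give an agent its own handle for submitting writes.
    pub fn handle(&self) -> Sender<String> {
        self.tx.clone()
    }

    /// Stop accepting writes and return the final store contents.
    /// All handles given out by `handle` must be dropped first,
    /// otherwise the writer thread keeps waiting and this call blocks.
    pub fn shutdown(self) -> Vec<String> {
        drop(self.tx);
        self.writer.join().expect("writer thread panicked")
    }
}
```

Agents never touch the store directly in this model; they only send messages, so two agents cannot interleave a half-finished write.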

Real-World Scenarios

Scenario 1: Preference Memory

You tell an agent once: “Always use SEMVER for tagging git commits, and set that value in the Cargo.toml and Cargo.lock before pushing.”

That preference becomes durable state. Every future agent session in this project can check it and will use it. You never explain again.

Scenario 2: Multi-Agent Feature Work

Work is tracked. Agents claim separate items, operate in parallel, and must pass proof gates before marking work done. No duplicate effort. No coordination bugs. No lost context.

Scenario 3: Proof-Gated Promotion

An agent thinks it’s done. Proof gates fail. It can’t credibly claim completion until it fixes the failures and re-validates. That’s the difference between autonomy and theater.


Ecosystem Status

Real today (foundation):

  • Local-first repo runtime (initialize once, agents use it)
  • Constitution routing + discovery (agents read, projects override)
  • Proof gates (validation must pass)
  • Core subsystems operational (todo, validate, cron, reflex, docs, teammate)

In progress:

  • DB Broker (multi-agent safe writes)
  • Handoff/context passing surfaces

Planned:

  • Trust automation (earn autonomy through proof history)
  • Policy DSL (risk zones with approvals)
  • Pattern learning (conventions inferred from repo)