decapod 0.3.0

Decapod is a Rust-built governance runtime for AI agents: repo-native state, enforced workflow, proof gates, safe coordination.

Demo video: View on GitHub (12MB)


What This Is

Decapod is agent infrastructure: a governance runtime for autonomous software agents.

Like Docker is a runtime for containers, Decapod is a runtime for agents. It gives autonomy a place to live that isn’t your chat window: persistent state, binding methodology, proof gates, and coordination primitives—so agent work becomes shippable, not scary.

Agents can write code. But they can’t reliably ship because they:

  • forget what they built yesterday (no persistence)
  • treat best practices as vibes (no enforcement)
  • say “done” without evidence (no proof gates)
  • trip over each other in parallel (no coordination)

You set Decapod up once (decapod init), then agents operate inside the governed environment. You don’t touch the internals—just like you don’t touch individual neurons.

What This Is Not

Decapod is not:

  • a prompt pack
  • an agent framework/library
  • a hosted SaaS “agent platform”
  • a review bot that just comments on PRs
  • a tool you manually operate (it’s agent infrastructure)

Decapod is an environment: the place agent work becomes enforceable.


Getting Started

cargo install decapod
cd <your-project>
decapod init

From that point on, agents operate inside the governed environment. You observe outcomes, review summaries, and merge when proofs pass.


Who This Is For

  ✅ You’re shipping production code with AI agents
  ✅ You want discipline enforced by the environment
  ✅ You want parallel agents without turning the repo into lore
  ✅ You merge to main (not just demoing prompts)
  ✅ You want an AI companion for building premium software
  ✅ You want “AI vibes” with guardrails and customizable enforceable workflows


Security

Decapod is designed with security at the foundation. See SECURITY.md for:

  • Credential architecture and lifecycle management
  • Agent identity and session security
  • Supply chain integrity
  • Incident response philosophy

TL;DR: Agents must handle credentials securely—never log, never commit, always rotate. Violations are constitutional breaches.


How It Works

1) Persistent State (Memory That Survives)

Agents persist work to .decapod/: todos, conventions, decisions, proof events—durable state that survives sessions and model switches.

You get continuity without re-explaining. Agents get a real memory substrate instead of fragile chat history.
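
To make "state that survives sessions" concrete, here is a minimal sketch of an append-only event log that a later session can reload. It is an illustration under assumptions, not Decapod's storage format: the `Event` type, the tab-separated layout, and the function names are all hypothetical, and the real state layer is described below as SQLite plus event logs.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical record shape; the real schema is not described in this README.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub kind: String,   // e.g. "todo", "decision", "proof"
    pub detail: String, // free-form text on a single line
}

/// Append one event as a tab-separated line. Existing lines are never rewritten.
pub fn append_event(log: &Path, event: &Event) -> io::Result<()> {
    // Reject values that would corrupt the line-oriented layout.
    let bad_kind = event.kind.contains(|c: char| c == '\t' || c == '\n');
    if bad_kind || event.detail.contains('\n') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "field contains a separator",
        ));
    }
    let mut file = OpenOptions::new().create(true).append(true).open(log)?;
    writeln!(file, "{}\t{}", event.kind, event.detail)
}

/// Reload every event, as a fresh session would on startup.
pub fn load_events(log: &Path) -> io::Result<Vec<Event>> {
    if !log.exists() {
        return Ok(Vec::new());
    }
    let text = fs::read_to_string(log)?;
    Ok(text
        .lines()
        .filter_map(|line| line.split_once('\t'))
        .map(|(kind, detail)| Event {
            kind: kind.to_string(),
            detail: detail.to_string(),
        })
        .collect())
}
```

A session that calls `append_event` and then exits leaves the file behind; the next session, possibly with a different model, calls `load_events` and sees the same history in the same order.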

2) Enforced Methodology (Constitution as Code)

Decapod ships an embedded constitution: binding contracts for how agents must operate (intent-first flow, authority chains, proof doctrine, store separation, etc.).

Generated entrypoints (CLAUDE.md, AGENTS.md, GEMINI.md) require agents to:

  • read the constitution before acting
  • use the control surface for state mutation (no internal access)
  • follow Intent → Architecture → Implementation → Proof
  • pass validation gates before claiming “done”
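
The Intent → Architecture → Implementation → Proof ordering in the list above can be read as a small state machine. The sketch below is one way to express that, assuming strictly linear phases; the `Phase` enum and `advance` function are hypothetical names for illustration, not part of Decapod's control surface.

```rust
/// The four workflow phases, in the order the constitution requires.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Intent,
    Architecture,
    Implementation,
    Proof,
}

impl Phase {
    /// The only phase that may follow this one, or `None` after `Proof`.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::Intent => Some(Phase::Architecture),
            Phase::Architecture => Some(Phase::Implementation),
            Phase::Implementation => Some(Phase::Proof),
            Phase::Proof => None,
        }
    }
}

/// Reject any transition that skips or reorders a phase.
pub fn advance(current: Phase, requested: Phase) -> Result<Phase, String> {
    match current.next() {
        Some(expected) if expected == requested => Ok(requested),
        Some(expected) => Err(format!(
            "expected {expected:?} after {current:?}, got {requested:?}"
        )),
        None => Err(format!("{current:?} is the final phase")),
    }
}
```

Under this model an agent that jumps from `Intent` straight to `Implementation` gets an error instead of a silent pass.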

Projects override behavior via .decapod/OVERRIDE.md without forking the constitution.
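
Overriding without forking amounts to layered resolution: start from the embedded defaults, then let the project's entries win. This README does not describe the format of OVERRIDE.md, so the sketch below only models the layering itself; the contract names and the `resolve` function are hypothetical.

```rust
use std::collections::BTreeMap;

/// Contract name -> rule text. Both are illustrative placeholders.
pub type Contracts = BTreeMap<String, String>;

/// Compute the effective contracts: embedded defaults first, then project
/// overrides replace matching names (or add new ones). Defaults are untouched.
pub fn resolve(defaults: &Contracts, overrides: &Contracts) -> Contracts {
    let mut effective = defaults.clone();
    for (name, rule) in overrides {
        effective.insert(name.clone(), rule.clone());
    }
    effective
}
```

Because `resolve` never mutates `defaults`, upgrading the embedded constitution and keeping a project's overrides are independent operations in this model.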

3) Proof Gates (Validation Before Promotion)

Promotion isn’t a vibe. It’s a check that can fail.

Agents must satisfy proof gates before completion is credible. If validation fails, the work isn’t done, no matter how confident the summary sounds. Evidence is required, not assertions.
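
As a rough illustration of "a check that can fail", the sketch below models a proof gate as a list of named checks whose combined result decides promotion. The `Gate`, `Verdict`, and `evaluate` names are hypothetical; how `decapod validate` is actually implemented is not covered in this README.

```rust
/// One named check. `run` returns `Err` with a reason when the check fails.
pub struct Gate {
    pub name: &'static str,
    pub run: fn() -> Result<(), String>,
}

/// Outcome of evaluating every gate.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Promoted,
    Blocked(Vec<String>), // one entry per failed gate
}

/// Run all gates without short-circuiting, so every failure is reported,
/// and promote only when none failed.
pub fn evaluate(gates: &[Gate]) -> Verdict {
    let failures: Vec<String> = gates
        .iter()
        .filter_map(|gate| {
            (gate.run)()
                .err()
                .map(|why| format!("{}: {}", gate.name, why))
        })
        .collect();
    if failures.is_empty() {
        Verdict::Promoted
    } else {
        Verdict::Blocked(failures)
    }
}
```

The design choice worth noting is that the verdict carries the failure reasons: a blocked promotion comes with evidence the agent can act on, rather than a bare boolean.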

4) Coordination Primitives (So Parallel Doesn’t Mean Chaos)

Decapod standardizes the surfaces agents use to collaborate:

  • a shared backlog with audit trail
  • shared conventions and preferences
  • shared rationale (decisions, constraints, invariants)
  • a proof ledger (what passed, what failed, when, and why)
  • policy boundaries (trust tiers, risk zones)
  • (planned) safe multi-writer state via a DB broker

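Multiple agents can work in parallel without duplicate effort or lost context. Collision-free concurrent writes to shared state depend on the DB broker, which is not fully shipped yet (see Planned Enhancements).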
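
One way to picture a shared backlog with an audit trail is a claim table in which each item has at most one owner and every decision is recorded. The sketch below is an in-memory model under that assumption; `Backlog`, `ClaimError`, and the item and agent names are hypothetical and do not reflect the `decapod todo` implementation.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum ClaimError {
    UnknownItem,
    AlreadyClaimed { by: String },
}

/// A shared backlog in which each item has at most one owner.
#[derive(Default)]
pub struct Backlog {
    owners: HashMap<String, Option<String>>, // item id -> claiming agent
    audit: Vec<String>,                      // append-only trail of events
}

impl Backlog {
    pub fn add(&mut self, item: &str) {
        self.owners.entry(item.to_string()).or_insert(None);
        self.audit.push(format!("added {item}"));
    }

    /// Claim `item` for `agent`. A second agent is rejected and the denial
    /// is recorded, so duplicate effort shows up in the trail.
    pub fn claim(&mut self, item: &str, agent: &str) -> Result<(), ClaimError> {
        let slot = self.owners.get_mut(item).ok_or(ClaimError::UnknownItem)?;
        match slot {
            Some(owner) if owner.as_str() != agent => {
                let by = owner.clone();
                self.audit.push(format!("{agent} denied {item} (held by {by})"));
                Err(ClaimError::AlreadyClaimed { by })
            }
            Some(_) => Ok(()), // re-claiming your own item is a no-op
            None => {
                *slot = Some(agent.to_string());
                self.audit.push(format!("{agent} claimed {item}"));
                Ok(())
            }
        }
    }

    pub fn audit(&self) -> &[String] {
        &self.audit
    }
}
```

This model is single-process; sharing it between concurrently running agents would still need the write serialization listed as planned.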


The Difference

Without Decapod:

You: “Add OAuth to the login flow”
Agent: Writes 500 lines across 8 files
You: Review everything manually
You: Find broken tests, ignored conventions, missing error paths
Agent: Forgets context when you ask for fixes

With Decapod:

You: “Add OAuth to the login flow”
Agent: Checks recorded conventions and constraints
Agent: Produces tracked work, records decisions
Agent: Runs proof gates, fixes failures, re-validates
Agent: Marks work done with an auditable trail
You: Review summary and merge

Architecture

┌──────────────────────────────────────────┐
│  Agent Entrypoints (CLAUDE.md, etc)     │  ← Generated by init
├──────────────────────────────────────────┤
│  Control Surface (stable interface)     │  ← Agents interact here
├──────────────────────────────────────────┤
│  Subsystems (plugin-grade surfaces)     │  ← Domain logic
├──────────────────────────────────────────┤
│  Governance Core (validate + doctrine)  │  ← Enforcement layer
├──────────────────────────────────────────┤
│  State Layer (SQLite + event logs)      │  ← Persistence + audit
├──────────────────────────────────────────┤
│  Embedded Constitution (methodology)    │  ← Contracts, not tips
└──────────────────────────────────────────┘

Storage:
<your-project>/
└── .decapod/
    ├── data/         # State (agents write via control surface)     !! DO NOT TOUCH
    ├── generated/    # Entrypoints + derived files (auto-managed)   !! DO NOT TOUCH
    └── OVERRIDE.md   # Edit this file to manually override any constitution contract layer. 

You don’t touch .decapod/data/ directly. Agents use the control surface. Like neurons—they’re there, they work, you don’t manipulate them individually.


Subsystems

Decapod’s control surface is organized into 9 top-level commands: five core commands plus four groups (govern, data, auto, qa), each with its own subcommands. Agents interact with these; you tell the agent what you want and observe outcomes.

Status legend:

  • REAL = implemented and usable today
  • SPEC = designed/claimed, but not fully shipped yet

Core Commands

Command           Purpose                                   Status
----------------  ----------------------------------------  ------
decapod init      Bootstrap project with constitution       REAL
decapod setup     Configure git hooks and repository setup  REAL
decapod docs      Constitution discovery and access         REAL
decapod todo      Work tracking with audit trail            REAL
decapod validate  Proof gate before promotion               REAL

Governance (decapod govern)

Subcommand       Purpose                                      Status
---------------  -------------------------------------------  ------
policy           Risk classification and approval gates       REAL
health           Proof ledger + system state monitoring       REAL
health summary   System health overview (replaces heartbeat)  REAL
health autonomy  Agent autonomy tiers (replaces trust)        REAL
proof            Executable verification and proof gates      REAL
watcher          Proactive integrity checks                   REAL
feedback         User preference refinement                   REAL

Data Management (decapod data)

Subcommand  Purpose                                    Status
----------  -----------------------------------------  ------
archive     Session history indexing and verification  REAL
knowledge   Project facts and rationale storage        REAL
context     Token budget management and archival       REAL
schema      Subsystem schema discovery                 REAL
repo        Repository structure mapping               REAL
broker      SQLite audit trail access                  REAL
teammate    User conventions and preferences           REAL

Automation (decapod auto)

Subcommand  Purpose                            Status
----------  ---------------------------------  ------
cron        Scheduled automation jobs          REAL
reflex      Event-driven triggers and actions  REAL

Quality Assurance (decapod qa)

Subcommand  Purpose                           Status
----------  --------------------------------  ------
verify      Proof replay and drift detection  REAL
check       CI validation checks              REAL

Planned Enhancements

Feature    Purpose                                          Status
---------  -----------------------------------------------  ------
db_broker  Multi-agent SQLite safety (write serialization)  SPEC
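
Since db_broker is still SPEC, the following is only a sketch of what "write serialization" generally means, not the planned design: many agents submit write requests over a channel, and a single broker thread applies them one at a time. `WriteRequest` and `run_brokered` are hypothetical names, and a `Vec` stands in for the SQLite store.

```rust
use std::sync::mpsc;
use std::thread;

/// A write request from an agent. Field names are illustrative only.
pub struct WriteRequest {
    pub agent: String,
    pub statement: String,
}

/// Spawn `agents` worker threads that each submit `writes_each` requests,
/// apply them on a single broker thread, and return the applied order.
pub fn run_brokered(agents: usize, writes_each: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<WriteRequest>();

    // The broker is the only code that touches the store, so writes are
    // applied one at a time in arrival order.
    let broker = thread::spawn(move || {
        let mut applied = Vec::new(); // stand-in for the SQLite store
        for request in rx {
            applied.push(format!("{}:{}", request.agent, request.statement));
        }
        applied
    });

    let workers: Vec<_> = (0..agents)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for n in 0..writes_each {
                    let request = WriteRequest {
                        agent: format!("agent-{id}"),
                        statement: format!("write-{n}"),
                    };
                    tx.send(request).expect("broker is alive");
                }
            })
        })
        .collect();
    drop(tx); // the channel closes once every worker's clone is dropped

    for worker in workers {
        worker.join().expect("worker panicked");
    }
    broker.join().expect("broker panicked")
}
```

Interleaving between agents is nondeterministic, but each agent's own writes arrive in the order it sent them, and no two writes are ever applied at the same time.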

Real-World Scenarios

Scenario 1: Preference Memory

You tell an agent once: “Always use SEMVER for tagging git commits, and set that value in the Cargo.toml and Cargo.lock before pushing.”

That preference becomes durable state. Every future agent session in this project can check it and will use it. You never explain again.

Scenario 2: Multi-Agent Feature Work

Work is tracked. Agents claim separate items, operate in parallel, and must pass proof gates before marking work done. No duplicate effort. No coordination bugs. No lost context.

Scenario 3: Proof-Gated Promotion

An agent thinks it’s done. Proof gates fail. It can’t credibly claim completion until it fixes the failures and re-validates. That’s the difference between autonomy and theater.


Ecosystem Status

Real today (foundation):

  • Local-first repo runtime (initialize once, agents use it)
  • Constitution routing + discovery (agents read, projects override)
  • Proof gates (validation must pass)
  • Core subsystems operational

In progress:

  • DB Broker (multi-agent safe writes)
  • Handoff/context passing surfaces

Planned:

  • Trust automation (earn autonomy through proof history)
  • Policy DSL (risk zones with approvals)
  • Pattern learning (conventions inferred from repo)