decapod 0.3.0

Demo video: View on GitHub (12MB)

What This Is

Decapod is agent infrastructure: a governance runtime for autonomous software agents.

Just as Docker is a runtime for containers, Decapod is a runtime for agents. It gives autonomy a place to live that isn’t your chat window, with persistent state, binding methodology, proof gates, and coordination primitives, so that agent work becomes shippable, not scary.

Agents can write code. But they can’t reliably ship because they:

forget what they built yesterday (no persistence)
treat best practices as vibes (no enforcement)
say “done” without evidence (no proof gates)
trip over each other in parallel (no coordination)

You set Decapod up once (decapod init), then agents operate inside the governed environment. You don’t touch the internals—just like you don’t touch individual neurons.

What This Is Not

Decapod is not:

a prompt pack
an agent framework/library
a hosted SaaS “agent platform”
a review bot that just comments on PRs
a tool you manually operate (it’s agent infrastructure)

Decapod is an environment: the place agent work becomes enforceable.

Getting Started

cargo install decapod
cd <your-project>
decapod init

From that point on, agents operate inside the governed environment. You observe outcomes, review summaries, and merge when proofs pass.

Who This Is For

✅ You’re shipping production code with AI agents
✅ You want discipline enforced by the environment
✅ You want parallel agents without turning the repo into lore
✅ You merge to main (not just demoing prompts)
✅ You want an AI companion for building premium software
✅ You want “AI vibes” with guardrails and customizable enforceable workflows

Security

Decapod is designed with security at the foundation. See SECURITY.md for:

Credential architecture and lifecycle management
Agent identity and session security
Supply chain integrity
Incident response philosophy

TL;DR: Agents must handle credentials securely—never log, never commit, always rotate. Violations are constitutional breaches.

How It Works

1) Persistent State (Memory That Survives)

Agents persist work to .decapod/: todos, conventions, decisions, proof events—durable state that survives sessions and model switches.

You get continuity without re-explaining. Agents get a real memory substrate instead of fragile chat history.
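
To make the idea of durable state concrete, here is a minimal sketch of an append-only event log that a later session can replay. It is an illustration only, not Decapod's implementation: the file name, the tab-separated line format, and the function names `append_event` and `replay` are all assumptions made for this example, and the real store lives behind the control surface under .decapod/data/.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Append one event as a single line, creating the file and its parent
/// directory if needed. The "kind<TAB>detail" format is hypothetical.
fn append_event(log: &Path, kind: &str, detail: &str) -> io::Result<()> {
    if let Some(parent) = log.parent() {
        fs::create_dir_all(parent)?;
    }
    let mut file = OpenOptions::new().create(true).append(true).open(log)?;
    writeln!(file, "{}\t{}", kind, detail)
}

/// Replay the log as a fresh session would, returning (kind, detail) pairs.
/// Lines without a tab separator are skipped.
fn replay(log: &Path) -> io::Result<Vec<(String, String)>> {
    let reader = BufReader::new(fs::File::open(log)?);
    let mut events = Vec::new();
    for line in reader.lines() {
        let line = line?;
        if let Some((kind, detail)) = line.split_once('\t') {
            events.push((kind.to_string(), detail.to_string()));
        }
    }
    Ok(events)
}

/// Example session: write two events, then read them back.
/// Uses a temp directory, NOT the real .decapod/ layout.
fn example_session() -> io::Result<Vec<(String, String)>> {
    let log = std::env::temp_dir().join("decapod-sketch").join("events.log");
    let _ = fs::remove_file(&log); // start clean; ignore "not found"
    append_event(&log, "decision", "use OAuth2 with PKCE for login")?;
    append_event(&log, "todo", "add token refresh tests")?;
    replay(&log)
}
```

Calling `example_session()` from a `main` function returns the two events in the order they were written. The point of the append-only shape is that nothing depends on chat history: any process that can read the file can rebuild the context.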

2) Enforced Methodology (Constitution as Code)

Decapod ships an embedded constitution: binding contracts for how agents must operate (intent-first flow, authority chains, proof doctrine, store separation, etc.).

Generated entrypoints (CLAUDE.md, AGENTS.md, GEMINI.md) require agents to:

read the constitution before acting
use the control surface for state mutation (no internal access)
follow Intent → Architecture → Implementation → Proof
pass validation gates before claiming “done”

Projects override behavior via .decapod/OVERRIDE.md without forking the constitution.
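
The Intent → Architecture → Implementation → Proof flow listed above can be pictured as a small state machine that refuses to skip a phase. The sketch below is one possible way to model that ordering; the `Phase` and `WorkItem` types are hypothetical names for this example and are not taken from Decapod's code.

```rust
/// The four phases named in the text, in their required order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Intent,
    Architecture,
    Implementation,
    Proof,
}

impl Phase {
    /// The only phase allowed to follow `self`, or `None` after `Proof`.
    fn next(self) -> Option<Phase> {
        match self {
            Phase::Intent => Some(Phase::Architecture),
            Phase::Architecture => Some(Phase::Implementation),
            Phase::Implementation => Some(Phase::Proof),
            Phase::Proof => None,
        }
    }
}

/// Hypothetical tracker for one unit of work.
#[derive(Debug)]
struct WorkItem {
    phase: Phase,
}

impl WorkItem {
    /// Every work item starts at `Intent`.
    fn new() -> Self {
        WorkItem { phase: Phase::Intent }
    }

    /// Move to `target` only if it is the immediate successor of the
    /// current phase; skipping ahead or going backwards is an error.
    fn advance_to(&mut self, target: Phase) -> Result<(), String> {
        if self.phase.next() == Some(target) {
            self.phase = target;
            Ok(())
        } else {
            Err(format!("cannot move from {:?} to {:?}", self.phase, target))
        }
    }
}
```

With this model, a new `WorkItem` that calls `advance_to(Phase::Implementation)` directly gets an error, because `Architecture` has not been visited yet. Encoding the order in the type keeps "follow the methodology" from depending on an agent remembering to do so.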

3) Proof Gates (Validation Before Promotion)

Promotion isn’t a vibe. It’s a check that can fail.

Agents must satisfy proof gates before completion is credible. If validation fails, the work isn’t done—no matter how confident the summary sounds. Evidence required, not assertions.
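
As a rough illustration of "a check that can fail", the sketch below models a gate as a list of named checks and allows promotion only when every check passes. The names `Check`, `Verdict`, and `evaluate` are assumptions for this example, and treating an empty gate as a failure is a design choice of the sketch rather than documented Decapod behavior.

```rust
/// One named check. `run` returns true on pass, false on failure.
struct Check {
    name: &'static str,
    run: fn() -> bool,
}

/// Result of evaluating a gate.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// Every check ran and passed.
    Promote,
    /// At least one check failed; carries the names of the failures.
    Blocked(Vec<&'static str>),
    /// No checks were configured, so there is no evidence at all.
    NoEvidence,
}

/// Run every check (no short-circuit, so all failures are reported)
/// and decide whether the work may be promoted.
fn evaluate(checks: &[Check]) -> Verdict {
    if checks.is_empty() {
        return Verdict::NoEvidence;
    }
    let failed: Vec<&'static str> = checks
        .iter()
        .filter(|c| !(c.run)())
        .map(|c| c.name)
        .collect();
    if failed.is_empty() {
        Verdict::Promote
    } else {
        Verdict::Blocked(failed)
    }
}
```

For example, a gate with a passing `tests` check and a failing `lint` check evaluates to `Verdict::Blocked(vec!["lint"])`. The verdict is derived from the checks alone, so a confident summary has no way to influence it.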

4) Coordination Primitives (So Parallel Doesn’t Mean Chaos)

Decapod standardizes the surfaces agents use to collaborate:

a shared backlog with audit trail
shared conventions and preferences
shared rationale (decisions, constraints, invariants)
a proof ledger (what passed, what failed, when, and why)
policy boundaries (trust tiers, risk zones)
(planned) safe multi-writer state via a DB broker

Multiple agents can work in parallel without collisions, duplicate effort, or lost context.
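
One way to picture a shared backlog with an audit trail is a claim operation that succeeds for at most one agent per item and records who claimed what. The following is a single-process sketch under that assumption; `Backlog`, `ClaimError`, and the agent names are hypothetical and do not describe Decapod's actual data model.

```rust
use std::collections::HashMap;

/// Why a claim was rejected.
#[derive(Debug, PartialEq)]
enum ClaimError {
    UnknownItem,
    AlreadyClaimed { by: String },
}

/// Hypothetical in-memory backlog.
#[derive(Default)]
struct Backlog {
    /// Item id -> claiming agent (`None` means unclaimed).
    owners: HashMap<u32, Option<String>>,
    /// Append-only record of successful claims: (item id, agent).
    audit: Vec<(u32, String)>,
}

impl Backlog {
    /// Register a new, unclaimed item.
    fn add(&mut self, id: u32) {
        self.owners.insert(id, None);
    }

    /// Claim an item for `agent`. Fails if the item does not exist or
    /// another agent already holds it, so two agents never share an item.
    fn claim(&mut self, id: u32, agent: &str) -> Result<(), ClaimError> {
        match self.owners.get_mut(&id) {
            None => Err(ClaimError::UnknownItem),
            Some(Some(owner)) => Err(ClaimError::AlreadyClaimed { by: owner.clone() }),
            Some(slot) => {
                *slot = Some(agent.to_string());
                self.audit.push((id, agent.to_string()));
                Ok(())
            }
        }
    }
}
```

If `agent-a` claims item 1 first, a later `claim(1, "agent-b")` returns `AlreadyClaimed { by: "agent-a" }`, and the audit vector still contains exactly one entry. Rejected claims leave no trace in this sketch; a fuller design might log them too.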

The Difference

Without Decapod:

You: “Add OAuth to the login flow”
Agent: Writes 500 lines across 8 files
You: Review everything manually
You: Find broken tests, ignored conventions, missing error paths
Agent: Forgets context when you ask for fixes

With Decapod:

You: “Add OAuth to the login flow”
Agent: Checks recorded conventions and constraints
Agent: Produces tracked work, records decisions
Agent: Runs proof gates, fixes failures, re-validates
Agent: Marks work done with an auditable trail
You: Review summary and merge

Architecture

┌──────────────────────────────────────────┐
│  Agent Entrypoints (CLAUDE.md, etc)     │  ← Generated by init
├──────────────────────────────────────────┤
│  Control Surface (stable interface)     │  ← Agents interact here
├──────────────────────────────────────────┤
│  Subsystems (plugin-grade surfaces)     │  ← Domain logic
├──────────────────────────────────────────┤
│  Governance Core (validate + doctrine)  │  ← Enforcement layer
├──────────────────────────────────────────┤
│  State Layer (SQLite + event logs)      │  ← Persistence + audit
├──────────────────────────────────────────┤
│  Embedded Constitution (methodology)    │  ← Contracts, not tips
└──────────────────────────────────────────┘

Storage:
<your-project>/
└── .decapod/
    ├── data/         # State (agents write via control surface)     !! DO NOT TOUCH
    ├── generated/    # Entrypoints + derived files (auto-managed)   !! DO NOT TOUCH
    └── OVERRIDE.md   # Edit this file to manually override any constitution contract layer.

You don’t touch .decapod/data/ directly. Agents use the control surface. Like neurons—they’re there, they work, you don’t manipulate them individually.

Subsystems

Decapod's control surface is organized into 9 top-level commands with grouped subsystems. Agents interact with these; you communicate your desires to the agent and observe outcomes.

Status legend:

REAL = implemented and usable today
SPEC = designed/claimed, but not fully shipped yet

Core Commands

Command	Purpose	Status
`decapod init`	Bootstrap project with constitution	REAL
`decapod setup`	Configure git hooks and repository setup	REAL
`decapod docs`	Constitution discovery and access	REAL
`decapod todo`	Work tracking with audit trail	REAL
`decapod validate`	Proof gate before promotion	REAL

Governance (`decapod govern`)

Subcommand	Purpose	Status
`policy`	Risk classification and approval gates	REAL
`health`	Proof ledger + system state monitoring	REAL
`health summary`	System health overview (replaces heartbeat)	REAL
`health autonomy`	Agent autonomy tiers (replaces trust)	REAL
`proof`	Executable verification and proof gates	REAL
`watcher`	Proactive integrity checks	REAL
`feedback`	User preference refinement	REAL

Data Management (`decapod data`)

Subcommand	Purpose	Status
`archive`	Session history indexing and verification	REAL
`knowledge`	Project facts and rationale storage	REAL
`context`	Token budget management and archival	REAL
`schema`	Subsystem schema discovery	REAL
`repo`	Repository structure mapping	REAL
`broker`	SQLite audit trail access	REAL
`teammate`	User conventions and preferences	REAL

Automation (`decapod auto`)

Subcommand	Purpose	Status
`cron`	Scheduled automation jobs	REAL
`reflex`	Event-driven triggers and actions	REAL

Quality Assurance (`decapod qa`)

Subcommand	Purpose	Status
`verify`	Proof replay and drift detection	REAL
`check`	CI validation checks	REAL

Planned Enhancements

Feature	Purpose	Status
`db_broker`	Multi-agent SQLite safety (write serialization)	SPEC
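
Since `db_broker` is still SPEC, its design is not described here. As a general illustration of what write serialization means, the sketch below funnels writes from several agent threads through a channel to a single broker thread, which is the only code that touches the store. The function name `run_broker`, the string payloads, and the in-memory `Vec` standing in for SQLite are all assumptions for this example.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawn `agents` threads that each submit `writes_each` writes through a
/// channel. One broker thread applies them one at a time, in arrival order,
/// and the final contents of the store are returned.
fn run_broker(agents: usize, writes_each: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();

    // The broker is the only thread that mutates the store.
    let broker = thread::spawn(move || {
        let mut store = Vec::new();
        for write in rx {
            store.push(write); // stand-in for a single database write
        }
        store
    });

    // Each agent gets its own sender and never touches the store directly.
    let handles: Vec<_> = (0..agents)
        .map(|agent| {
            let tx = tx.clone();
            thread::spawn(move || {
                for n in 0..writes_each {
                    tx.send(format!("agent-{agent} write-{n}"))
                        .expect("broker should still be receiving");
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("agent thread panicked");
    }
    drop(tx); // last sender gone: the broker's receive loop ends
    broker.join().expect("broker thread panicked")
}
```

`run_broker(3, 4)` returns twelve entries. The interleaving between agents varies from run to run, but each agent's own writes appear in the order it sent them, and no two writes are ever applied at the same time.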

Real-World Scenarios

Scenario 1: Preference Memory

You tell an agent once: “Always use SEMVER for tagging git commits, and set that value in the Cargo.toml and Cargo.lock before pushing.”

That preference becomes durable state. Every future agent session in this project can check it and apply it, so you never have to explain it again.

Scenario 2: Multi-Agent Feature Work

Work is tracked. Agents claim separate items, operate in parallel, and must pass proof gates before marking work done. No duplicate effort. No coordination bugs. No lost context.

Scenario 3: Proof-Gated Promotion

An agent thinks it’s done. Proof gates fail. It can’t credibly claim completion until it fixes the failures and re-validates. That’s the difference between autonomy and theater.

Ecosystem Status

Real today (foundation):

Local-first repo runtime (initialize once, agents use it)
Constitution routing + discovery (agents read, projects override)
Proof gates (validation must pass)
Core subsystems operational

In progress:

DB Broker (multi-agent safe writes)
Handoff/context passing surfaces

Planned:

Trust automation (earn autonomy through proof history)
Policy DSL (risk zones with approvals)
Pattern learning (conventions inferred from repo)