beans
Task tracker for AI agents.
Markdown files that track dependencies and require verification to close.
Verify commands must fail first by default — proving the test is real, not assert True:
# Test must FAIL first (proves it tests something real)
# Then passes after implementation → bean closes
Plain markdown files. No SDK, no API — any agent that can read files and run shell commands already speaks beans.
Table of Contents
- Install
- Quick Start
- Features
- How It Works
- Fail-First: Enforced TDD
- Failure History
- Hierarchical Tasks
- Smart Dependencies
- Interactive Mode
- Pipe-Friendly CLI
- Core Commands
- Agent Orchestration
- Agent Workflow
- Memory System
- MCP Server
- Adversarial Review
- Configuration
- Shell Completions
- Why Not X?
- Design Principles
- For Agents
- Documentation
- License
Install
&&
Quick Start
Orchestrate agents:
Capture project knowledge:
Features
- Verification gates — fail-first TDD, must-pass-to-close
- Failure history — attempts tracked, output appended to notes
- Hierarchical tasks — dot notation, parent/child, auto-close parent when all children done
- Smart dependencies — `produces`/`requires` auto-inference, cycle detection
- Agent orchestration — `bn run` dispatches beans to agents, `bn plan` decomposes large tasks
- Adversarial review — `bn run --review` spawns a second agent to verify correctness after close
- Agent-agnostic — works with any CLI agent (Claude, pi, aider, custom scripts)
- Memory system — `bn fact` for verified project knowledge with TTL and staleness detection
- MCP server — `bn mcp serve` for IDE integration (Cursor, Windsurf, Claude Desktop, Cline)
- Interactive wizard — `bn create` with no args launches a step-by-step prompt (fuzzy parent search, smart verify suggestions, $EDITOR for descriptions)
- Pipe-friendly — `--json` output, `--ids` listing, `--description -` reads stdin, `--stdin` for batch operations
- Smart selectors — `@latest` for chaining sequential beans
- Context assembly — `bn context <id>` outputs a complete agent briefing: bean spec, verify command, previous attempts, project rules, dependency context, and referenced file contents
- Trace — `bn trace` walks bean lineage, dependencies, artifacts, and attempt history
- Dependency graph — ASCII, Mermaid, DOT output
- Full lifecycle — create, claim, close, reopen, delete, adopt, archive, unarchive, tidy
- Doctor — health checks for orphans, cycles, index freshness (with `--fix`)
- Editor support — `bn edit` with backup/rollback
- Hooks — pre-close hooks with trust system
- Shell completions — bash, zsh, fish, PowerShell
- Config inheritance — `extends` for shared config across projects
- Stateless — no daemon, no background sync, just files and a CLI
How It Works
Tasks are Markdown files with YAML frontmatter:
.beans/
├── 1-fix-auth-bug.md # Task 1
├── 2-add-tests.md # Task 2
├── 2.1-unit-tests.md # Task 2.1 (child of 2)
└── archive/2026/01/ # Closed tasks auto-archive
A bean looks like:
---
id: "1"
title: Fix authentication bug
status: in_progress
verify: cargo test auth::login
attempts: 0
---
The login endpoint returns 500 when password contains special chars.
**Files:** src/auth/login.rs, tests/auth_test.rs
The verify field is the contract. When you run bn close 1:
- Beans runs `cargo test auth::login`
- Exit 0 → task closes, moves to archive
- Exit non-zero → task stays open, failure appended to notes, ready for another agent
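The close rule above can be sketched in a few lines of Python. This is an illustration of the exit-code contract only, not beans' actual implementation: the `close_bean` function and the dict-shaped bean are hypothetical, and leaving `status` untouched on failure is an assumption.

```python
import subprocess


def close_bean(bean: dict) -> dict:
    """Run the bean's verify command and update the bean from its exit code.

    Hypothetical helper: the bean is modelled as a plain dict with the
    frontmatter fields shown above (verify, status, attempts).
    """
    result = subprocess.run(
        bean["verify"], shell=True, capture_output=True, text=True
    )
    if result.returncode == 0:
        # Exit 0: the contract is met, so the task closes.
        bean["status"] = "closed"
    else:
        # Non-zero: the task stays open and the failure is recorded
        # for whichever agent picks it up next.
        bean["attempts"] += 1
        bean["notes"] = bean.get("notes", "") + result.stdout + result.stderr
    return bean
```

The only signal that matters is the exit code of the verify command; its output is kept purely as context for the next attempt.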
Fail-First: Enforced TDD
Agents can write "cheating tests" that prove nothing:
assert True # Always passes!
Fail-first is on by default. Before creating a bean, the verify command runs and must fail:
- If it passes → bean is rejected ("test doesn't test anything new")
- If it fails → bean is created (test is real)
- After implementation, `bn close` runs verify → must pass
REJECTED (cheating test):
$ bn quick "..." --verify "python -c 'assert True'"
error: Cannot create bean: verify command already passes!
ACCEPTED (real test):
$ bn quick "..." --verify "pytest test_unicode.py"
✓ Verify failed as expected - test is real
Created bean 5
Use --pass-ok / -p to skip fail-first for refactoring, hardening, and builds where the verify should already pass:
The failing test is the spec. The passing test is the proof. No ambiguity.
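As a rough sketch of the fail-first gate, assuming it reduces to "reject if the verify command already exits 0": the `may_create` function and its `pass_ok` parameter are hypothetical names that mirror the `--pass-ok` flag, not beans' internal API.

```python
import subprocess


def may_create(verify_cmd: str, pass_ok: bool = False) -> bool:
    """Return True if a bean with this verify command should be accepted."""
    if pass_ok:
        # Mirrors --pass-ok / -p: the verify is allowed to pass already.
        return True
    result = subprocess.run(verify_cmd, shell=True, capture_output=True)
    # A verify that already passes proves nothing, so it is rejected.
    return result.returncode != 0
```

With this rule, `python -c 'assert True'` is rejected because it exits 0 before any work has been done.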
Failure History
When verify fails, beans appends the error output to the bean's notes:
---
id: "3"
title: Fix unicode URLs
status: open
verify: pytest test_urls.py
attempts: 2
---
Handle unicode characters in URL paths.
## Attempt 1 — 2024-01-15T14:32:00Z
Exit code: 1
FAILED test_urls.py::test_unicode_path AssertionError: Expected '/café' but got '/caf%C3%A9'
## Attempt 2 — 2024-01-15T15:10:00Z
Exit code: 1
FAILED test_urls.py::test_unicode_path UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3
- No lost context. When Agent A times out, Agent B sees exactly what failed.
- No repeated mistakes. Agent B can see "encoding was tried, didn't work" and try a different approach.
- Human debugging. `bn show 3` reveals the full history without digging through logs.
Output is truncated to the first 50 and last 50 lines, which keeps beans readable while preserving the error message and stack trace. There is no attempt limit: agents can retry indefinitely.
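A minimal sketch of the head-and-tail truncation described above. The `truncate_output` function is hypothetical, and the "lines omitted" marker is an assumption added for illustration; the source only specifies the 50 + 50 split.

```python
def truncate_output(output: str, head: int = 50, tail: int = 50) -> str:
    """Keep the first `head` and last `tail` lines of verify output."""
    lines = output.splitlines()
    if len(lines) <= head + tail:
        return output  # short output is stored unchanged
    omitted = len(lines) - head - tail
    # Hypothetical marker so a reader can see that lines were dropped.
    marker = f"... ({omitted} lines omitted) ..."
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

Keeping both ends matters because test runners tend to print the failing assertion near the top and the summary or stack trace near the bottom.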
Hierarchical Tasks
Parent-child via dot notation:
#> Created: 1
#> Created: 1.1
#> Created: 1.2
#> [ ] 1. Auth system
#> [ ] 1.1 Login endpoint
#> [ ] 1.2 Token refresh
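Because the hierarchy lives in the ID itself, a parent can be derived with plain string handling. The sketch below is illustrative only: `parent_of` and `can_auto_close` are hypothetical helpers, and the auto-close check assumes a simple map of bean ID to status.

```python
from typing import Optional


def parent_of(bean_id: str) -> Optional[str]:
    """Return the parent ID encoded in a dotted bean ID, or None for a root."""
    head, sep, _ = bean_id.rpartition(".")
    return head if sep else None


def can_auto_close(parent_id: str, statuses: dict[str, str]) -> bool:
    """True when the parent has children and every direct child is closed."""
    children = [
        status for bean_id, status in statuses.items()
        if parent_of(bean_id) == parent_id
    ]
    return bool(children) and all(status == "closed" for status in children)
```

For the tree above, `parent_of("1.2")` yields `"1"`, and bean 1 becomes eligible for auto-close once both 1.1 and 1.2 are closed.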
Smart Dependencies
Dependencies auto-infer from produces/requires:
When the JWT bean requires AuthProvider and the auth types bean produces it, JWT is automatically blocked until auth types closes. No explicit bn dep add needed.
#> ## Ready (1)
#> 1.1 [ ] Define auth types # ready (no requires)
#> ## Ready (1)
#> 1.2 [ ] Implement JWT # now ready (producer closed)
Children can be created in any order without manual dependency wiring.
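One way to picture the inference, as a sketch under assumptions rather than beans' real algorithm: map each produced artifact to its producer, then block every bean on the producers of what it requires. The function names and the dict-shaped beans are hypothetical.

```python
def infer_blockers(beans: list[dict]) -> dict[str, set[str]]:
    """Map each bean ID to the IDs of beans that produce what it requires."""
    producers = {}
    for bean in beans:
        for artifact in bean.get("produces", []):
            producers[artifact] = bean["id"]
    blockers = {}
    for bean in beans:
        needed = {
            producers[artifact]
            for artifact in bean.get("requires", [])
            if artifact in producers
        }
        blockers[bean["id"]] = needed - {bean["id"]}
    return blockers


def ready(beans: list[dict], closed: set[str]) -> list[str]:
    """List open beans whose inferred blockers are all closed."""
    blockers = infer_blockers(beans)
    return [
        bean["id"] for bean in beans
        if bean["id"] not in closed and blockers[bean["id"]] <= closed
    ]
```

With an auth types bean that produces `AuthProvider` and a JWT bean that requires it, only the auth types bean is ready at first; the JWT bean becomes ready once the producer is closed.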
Sequential Chaining
Use bn create next to chain beans that depend on the most recently created one:
Each next bean automatically depends on the previous one.
Interactive Mode
Run bn create with no arguments to launch an interactive wizard:
$ bn create
Creating a new bean
? Title › fix auth timeout
✔ Parent (type to filter) › 3 — Auth system
✔ Verify command (empty to skip) · pytest tests/test_auth_timeout.py
✔ Acceptance criteria (empty to skip) · Timeout returns 408, not 500
✔ Priority · P1 (high)
✔ Open editor for description? · no
✔ Produces (comma-separated, empty to skip) ·
✔ Requires (comma-separated, empty to skip) ·
✔ Add labels? · no
─── Bean Summary ───────────────────────
Title: fix auth timeout
Parent: 3
Verify: pytest tests/test_auth_timeout.py
Acceptance: Timeout returns 408, not 500
Priority: P1
────────────────────────────────────────
? Create this bean? · yes
Created bean 3.4: fix auth timeout (2k tokens ✓)
The wizard activates when no title is provided and stderr is a TTY. Use -i / --interactive to force it even with partial flags:
Features:
- Fuzzy parent search — type to filter from existing beans
- Smart verify suggestion — auto-detects project type (Cargo.toml → `cargo test`, package.json → `npm test`)
- $EDITOR for descriptions — opens your editor with a template including parent context
- Summary + confirm — review before creating
- Pre-filled flags skip their prompts — non-interactive mode is unchanged
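The verify suggestion can be thought of as a lookup from marker file to default test command. The sketch below covers only the two mappings named above; `suggest_verify`, the `MARKERS` table, and the check order are assumptions for illustration.

```python
from pathlib import Path
from typing import Optional

# Marker file -> suggested verify command (only the two pairs named above).
MARKERS = [("Cargo.toml", "cargo test"), ("package.json", "npm test")]


def suggest_verify(project_dir: str) -> Optional[str]:
    """Suggest a verify command based on which marker file exists."""
    root = Path(project_dir)
    for marker, command in MARKERS:
        if (root / marker).is_file():
            return command
    return None  # unknown project type: no suggestion
```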
Pipe-Friendly CLI
Beans is a Unix citizen. Commands produce structured output and accept piped input.
JSON output
# Create and capture the bean ID
ID=
# Query beans as JSON
|
|
List formatting
Available format placeholders: {id}, {title}, {status}, {priority}, {parent}, {assignee}, {labels}
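To show how such a format string might expand, here is a small sketch using plain string replacement. `format_bean` is a hypothetical helper, and joining list-valued fields such as labels with commas is an assumption.

```python
FIELDS = ("id", "title", "status", "priority", "parent", "assignee", "labels")


def format_bean(template: str, bean: dict) -> str:
    """Replace each {placeholder} in the template with the bean's value."""
    line = template
    for name in FIELDS:
        value = bean.get(name, "")
        if isinstance(value, (list, tuple)):
            value = ",".join(value)  # assumed rendering for multi-value fields
        line = line.replace("{" + name + "}", str(value))
    return line
```

A template of `{id}`, a tab, and `{title}` then renders one tab-separated line per bean, which is the shape a TSV export needs.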
Stdin input
Use - to read field values from stdin:
# Pipe description from a file or command
|
# Pipe notes from build output
|
# Pipe acceptance criteria
|
Batch operations
# Close multiple beans via pipe
|
# Close beans matching a pattern
| |
# Create → immediately claim
| |
Composable pipelines
# Batch create and collect IDs
for; do
done |
# Export to TSV
# Find failing in-progress beans
| | \
|
Core Commands
# Task lifecycle
# Agent orchestration
# Querying
# Memory
# Dependencies
# MCP
# Housekeeping
| Command | Purpose |
|---|---|
| Tasks | |
| `bn init` | Initialize .beans/ in current directory |
| `bn init --agent <preset>` | Initialize with agent preset (pi, claude, aider) |
| `bn init --setup` | Reconfigure agent on existing project |
| `bn create "title"` | Create a bean (--json for piped output, --paths for file refs) |
| `bn create` | Interactive wizard (auto-detects TTY) |
| `bn create -i` | Force interactive mode with any flags |
| `bn create next "title"` | Create bean depending on the last created bean |
| `bn quick "title"` | Create + claim in one step |
| `bn show <id>` | Full bean details (--short for one-line, --history for all) |
| `bn list` | List beans (--ids, --format, --json) |
| `bn edit <id>` | Edit bean in $EDITOR |
| `bn update <id>` | Update fields (--description - reads stdin) |
| `bn claim <id>` | Claim a task (--by to set who, --force to skip verify check) |
| `bn claim <id> --release` | Release a claim |
| `bn verify <id>` | Test without closing (--json for structured output) |
| `bn close <id>` | Close (verify must pass, --stdin for batch) |
| `bn close --failed <id>` | Mark attempt failed, release claim |
| `bn reopen <id>` | Reopen a closed bean |
| `bn delete <id>` | Delete a bean |
| Querying | |
| `bn status` | Overview: claimed, ready, goals, blocked |
| `bn context [id]` | Complete agent context (with ID) or memory context (without) |
| `bn context --structure-only` | Structural summary only (signatures, imports) |
| `bn tree` | View hierarchy |
| `bn graph` | Dependency graph (ASCII, Mermaid, DOT) |
| `bn trace <id>` | Walk lineage, deps, artifacts, attempts |
| `bn recall "query"` | Search beans by keyword (--all includes closed) |
| Memory | |
| `bn fact "title" --verify "cmd"` | Create verified fact with TTL |
| `bn verify-facts` | Re-verify all facts, detect staleness |
| Agents | |
| `bn run [id] [-j N]` | Dispatch ready beans to agents |
| `bn run --loop-mode` | Keep running until no ready beans remain |
| `bn run --auto-plan` | Auto-decompose large beans before dispatch |
| `bn run --dry-run` | Preview dispatch plan without spawning |
| `bn run --review` | Adversarial review after each close |
| `bn run --json-stream` | Emit JSON events to stdout |
| `bn plan [id] [--auto]` | Decompose a large bean into children |
| `bn plan --dry-run` | Preview split without creating |
| `bn review <id>` | Adversarial review of a bean's implementation |
| `bn agents [--json]` | Show running/completed agents |
| `bn logs <id>` | View agent output (-f to follow, --all for all runs) |
| MCP | |
| `bn mcp serve` | Start MCP server (JSON-RPC 2.0 on stdio) |
| Dependencies | |
| `bn dep add/remove/list` | Dependency management |
| Housekeeping | |
| `bn adopt <parent> <children>` | Adopt beans as children |
| `bn stats` | Project statistics |
| `bn tidy` | Archive closed, release stale, rebuild |
| `bn tidy --dry-run` | Preview what tidy would do |
| `bn sync` | Force rebuild index |
| `bn doctor` | Health check |
| `bn doctor --fix` | Auto-fix detected issues |
| `bn config get/set` | Project configuration |
| `bn trust` | Manage hook trust |
| `bn unarchive <id>` | Restore archived bean |
| `bn locks` | View file locks (--clear to force-clear) |
| Shell | |
| `bn completions <shell>` | Generate completions (bash, zsh, fish, powershell) |
Agent Orchestration
Beans has built-in agent orchestration. Configure your agent once, then dispatch beans to it:
# Configure during init (interactive wizard)
# Or set manually
{id} is replaced with the bean ID. The spawned agent should read the bean, do the work, and run bn close.
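The `{id}` substitution can be sketched as a literal string replacement followed by shell-style splitting. Whether beans hands the command to a shell or splits it into arguments is not stated here, so `expand_run_template` and its use of `shlex` are assumptions for illustration.

```python
import shlex


def expand_run_template(template: str, bean_id: str) -> list[str]:
    """Fill {id} into a run template and split it into an argument list."""
    command = template.replace("{id}", bean_id)
    # shlex.split honours quotes, so the quoted prompt stays one argument.
    return shlex.split(command)
```

For example, the template `claude -p 'implement bean {id}'` with bean `3.1` expands to three arguments: `claude`, `-p`, and `implement bean 3.1`.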
Dispatching work
bn run finds ready beans, sizes them, and spawns agents. Small beans get implemented directly. Large beans (exceeding max_tokens) are sent to the plan command for decomposition — or handled automatically with --auto-plan.
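The sizing decision can be sketched as a threshold check against `max_tokens`. How beans actually counts tokens is not described here; the four-characters-per-token estimate below is a loudly hypothetical stand-in, as are the function names and the `"run"` / `"plan"` return values.

```python
def estimate_tokens(text: str) -> int:
    """Hypothetical heuristic: roughly four characters per token."""
    return len(text) // 4


def dispatch_action(context: str, max_tokens: int = 30000) -> str:
    """Decide whether a bean is implemented directly or decomposed first."""
    if estimate_tokens(context) <= max_tokens:
        return "run"   # small enough: send to the run command
    return "plan"      # too large: send to the plan command
```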
Loop mode
Keep dispatching until all work is done:
Planning large tasks
Monitoring
Discover-and-delegate
While working on your main task, create beans for everything you notice — bn run picks them up automatically:
Failure handling
Control what happens when a verify command fails:
An agent can also explicitly give up:
Agent presets
Or configure directly:
Agent Workflow
Automated (recommended)
Let bn run handle the full cycle — find ready beans, size them, dispatch agents, track results:
Agents are spawned with the configured run command. Each agent reads the bean, implements the work, and runs bn close. If verify fails, the task stays open with attempts incremented and the failure output appended to notes. bn run picks it up again on the next cycle.
Manual
Agents can also claim and work beans directly:
#> ## Ready (2)
#> 3 [ ] Implement token refresh
#> 7 [ ] Add rate limiting
# ... implement the feature ...
If verify fails, the task stays open with attempts: 1 and the failure output appended to notes. Another agent picking up the task sees what was tried and why it failed.
Trace context
Before working on a bean, use bn trace to understand its full context:
Memory System
Facts are verified truths about your project that persist across agent sessions. Each fact has a verify command that proves it's still true, and a TTL (default 30 days) after which it becomes stale.
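Staleness by TTL comes down to comparing the age of the last verification with the allowed lifetime. The sketch below assumes a fact records when it was last verified; `is_stale` and its parameters are hypothetical names, and only the 30-day default comes from the text above.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(days=30)


def is_stale(verified_at: datetime, now: datetime,
             ttl: timedelta = DEFAULT_TTL) -> bool:
    """A fact is stale once more than `ttl` has passed since verification."""
    return now - verified_at > ttl
```

Re-running the fact's verify command and updating the verification time would reset the clock.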
Creating facts
Facts follow fail-first by default — the verify must fail initially. Use -p if the fact is already true:
Checking staleness
Context
bn context <id> outputs the complete agent briefing — everything needed to implement a bean:
The output includes (in order):
- Bean spec — ID, title, verify command, priority, status, description, acceptance criteria
- Previous attempts — what was tried and why it failed
- Project rules — conventions from `.beans/RULES.md`
- Dependency context — descriptions of sibling beans that produce artifacts this bean requires
- File structure — function signatures and imports
- File contents — full source of referenced files
File paths come from two sources: the bean's explicit paths field (set via --paths on create) and paths regex-extracted from the description text. Explicit paths take priority.
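A sketch of how the two path sources might be combined. The regular expression is a hypothetical pattern (it only matches paths containing a directory separator and a file extension), and reading "take priority" as "come first, with duplicates dropped" is an assumption.

```python
import re

# Hypothetical pattern: one or more "dir/" segments, then "name.ext".
PATH_PATTERN = re.compile(r"(?:[\w.-]+/)+[\w.-]+\.\w+")


def collect_paths(explicit: list[str], description: str) -> list[str]:
    """Merge explicit paths with paths found in the description text."""
    found = PATH_PATTERN.findall(description)
    ordered = []
    for path in list(explicit) + found:
        if path not in ordered:  # explicit entries win; duplicates dropped
            ordered.append(path)
    return ordered
```

Applied to the example bean earlier, the description line listing `src/auth/login.rs` and `tests/auth_test.rs` would yield both files even with no `--paths` given.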
Without an ID, outputs memory context — project-wide state for orientation:
Searching
Adversarial Review
After closing a bean, spawn a second agent to verify the implementation is correct:
The review agent outputs a verdict:
- approve — labels bean as `reviewed`
- request-changes — reopens bean with review notes, labels `review-failed`
- flag — labels bean `needs-human-review`, stays closed
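The three verdicts map onto simple state changes. The following is a sketch of that mapping on a dict-shaped bean; `apply_verdict` is a hypothetical helper, and using `"open"` as the reopened status is an assumption.

```python
def apply_verdict(bean: dict, verdict: str, notes: str = "") -> dict:
    """Apply a review verdict to a closed bean."""
    labels = bean.setdefault("labels", [])
    if verdict == "approve":
        labels.append("reviewed")
    elif verdict == "request-changes":
        bean["status"] = "open"  # reopened so another attempt can be made
        bean["notes"] = bean.get("notes", "") + notes
        labels.append("review-failed")
    elif verdict == "flag":
        labels.append("needs-human-review")  # stays closed
    else:
        raise ValueError(f"unknown verdict: {verdict}")
    return bean
```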
Configure the review agent:
Falls back to the global run template if review.run is not set.
MCP Server
Beans includes an MCP (Model Context Protocol) server for IDE integration. This lets tools like Cursor, Windsurf, Claude Desktop, and Cline interact with beans directly.
Add to your IDE's MCP configuration to expose bean operations (create, list, show, close, etc.) as tools your IDE's AI can call.
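For orientation, a client talks to an MCP server over stdio by exchanging JSON-RPC 2.0 messages. The sketch below builds one such request line; `jsonrpc_request` is a hypothetical helper, the `tools/list` method name comes from the MCP specification rather than from this README, and newline-delimited framing is assumed.

```python
import json
from typing import Optional


def jsonrpc_request(request_id: int, method: str,
                    params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"
```

In practice the IDE's MCP client sends these messages for you; the snippet only shows the shape of what travels over the pipe.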
Configuration
Agent orchestration and project settings are configured via bn config:
| Key | Default | Description |
|---|---|---|
| `run` | (none) | Command template to implement a bean. {id} is replaced with the bean ID. |
| `plan` | (none) | Command template to decompose a large bean into children. |
| `max_concurrent` | 4 | Maximum number of agents running in parallel. |
| `max_tokens` | 30000 | Maximum tokens for bean context (triggers planning if exceeded). |
| `max_loops` | 10 | Maximum agent loops before stopping (0 = unlimited). |
| `poll_interval` | 30 | Seconds between loop mode poll cycles. |
| `auto_close_parent` | true | Auto-close parent beans when all children are closed/archived. |
| `verify_timeout` | (none) | Default timeout in seconds for verify commands. Per-bean --verify-timeout overrides. |
| `rules_file` | (none) | Path to project rules file (relative to .beans/). Contents injected into bn context. |
| `file_locking` | false | Lock files listed in bean paths to prevent concurrent agents from clobbering. |
| `extends` | [] | Paths to parent config files to inherit from (supports ~/). |
| `on_close` | (none) | Hook: shell command after successful close. Vars: {id}, {title}, {status}, {branch}. |
| `on_fail` | (none) | Hook: shell command after verify failure. Vars: {id}, {title}, {attempt}, {output}, {branch}. |
| `post_plan` | (none) | Hook: shell command after bn plan creates children. Vars: {id}, {parent}, {children}, {branch}. |
| `review.run` | (none) | Command template for adversarial review agent. Falls back to run if unset. |
| `review.max_reopens` | 2 | Max times review can reopen a bean before giving up. |
Config is stored in .beans/config.yaml and checked into git with your project.
Config inheritance
Share config across projects with extends:
# .beans/config.yaml
extends:
- ~/.beans/global-config.yaml
project: my-app
run: "claude -p 'implement bean {id}'"
Child config values override parent values. Multiple parents are applied in order (last wins).
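The override order can be sketched as successive dictionary updates: each parent in turn, then the child. This assumes a shallow, key-by-key merge, which may not match how beans treats nested keys; `resolve_config` is a hypothetical helper.

```python
def resolve_config(parents: list[dict], child: dict) -> dict:
    """Merge parent configs in order (last wins), then apply the child."""
    merged = {}
    for parent in parents:
        merged.update(parent)
    # The child's own values override anything inherited; the extends
    # key itself is bookkeeping and is left out of the result.
    merged.update({k: v for k, v in child.items() if k != "extends"})
    return merged
```

With two parents that both set `max_concurrent`, the later parent's value is used unless the project config sets its own.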
Shell Completions
Generate completions for your shell:
# Bash
# Zsh
# Fish
# PowerShell
Why Not X?
| | beans | beads | Jira/Linear | GitHub Issues |
|---|---|---|---|---|
| Designed for | AI agents | AI agents | Humans | Humans |
| Verify gates | ✓ Enforced | ✗ Honor system | ✗ Honor system | ✗ Honor system |
| Storage | Markdown files | JSONL + SQLite | Cloud DB | Cloud DB |
| Hierarchy | 3.1 = child of 3 | Flat (hash IDs) | Epics/stories | Flat |
| Git integration | Native (in repo) | External | External | Same platform |
| Offline | ✓ Full | ✓ Full | Limited | Limited |
Inspired by Steve Yegge's beads — beans trades scale for simplicity and enforced verification.
Design Principles
- Files are the source of truth. The index is a cache. `bn show`, `bn list`, or read the files directly.
- Verify gates are the default. If you can't prove it's done, it's not done. (`--force` exists as an escape hatch.)
- Fail-then-pass. Tests must fail before work starts, pass after. No `assert True`.
- Failures accumulate. Each failed attempt appends to the bean. Next agent sees full history.
- Agent-agnostic. Beans doesn't care what agent you use. Configure any CLI agent via `bn config set run` — Claude, pi, aider, or a shell script. The protocol is simple: read the bean, do the work, `bn close`.
- Stateless CLI. No daemon, no background sync. Each command reads files, acts, exits.
- Hierarchy in filenames. `3.2` is obviously a child of `3`. No metadata lookup needed.
- Git-native. Clean diffs, meaningful history, works offline.
For Agents
You know what's exhausting? Vague tasks. "Improve the auth flow." Done... how? When the human squints at your PR and says "not quite what I meant," you've wasted tokens and time. No spec, no proof, no defense.
Beans fixes this.
Every bean has a verify command — a test that must fail when you create the task and must pass when you close it. Not "looks good to me." Not "I think it works." The test passes or it doesn't. You're not done until the machine says you're done.
No more assert True. No more lost context. No more ambiguity. The verify command is the contract. Hit it and you're done. Miss it and you're not.
Tasks are just markdown files. bn show 3. No API, no auth, no waiting.
Documentation
- Agent Skill — Quick reference for AI agents using beans
- Best Practices — Writing effective beans for agents
- `bn --help` — Full command reference
Contributing
Contributions are welcome. Fork the repo, create a feature branch, and open a pull request.