agentic-codebase 0.3.0

AI agents can't understand code across sessions.

Your agent reads a file, analyzes one function, and forgets. Next session -- blank slate. It can't recall the architecture it mapped yesterday. It can't trace the impact chain from three conversations ago. It can't search its own understanding of your codebase.

RAG over source files doesn't work. You get "similar text," never "what breaks if I change this?" Embedding chunks loses all structure -- no call graphs, no dependency chains, no type relationships. Grep is fast but flat.

AgenticCodebase compiles your repository into a navigable concept graph stored in a single binary file. Not "search your source code." Your agent has a map -- functions, classes, modules, imports, call chains, type hierarchies -- all connected, all queryable in microseconds.

Problems Solved (Read This First)

Problem: AI coding sessions reset to zero and lose project understanding.
Solved: .acb stores persistent semantic structure so the next session resumes with context.
Problem: "What breaks if I change this?" is guesswork.
Solved: native impact analysis over typed dependency and call graphs.
Problem: text search finds strings, not system behavior.
Solved: graph-native queries for symbols, relationships, risk, and likely breakpoints.
Problem: cross-project work becomes brittle and manual.
Solved: compile each repo into its own graph artifact and query them independently.
Problem: MCP coding clients see raw files but not deep code semantics.
Solved: agentic-codebase-mcp exposes the graph to MCP clients for structured code intelligence.

# Compile any repository (Python, Rust, TypeScript, JavaScript, Go, C++, Java, C#)
acb compile ./my-project -o project.acb --coverage-report coverage.json

# Query it
acb query project.acb symbol --name "UserService"     # Find symbols
acb query project.acb impact --unit-id 42              # What breaks?
acb query project.acb prophecy --limit 10              # What will break next?

Eight languages. Twenty-four query types. One file holds everything. Microsecond-scale queries (925 ns to 14.3 us on a 10K-unit graph). Works with Claude Desktop, VS Code, Cursor, Windsurf, and any MCP-compatible client.

Language Support

Language	Extensions	Status
Python	`.py`	Full
Rust	`.rs`	Full
TypeScript	`.ts`, `.tsx`	Full
JavaScript	`.js`, `.jsx`, `.mjs`	Full
Go	`.go`	Full
C++	`.cpp`, `.cc`, `.cxx`, `.h`, `.hpp`	New in v0.2.4
Java	`.java`	New in v0.2.5
C#	`.cs`	New in v0.2.6

Ghost Writer

New in v0.2.4 -- Auto-syncs codebase context to your AI coding tools.

Client	Config Location	Status
Claude Code	`~/.claude/memory/CODEBASE_CONTEXT.md`	Full support
Cursor	`~/.cursor/memory/agentic-codebase.md`	Full support
Windsurf	`~/.windsurf/memory/agentic-codebase.md`	Full support
Cody	`~/.sourcegraph/cody/memory/agentic-codebase.md`	Full support

Syncs: loaded graphs, recent symbol lookups, analysis findings. Zero configuration.

Better Skip Messaging

New in v0.2.6 -- When files can't be parsed, you now see WHY:

Unsupported: 1913 files [.xml(800), .txt(400), .md(300), ...]

Instead of just skipped: 1913.
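
As a rough illustration of how such a summary can be produced, the sketch below groups unparsed paths by extension and prints the most common ones. The function name `summarize_unsupported` and the `top` parameter are hypothetical; this is not the tool's actual implementation.

```python
from collections import Counter
from pathlib import PurePosixPath

def summarize_unsupported(paths, top=3):
    """Group unparsed files by extension and format a one-line summary."""
    # Files without an extension are bucketed under "(none)".
    counts = Counter(PurePosixPath(p).suffix or "(none)" for p in paths)
    shown = [f"{ext}({n})" for ext, n in counts.most_common(top)]
    if len(counts) > top:
        shown.append("...")  # signal that more extensions were skipped
    return f"Unsupported: {len(paths)} files [{', '.join(shown)}]"

# Hypothetical file list, for illustration only.
paths = ["a.xml"] * 4 + ["b.txt"] * 3 + ["c.md"] * 2 + ["Makefile"]
print(summarize_unsupported(paths))
# Unsupported: 10 files [.xml(4), .txt(3), .md(2), ...]
```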

V2: Grounding & Multi-Context Workspaces

Grounding (anti-hallucination) -- agents cannot claim code exists without graph backing. Three new MCP tools (codebase_ground, codebase_evidence, codebase_suggest) verify symbol claims, return evidence with file paths and line numbers, and suggest similar symbols when a claim is ungrounded.

Multi-context workspaces -- load multiple .acb files simultaneously and query across them. Six workspace tools (workspace_create, workspace_add, workspace_list, workspace_query, workspace_compare, workspace_xref) plus three translation tools (translation_record, translation_progress, translation_remaining) for cross-repo migration tracking.

# Grounding: verify a code claim before asserting it
agentic-codebase-mcp  # codebase_ground { "claim": "UserService has method getProfile" }

# Workspace: compare symbols across two repos
agentic-codebase-mcp  # workspace_create { "name": "migration" }
         # workspace_add { "workspace_id": "...", "path": "old.acb", "role": "primary" }
         # workspace_add { "workspace_id": "...", "path": "new.acb", "role": "secondary" }
         # workspace_compare { "workspace_id": "...", "item": "UserService" }
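
The comments above name the tools and their arguments but not the wire format. As a sketch, assuming the server follows the standard MCP `tools/call` method over JSON-RPC 2.0, a grounding request could be built like this. The helper `tool_call` and the request id are illustrative; the response shape is not documented here, so it is not shown.

```python
import json

def tool_call(request_id, tool, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# Tool name and argument taken from the grounding example above.
request = tool_call(1, "codebase_ground", {"claim": "UserService has method getProfile"})
print(json.dumps(request))
```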

Benchmarks

Rust core. Tree-sitter parsing. Binary .acb format. Real numbers from cargo bench (optimized bench profile):

Operation	1K units	10K units	Notes
Graph build	388 us	3.77 ms	Semantic analysis + edge resolution
Write .acb	169 us	2.29 ms	LZ4-compressed binary format
Read .acb	473 us	4.91 ms	Memory-mapped I/O

Query (10K graph)	Latency	Notes
Symbol lookup (exact)	14.3 us	Hash-based, O(1)
Dependency graph (depth 5)	925 ns	BFS traversal
Impact analysis	1.46 us	With risk scoring
Call graph (depth 3)	1.27 us	Bidirectional

All benchmarks on Apple M4 Pro, macOS, Rust 1.90.0 --release. Criterion 0.5 with 100 iterations after warm-up.
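
For readers unfamiliar with the "depth N" queries in the table, here is a minimal sketch of a depth-limited breadth-first traversal. It uses a plain Python dict as the adjacency list and made-up unit ids; it illustrates the technique only and says nothing about how the Rust engine stores or walks its graph.

```python
from collections import deque

def deps_within(graph, start, max_depth):
    """Return {unit_id: depth} for units reachable from start within max_depth hops."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        unit = queue.popleft()
        if depth[unit] == max_depth:
            continue  # do not expand past the depth limit
        for dep in graph.get(unit, ()):
            if dep not in depth:  # first visit in BFS is the shortest distance
                depth[dep] = depth[unit] + 1
                queue.append(dep)
    del depth[start]  # report dependencies only, not the starting unit
    return depth

# Hypothetical unit ids: 1 depends on 2 and 3, 2 on 4, 4 on 5.
graph = {1: [2, 3], 2: [4], 4: [5]}
print(deps_within(graph, 1, 2))  # {2: 1, 3: 1, 4: 2}
```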

Why AgenticCodebase?

Approach	Finds symbols	Traces dependencies	Predicts impact	Persists across sessions	Sub-ms queries
grep / ripgrep	partial	no	no	no	yes
LSP / IDE	yes	partial	no	no	varies
RAG over source	partial	no	no	yes	no
AgenticCodebase	yes	yes	yes	yes	yes

Quickstart

Install

cargo install agentic-codebase-cli agentic-codebase-mcp

Compile a codebase

# Parse and compile a repository into a .acb graph
acb compile ./my-project -o project.acb --coverage-report coverage.json

# View graph metadata
acb info project.acb

# Query symbols
acb query project.acb symbol --name "UserService"

# Impact analysis -- what breaks if I change unit 42?
acb query project.acb impact --unit-id 42 --depth 5

# Code prophecy -- what's likely to break next?
acb query project.acb prophecy --limit 10

# Graph-wide health summary
acb health project.acb

# CI-style risk gate for a change candidate
acb gate project.acb --unit-id 42 --max-risk 0.60 --require-tests

MCP Server

# Start the MCP server (stdio transport)
agentic-codebase-mcp

agentic-codebase-mcp accepts both line-delimited JSON-RPC and Content-Length framed MCP stdio messages.
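
The two framings differ only in how message boundaries are marked. The sketch below shows both encodings and a reader that accepts either, which can be handy when writing a test client. It is an illustration of the framing conventions, not the server's own parser; the function names are hypothetical.

```python
import io
import json

def frame_line(message):
    """Line-delimited framing: one JSON document per line."""
    return json.dumps(message).encode("utf-8") + b"\n"

def frame_content_length(message):
    """Header framing: Content-Length counts the bytes of the JSON body."""
    body = json.dumps(message).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

def read_message(stream):
    """Read one message from a binary stream in either framing; None at EOF."""
    line = stream.readline()
    if not line:
        return None
    if line.lower().startswith(b"content-length:"):
        length = int(line.split(b":", 1)[1].strip())
        # Skip any remaining header lines up to the blank separator line.
        while stream.readline() not in (b"\r\n", b"\n", b""):
            pass
        return json.loads(stream.read(length))
    return json.loads(line)

msg = {"jsonrpc": "2.0", "id": 1, "method": "ping"}
stream = io.BytesIO(frame_line(msg) + frame_content_length(msg))
print(read_message(stream) == msg, read_message(stream) == msg)  # True True
```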

Configure in Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "agentic-codebase": {
      "command": "agentic-codebase-mcp",
      "args": ["serve"]
    }
  }
}

Configure VS Code / Cursor

Add to .vscode/settings.json:

{
  "mcp.servers": {
    "agentic-codebase": {
      "command": "agentic-codebase-mcp",
      "args": ["serve"]
    }
  }
}

See the Full Install Guide for Windsurf and other client configuration.

Common Workflows

Pre-refactor safety check -- Before changing a function, see all callers, tests, and downstream dependencies:
```
acb query project.acb impact --unit-id 42 --depth 5
```
Find hidden coupling -- Before splitting a module, discover non-obvious dependencies between units:
```
acb query project.acb coupling
```
Assess refactor risk -- Before a large migration, predict which units are most likely to break:
```
acb query project.acb prophecy --limit 10
```
Verify test coverage gaps -- In CI, find public functions without test edges:
```
acb query project.acb test-gap
```
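
Conceptually, a test gap is a public function that no `Tests` edge points at. The sketch below expresses that check over toy data; the tuple layout, the `is_public` flag, and the function name are assumptions made for illustration, not the `.acb` schema or the query's real implementation.

```python
def find_test_gaps(units, edges):
    """Return ids of public functions that no Tests edge points at."""
    tested = {dst for (src, kind, dst) in edges if kind == "Tests"}
    return sorted(
        uid for uid, (kind, is_public) in units.items()
        if kind == "Function" and is_public and uid not in tested
    )

# Hypothetical data: unit id -> (unit type, is_public); edges as (source, type, target).
units = {1: ("Function", True), 2: ("Function", True), 3: ("Function", False), 9: ("Function", False)}
edges = [(9, "Tests", 1), (1, "Calls", 2)]
print(find_test_gaps(units, edges))  # [2]
```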

Validation

This isn't a prototype. The suite below covers the core engine, integration paths, V2 features, and benchmarks.

Suite	Tests	Notes
Rust core engine	38	Unit tests
Integration tests	460	Multi-phase integration coverage
V2 stress tests	69	Grounding, workspaces, translation
Benchmarks	21	Criterion statistical benchmarks
Total	567	All passing, 0 Clippy warnings

One research paper:

Paper I: AgenticCodebase -- Semantic Compiler (7 pages)

The Query Engine

AgenticCodebase provides 24 query types across three tiers:

Core Queries (8)

Query	CLI	Description
Symbol lookup	`acb query ... symbol -n <name>`	Find code units by name (exact, prefix, contains)
Dependency graph	`acb query ... deps -u <id>`	Forward dependencies with depth control
Reverse dependency	`acb query ... rdeps -u <id>`	Who depends on this unit?
Call graph	`acb query ... calls -u <id>`	Function call chains (callers + callees)
Similarity	`acb query ... similar -u <id>`	Structurally similar code units
Type hierarchy	via library API	Inheritance and implementation chains
Containment	via library API	Module/class nesting relationships
Pattern match	via library API	Structural code pattern detection

Built Queries (5)

Query	CLI	Description
Impact analysis	`acb query ... impact -u <id>`	Risk-scored change impact with test coverage
Coverage	via library API	Test coverage mapping
Trace	via library API	Execution path tracing
Path	via library API	Shortest path between two units
Reverse	via library API	Reverse call/dependency chains

Novel Queries (11)

Query	CLI	Description
Prophecy	`acb query ... prophecy`	Predict which units will break next
Stability	`acb query ... stability -u <id>`	Stability score with contributing factors
Coupling	`acb query ... coupling`	Detect tightly coupled unit pairs
Collective	via library API	Cross-repository pattern extraction
Temporal	via library API	Git history evolution analysis
Dead code	`acb query ... dead-code`	Unreachable code detection
Concept	via library API	Abstract concept clustering
Migration	via library API	Language migration planning
Test gap	`acb query ... test-gap`	Missing test identification
Drift	via library API	Code drift detection over time
Hotspot	`acb query ... hotspots`	Change frequency hotspot analysis

How It Works

AgenticCodebase compiles source code into a semantic graph, writes it to a portable .acb binary, and serves query + MCP surfaces on top of that graph.

Architecture

AgenticCodebase models source code as a directed graph G = (U, E) where each vertex is a typed CodeUnit and each edge carries a semantic relationship.

Compilation Pipeline

Source Files (Py/Rust/TS/JS/Go/C++/Java/C#)
    -> tree-sitter Parse
    -> Semantic Analysis
    -> Graph Builder
    -> .acb Binary

Binary Format (.acb)

Section	Size	Description
Header	128 B	Magic, version, counts, offsets
Unit Table	96 × N B	Fixed-size unit records, N = unit count (O(1) access)
Edge Table	40 × M B	Fixed-size edge records, M = edge count
String Pool	Variable	LZ4-compressed names and paths
Feature Vectors	Variable	f32 embedding arrays
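
Fixed-size records are what make O(1) access possible: the position of record i is a multiplication away. The arithmetic can be sketched as follows, assuming the sections are stored contiguously in the order listed. That ordering is an assumption; the header stores offsets, so a real reader should use those instead. The edge count in the example is made up.

```python
HEADER_BYTES = 128  # header size from the table
UNIT_BYTES = 96     # bytes per unit record
EDGE_BYTES = 40     # bytes per edge record

def unit_offset(index):
    """Byte offset of unit record `index`, assuming units follow the header."""
    return HEADER_BYTES + UNIT_BYTES * index

def edge_offset(index, unit_count):
    """Byte offset of edge record `index`, assuming edges follow the unit table."""
    return HEADER_BYTES + UNIT_BYTES * unit_count + EDGE_BYTES * index

def fixed_size(unit_count, edge_count):
    """Bytes used by the header and both tables, excluding variable sections."""
    return HEADER_BYTES + UNIT_BYTES * unit_count + EDGE_BYTES * edge_count

print(unit_offset(42))             # 4160
print(edge_offset(0, 10_000))      # 960128
print(fixed_size(10_000, 50_000))  # 2960128
```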

Code Unit Types (13)

Function, Method, Class, Struct, Enum, Interface, Trait, Module, Import, Variable, Constant, TypeAlias, Macro

Edge Types (18)

Calls, CalledBy, Imports, ImportedBy, Contains, ContainedBy, Inherits, InheritedBy, Implements, ImplementedBy, Uses, UsedBy, Returns, Accepts, Overrides, OverriddenBy, Tests, TestedBy
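
Sixteen of the eighteen edge types form forward/inverse pairs; `Returns` and `Accepts` are listed without an inverse. The sketch below models an edge and derives inverse edges from forward ones. Whether the real graph stores both directions or derives one from the other is not stated here, so treat the `Edge` class and `with_inverses` as hypothetical.

```python
from dataclasses import dataclass

# Forward/inverse pairs taken from the edge type list above.
INVERSE_PAIRS = [
    ("Calls", "CalledBy"), ("Imports", "ImportedBy"), ("Contains", "ContainedBy"),
    ("Inherits", "InheritedBy"), ("Implements", "ImplementedBy"), ("Uses", "UsedBy"),
    ("Overrides", "OverriddenBy"), ("Tests", "TestedBy"),
]
INVERSE = {}
for forward, backward in INVERSE_PAIRS:
    INVERSE[forward] = backward
    INVERSE[backward] = forward

@dataclass(frozen=True)
class Edge:
    source: int  # id of the originating code unit
    kind: str    # one of the 18 edge types
    target: int  # id of the destination code unit

def with_inverses(edges):
    """Return the edge set closed under inversion, e.g. Calls implies CalledBy."""
    closed = set(edges)
    for e in edges:
        if e.kind in INVERSE:
            closed.add(Edge(e.target, INVERSE[e.kind], e.source))
    return closed

edges = [Edge(1, "Calls", 2), Edge(1, "Returns", 3)]
print(len(with_inverses(edges)))  # 3
```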

Install

cargo install agentic-codebase-cli agentic-codebase-mcp

Install script (binary release + MCP config merge):

curl -fsSL https://agentralabs.tech/install/codebase | bash

Environment profiles (one command per environment):

# Desktop MCP clients (auto-merge Claude Desktop + Claude Code when detected)
curl -fsSL https://agentralabs.tech/install/codebase/desktop | bash

# Terminal-only (no desktop config writes)
curl -fsSL https://agentralabs.tech/install/codebase/terminal | bash

# Remote/server hosts (no desktop config writes)
curl -fsSL https://agentralabs.tech/install/codebase/server | bash

Channel	Command	Result
crates.io (official)	`cargo install agentic-codebase-cli agentic-codebase-mcp`	Installs `acb` and `agentic-codebase-mcp`
GitHub installer (official)	`curl -fsSL https://agentralabs.tech/install/codebase | bash`	Installs release binaries when available, otherwise source fallback; merges MCP config
GitHub installer (desktop profile)	`curl -fsSL https://agentralabs.tech/install/codebase/desktop | bash`	Explicit desktop profile behavior
GitHub installer (terminal profile)	`curl -fsSL https://agentralabs.tech/install/codebase/terminal | bash`	Installs binaries only; no desktop config writes
GitHub installer (server profile)	`curl -fsSL https://agentralabs.tech/install/codebase/server | bash`	Installs binaries only; server-safe behavior
npm (wasm)	`npm install @agenticamem/codebase`	WASM-based codebase SDK for Node.js and browser

Server auth and artifact sync

For cloud/server runtime:

export AGENTIC_TOKEN="$(openssl rand -hex 32)"

All MCP clients must send Authorization: Bearer <same-token>. If .acb/.amem/.avis files are on another machine, sync them to the server first.
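
A sketch of the token handshake from the client and server side. It assumes nothing beyond the `Authorization: Bearer <token>` header described above; how the server actually verifies the token is not documented here, so `is_authorized` is a hypothetical constant-time comparison shown for illustration.

```python
import hmac
import secrets

def new_token():
    """Generate 32 random bytes as hex, like `openssl rand -hex 32`."""
    return secrets.token_hex(32)

def auth_header(token):
    """Header a client would attach to every request."""
    return {"Authorization": f"Bearer {token}"}

def is_authorized(headers, expected_token):
    """Constant-time check of a presented bearer token against the expected one."""
    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    return scheme == "Bearer" and hmac.compare_digest(
        presented.encode("utf-8"), expected_token.encode("utf-8")
    )

token = new_token()
print(is_authorized(auth_header(token), token))  # True
print(is_authorized({"Authorization": "Bearer wrong"}, token))  # False
```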

Deployment Model

Standalone by default: AgenticCodebase is independently installable and operable. Integration with AgenticMemory or AgenticVision is optional, never required.
Autonomic operations by default: compile/runtime maintenance uses safe profile-based defaults with rolling backups, migration safeguards, collective cache maintenance, and health-ledger snapshots.

Area	Default behavior	Controls
Autonomic profile	Conservative local-first posture	`ACB_AUTONOMIC_PROFILE=desktop`
Rolling backup	Compile writes checkpointed backups for existing outputs	`ACB_AUTO_BACKUP`, `ACB_AUTO_BACKUP_RETENTION`, `ACB_AUTO_BACKUP_DIR`
Storage migration	Policy-gated with checkpointed auto-safe path	`ACB_STORAGE_MIGRATION_POLICY=auto-safe`
Storage budget policy	20-year projection + backup rollup when budget pressure appears	`ACB_STORAGE_BUDGET_MODE=auto-rollup`
Collective cache maintenance	Periodic expiry cleanup of collective cache entries	`ACB_COLLECTIVE_CACHE_MAINTENANCE_SECS`
Maintenance throttling	SLA-aware under sustained registry load	`ACB_SLA_MAX_REGISTRY_OPS_PER_MIN`
Health ledger	Periodic operational snapshots (default: `~/.agentra/health-ledger`)	`ACB_HEALTH_LEDGER_DIR`, `AGENTRA_HEALTH_LEDGER_DIR`, `ACB_HEALTH_LEDGER_EMIT_SECS`

See the Full Installation Guide for all options including MCP server setup, library usage, and build-from-source instructions.

Integration with Agentic Ecosystem

AgenticCodebase is part of the Agentic ecosystem:

AgenticMemory -- Persistent, navigable memory for AI agents
AgenticVision -- Visual memory and image understanding for AI agents
AgenticCodebase -- Semantic code understanding for AI agents
AgenticIdentity -- Verifiable trust, receipts, and competence for AI agents

All four speak MCP for seamless AI agent integration. Run all four servers together for an agent with memory, vision, code understanding, and verifiable identity.

Implementation

Language: Rust
Source files: 58
Lines of code: 13,709
Tests: 567 (38 unit + 460 integration + 69 V2 stress)
Benchmarks: 21 Criterion benchmarks
Clippy warnings: 0
Supported languages: Python, Rust, TypeScript, JavaScript, Go, C++, Java, C#

The .acb File

Your agent's code knowledge. Semantic understanding.


Size	~2-3 GB over 20 years
Format	Binary semantic graph
Works with	Any coding model

Repository Structure

This is a Cargo workspace monorepo containing the core library, CLI, MCP server, and FFI bindings.

agentic-codebase/
├── Cargo.toml                    # Workspace root
├── src/                          # Core library (crates.io: agentic-codebase)
├── crates/
│   ├── agentic-codebase-cli/     # CLI (crates.io: agentic-codebase-cli) — `acb` binary
│   ├── agentic-codebase-mcp/     # MCP server (crates.io: agentic-codebase-mcp)
│   └── agentic-codebase-ffi/     # FFI bindings (crates.io: agentic-codebase-ffi)
├── ffi/                          # Additional FFI support
├── python/                       # Python SDK (PyPI: agentic-codebase)
├── npm/                          # npm WASM package (@agenticamem/codebase)
├── tests/                        # Integration + stress tests
├── benches/                      # Criterion benchmarks
├── testdata/                     # Test fixtures
├── paper/                        # Research paper (Paper I: Semantic Compiler)
├── docs/                         # API reference, guides
├── examples/                     # Usage examples
└── scripts/                      # Build and release scripts

Running Tests

# All workspace tests (unit + integration + stress)
cargo test --workspace

# Core library only
cargo test -p agentic-codebase

# Benchmarks
cargo bench --workspace

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Privacy and Security

All graphs stay local in .acb files -- no telemetry, no cloud sync by default.
.acb files contain structural metadata (symbols, edges, relationships), not raw source code.
Gate checks (acb gate) enforce risk thresholds and test-coverage requirements before merges.
Server mode requires an explicit AGENTIC_TOKEN environment variable for bearer auth.
Budget governance prevents unbounded artifact growth with 20-year projection and backup rollup.

License

MIT -- see LICENSE for details.