cxpak
Spends CPU cycles so you don't spend tokens. The LLM gets a briefing packet instead of a flashlight in a dark room.
A Rust CLI that indexes codebases using tree-sitter and produces token-budgeted context bundles for LLMs.
Installation
# Via Homebrew (macOS/Linux)
# Via cargo
Claude Code Plugin
cxpak ships as a Claude Code plugin — skills auto-trigger when you ask about codebase structure or changes, and slash commands give you direct control.
Install the plugin:
/plugin marketplace add Barnett-Studios/cxpak
/plugin install cxpak
Skills (auto-invoked):
| Skill | Triggers when you... |
|---|---|
codebase-context |
Ask about project structure, architecture, how components relate |
diff-context |
Ask to review changes, prepare a PR description, understand what changed |
Commands (user-invoked):
| Command | Description |
|---|---|
/cxpak:overview |
Generate a structured repo summary |
/cxpak:trace <symbol> |
Trace a symbol through the dependency graph |
/cxpak:diff |
Show changes with dependency context |
/cxpak:clean |
Remove .cxpak/ cache and output files |
The plugin auto-downloads the cxpak binary if it's not already installed.
Usage
# Structured repo summary within a token budget
# Write output to a file
# Focus on a specific directory (boosts ranking)
# Trace from a function/error, pack relevant code paths
# Trace with full dependency graph traversal
# Different output formats
# Show changes with dependency context (vs working tree)
# Diff against a specific ref
# Diff by time range
# Full dependency graph context
# Print pipeline timing info
# Clean cache and output files
Daemon Mode
With the daemon feature flag, cxpak can run as a persistent server with a hot index that updates on file changes.
# Install with daemon support
# Watch for file changes and keep index hot
# Start HTTP server (default port 3000)
# Start as MCP server over stdio
HTTP API
When running cxpak serve, these endpoints are available:
| Endpoint | Description |
|---|---|
GET /health |
Health check |
GET /stats |
Language stats and token counts |
GET /overview?tokens=50000 |
Structured repo summary |
GET /trace?target=handle_request |
Trace a symbol through dependencies |
GET /diff?git_ref=HEAD~1 |
Show changes with dependency context |
MCP Server
When running cxpak serve --mcp, cxpak speaks Model Context Protocol over stdin/stdout. It exposes seven tools (all support a focus path prefix parameter):
| Tool | Description |
|---|---|
cxpak_overview |
Structured repo summary |
cxpak_trace |
Trace a symbol through dependencies |
cxpak_stats |
Language stats and token counts |
cxpak_diff |
Show changes with dependency context |
cxpak_context_for_task |
Score and rank files by relevance to a task |
cxpak_pack_context |
Pack selected files into a token-budgeted bundle |
cxpak_search |
Regex search with context lines |
What You Get
The overview command produces a structured briefing with these sections:
- Project Metadata — file counts, languages, estimated tokens
- Directory Tree — full file listing
- Module / Component Map — files with their public symbols
- Dependency Graph — import relationships between files
- Key Files — full content of README, config files, manifests
- Function / Type Signatures — every public symbol's signature
- Git Context — recent commits, file churn, contributors
Each section has a budget allocation. When content exceeds its budget, it's truncated with the most important items preserved first.
Context Quality
cxpak applies intelligent context management to maximize the usefulness of every token:
Progressive Degradation — When content exceeds the budget, symbols are progressively reduced through 5 detail levels (Full → Trimmed → Documented → Signature → Stub). High-relevance files keep full detail while low-relevance dependencies are summarized. Selected files never degrade below Documented; dependencies can be dropped entirely as a last resort.
Concept Priority — Symbols are ranked by type: functions/methods (1.0) > structs/classes (0.86) > API surface (0.71) > configuration (0.57) > documentation (0.43) > constants (0.29). This determines degradation order — functions survive longest.
Query Expansion — When using context_for_task, queries are expanded with ~30 core synonym mappings (e.g., "auth" → authentication, login, jwt, oauth) plus 8 domain-specific maps (Web, Database, Auth, Infra, Testing, API, Mobile, ML) activated automatically by detecting file patterns in the repo.
Context Annotations — Each packed file gets a language-aware comment header showing its relevance score, role (selected/dependency), signal breakdown, and detail level. The LLM knows exactly why each file was included and how much detail it's seeing.
Chunk Splitting — Symbols exceeding 4000 tokens are split into labeled chunks (e.g., handler [1/3]) that degrade independently. Each chunk carries the parent signature for context.
Pack Mode
When a repo exceeds the token budget, cxpak automatically switches to pack mode:
- The overview stays within budget (one file, fits in one LLM prompt)
- A
.cxpak/directory is created with full untruncated detail files - Truncated sections in the overview get pointers to their detail files
repo/
.cxpak/
tree.md # complete directory tree
modules.md # every file, every symbol
dependencies.md # full import graph
signatures.md # every public signature
key-files.md # full key file contents
git.md # full git history
Detail file extensions match --format: .md for markdown, .json for json, .xml for xml.
The overview tells the LLM what exists. The detail files let it drill in on demand. .cxpak/ is automatically added to .gitignore.
If the repo fits within budget, you get a single file with everything — no .cxpak/ directory needed.
Caching
cxpak caches parse results in .cxpak/cache/ to speed up re-runs. The cache is keyed on file modification time and size — when a file changes, it's automatically re-parsed.
To clear the cache and all output files:
Supported Languages (42)
Tier 1 — Full extraction (functions, classes, methods, imports, exports): Rust, TypeScript, JavaScript, Python, Java, Go, C, C++, Ruby, C#, Swift, Kotlin, Bash, PHP, Dart, Scala, Lua, Elixir, Zig, Haskell, Groovy, Objective-C, R, Julia, OCaml, MATLAB
Tier 2 — Structural extraction (selectors, headings, keys, blocks, targets, etc.): CSS, SCSS, Markdown, JSON, YAML, TOML, Dockerfile, HCL/Terraform, Protobuf, Svelte, Makefile, HTML, GraphQL, XML
Database DSLs: SQL, Prisma
Tree-sitter grammars are compiled in. All 42 languages are enabled by default. Language features can be toggled:
# Only Rust and Python support
License
MIT
About
Built and maintained by Barnett Studios — building products, teams, and systems that last. Part-time technical leadership for startups and scale-ups.