pith

Generate structured codebase context for LLM consumption.
What is pith?
Pith extracts the essential structure of a codebase into a format optimized for large language models. Rather than dumping raw source files, pith produces codemaps: AST-extracted declarations, imports, and signatures that capture what exists in your code without the implementation noise.
The output is designed around three goals:
- Token efficiency: Codemaps capture the shape of code (what functions exist, what they accept and return) using fewer tokens than full source
- Structural clarity: Organized sections (<file_map>, <codemaps>, <selected_files>) help LLMs parse and reason about code structure
- Selective disclosure: Include full source only for files you're actively working on; use codemaps for surrounding context
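To make the idea of a codemap concrete, here is a deliberately naive sketch that keeps only function signatures and drops the bodies. It is a line-based string filter, not the tree-sitter extraction pith uses, and the name `naive_signatures` is hypothetical.

```rust
/// Collect function signatures from Rust source, one per line.
/// Illustration only: a real extractor walks a syntax tree and also
/// handles multi-line signatures, structs, traits, and so on.
pub fn naive_signatures(source: &str) -> Vec<String> {
    source
        .lines()
        .map(str::trim)
        // Keep only lines that open a function declaration.
        .filter(|l| {
            l.starts_with("fn ")
                || l.starts_with("pub fn ")
                || l.starts_with("pub async fn ")
        })
        // Drop the opening brace so only the signature remains.
        .map(|l| l.trim_end_matches('{').trim_end().to_string())
        .collect()
}
```

Even this crude version shows the trade: the signature says what a function accepts and returns, while the body (usually most of the tokens) is left out.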
Pith is both a CLI tool and a Rust library.
Quick Start
# Generate context for current directory, copy to clipboard
|
# Then paste into Claude, ChatGPT, or your LLM of choice
Supported Languages
| Language | Extensions |
|---|---|
| Rust | .rs |
| TypeScript | .ts, .tsx |
| JavaScript | .js, .jsx, .mjs, .cjs |
| Python | .py, .pyi |
| Go | .go |
Extraction uses tree-sitter for accurate parsing.
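The table above amounts to a lookup from file extension to language. A minimal sketch of that mapping follows; the function names and the lowercase language labels are assumptions for illustration, not pith's actual API.

```rust
use std::path::Path;

/// Map a file extension (without the leading dot) to a language label.
/// The entries mirror the Supported Languages table.
pub fn language_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        "rs" => Some("rust"),
        "ts" | "tsx" => Some("typescript"),
        "js" | "jsx" | "mjs" | "cjs" => Some("javascript"),
        "py" | "pyi" => Some("python"),
        "go" => Some("go"),
        _ => None,
    }
}

/// Look up the language for a path based on its extension.
/// Returns None for unsupported or extensionless files.
pub fn language_for_path(path: &str) -> Option<&'static str> {
    Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .and_then(language_for_extension)
}
```

Files that map to `None` would simply get no codemap in a scheme like this.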
CLI Usage
Commands
pith tree <PATH> # Display file tree with metadata
pith codemap <PATH> # Extract API signatures only
pith context <PATH> # Full context: tree + codemaps + selected files
pith tokens <PATH> # Count tokens for budget planning
pith languages # Show supported languages
Key Options
--select <PATTERN> # Include full source for matching files (glob)
--lang <LANG> # Filter to specific language(s)
--json # Output as JSON (for programmatic use)
--include-docs # Include doc comments in codemaps
--include-private # Include private/internal items
Example: Generate context with selected files
Example Output
src/
├── api/
│ ├── handlers.rs [rust, 245 lines, 6.2KB] *+
│ └── routes.rs [rust, 89 lines, 2.1KB] *+
├── db/
│ └── queries.rs [rust, 156 lines, 4.1KB] +
└── lib.rs [rust, 42 lines, 1.0KB] +
Legend: * = selected, + = has codemap
## src/db/queries.rs
### Imports
- use sqlx::{Pool, Postgres}
- use crate::models::{User, Post}
### Declarations
#### pub async fn get_user (pool: &Pool<Postgres>, id: i64) -> Result<User> (lines 12-18)
#### pub async fn list_posts (pool: &Pool<Postgres>, limit: i32) -> Result<Vec<Post>> (lines 20-31)
---
## src/lib.rs
### Imports
- use api::{handlers, routes}
### Declarations
#### pub fn create_app () -> Router (lines 8-15)
--- src/api/handlers.rs (245 lines, 1,823 tokens) ---
// Full file content here...
--- src/api/routes.rs (89 lines, 672 tokens) ---
// Full file content here...
Total: 3,241 tokens
Component breakdown:
- File tree: 89 tokens
- Codemaps: 657 tokens
- Selected files: 2,495 tokens
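The component breakdown sums to the reported total (89 + 657 + 2,495 = 3,241). A small sketch of how such a summary could be checked against a context limit is shown below; `TokenSummary` and its fields are hypothetical names, not types exported by pith.

```rust
/// Token counts per output section (hypothetical struct for illustration).
pub struct TokenSummary {
    pub file_tree: usize,
    pub codemaps: usize,
    pub selected_files: usize,
}

impl TokenSummary {
    /// Total tokens across all sections.
    pub fn total(&self) -> usize {
        self.file_tree + self.codemaps + self.selected_files
    }

    /// Tokens left under `context_limit`, or None if the output
    /// already exceeds the limit.
    pub fn remaining(&self, context_limit: usize) -> Option<usize> {
        context_limit.checked_sub(self.total())
    }
}
```

With the numbers from the example, an 8,000-token budget would leave 4,759 tokens for the prompt and the reply.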
Understanding the Output
- <file_map>: Directory tree with metadata. * marks selected files (full content included), + marks files with codemaps.
- <codemaps>: Per-file API signatures extracted via tree-sitter. Shows imports, function signatures, struct definitions, etc. without implementation bodies.
- <selected_files>: Full source content for files matching --select patterns.
- <token_summary>: Token counts for budget planning against context limits.
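As an illustration of the marker convention in the file tree, the sketch below renders one tree entry in the same shape as the example output. The `TreeEntry` struct, the `render_entry` function, and the choice of 1 KB = 1024 bytes are assumptions, not pith's actual types or rounding rules.

```rust
/// One file in the tree (hypothetical struct for illustration).
pub struct TreeEntry<'a> {
    pub name: &'a str,
    pub language: &'a str,
    pub lines: usize,
    pub bytes: u64,
    pub selected: bool,
    pub has_codemap: bool,
}

/// Render an entry as `name [lang, N lines, X.YKB] markers`.
pub fn render_entry(e: &TreeEntry<'_>) -> String {
    // `*` marks a selected file, `+` marks a file with a codemap.
    let mut markers = String::new();
    if e.selected {
        markers.push('*');
    }
    if e.has_codemap {
        markers.push('+');
    }

    // Assumption: sizes are shown in units of 1024 bytes, one decimal.
    let kb = e.bytes as f64 / 1024.0;
    let mut line = format!(
        "{} [{}, {} lines, {:.1}KB]",
        e.name, e.language, e.lines, kb
    );
    if !markers.is_empty() {
        line.push(' ');
        line.push_str(&markers);
    }
    line
}
```

A file that is both selected and mapped carries both markers, which matches the `*+` suffix in the example tree.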
Library Usage
use ;
let result = new
.languages
.include_docs
.build?;
println!;
println!;
Pith automatically respects .gitignore and detects binary/minified/generated files.
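How pith detects binary or minified files is not documented here. The sketch below shows two common heuristics for this kind of check: a NUL byte near the start of the file suggests binary content, and a very high average line length suggests minified code. Both function names and both thresholds are assumptions for illustration.

```rust
/// Treat content as binary if a NUL byte appears in the first 8000 bytes.
/// The window size is an assumed value, not taken from pith.
pub fn looks_binary(bytes: &[u8]) -> bool {
    let window = &bytes[..bytes.len().min(8000)];
    window.contains(&0)
}

/// Treat text as minified if its average line length (in bytes)
/// exceeds `max_avg_line_len`. Empty input is never minified.
pub fn looks_minified(text: &str, max_avg_line_len: usize) -> bool {
    let lines = text.lines().count();
    if lines == 0 {
        return false;
    }
    text.len() / lines > max_avg_line_len
}
```

Checks like these are cheap because they only need a prefix of the file or a single pass over it, so they can run before any parsing is attempted.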
Limitations
- Language coverage: Currently supports Rust, TypeScript, JavaScript, Python, and Go. No C/C++, Java, Ruby, etc.
- Partial parsing: Syntactically invalid code may produce incomplete codemaps.
- No semantic analysis: Type resolution is not performed. Import paths are extracted as-is.
This is a young project, and the API may change between versions.
Contributing
Issues and pull requests welcome. Please run cargo test before submitting.
License
MIT