Context Builder

A blazing-fast CLI for creating LLM context from your entire codebase.

Tired of manually copy-pasting files into your LLM prompts? Context Builder automates this tedious process, creating a single, clean, and context-rich markdown file from any directory.

Why Context Builder?

Providing broad context to Large Language Models (LLMs) is key to getting high-quality, relevant responses. This tool was built to solve one problem exceptionally well: packaging your project's source code into a clean, LLM-friendly format with zero fuss.

It's a command-line utility that recursively processes directories and creates comprehensive markdown documentation, optimized for AI conversations.

Core Features

⚡ Blazing Fast & Parallel by Default:
Processes thousands of files in seconds by leveraging all available CPU cores.
🧠 Smart & Efficient File Discovery:
Respects .gitignore and custom ignore patterns out of the box, using optimized, parallel directory traversal.
💾 Memory-Efficient Streaming:
Handles massive files with ease by reading and writing line-by-line, keeping memory usage low.
🌳 Clear File Tree Visualization:
Generates an easy-to-read directory structure at the top of the output file.
🔍 Powerful Filtering & Preview:
Easily include only the file extensions you need and use the instant --preview mode to see what will be processed.
⚙️ Configuration-First:
Use a .context-builder.toml file to store your preferences for consistent, repeatable outputs.
🔁 Automatic Per-File Diffs:
When enabled, automatically generates a clean, noise-reduced diff showing what changed between snapshots.
✂️ Diff-Only Mode:
Output only the change summary and modified file diffs, with no full file bodies, to minimize token usage.
🧪 Accurate Token Counting:
Get real tokenizer–based estimates with --token-count to plan your prompt budgets.
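The streaming idea behind the memory-efficiency claim can be sketched in a few lines of Rust. This is an illustrative sketch, not Context Builder's actual code: the function name `copy_with_line_numbers` and the `{:>4} | ` prefix format are assumptions made for the example.

```rust
use std::io::{self, BufRead, Write};

/// Copies `reader` to `writer` one line at a time, prefixing each line with
/// its 1-based number. Only the current line is held in memory, so the cost
/// does not grow with file size.
/// NOTE: the "{:>4} | " prefix is a hypothetical format chosen for this sketch.
pub fn copy_with_line_numbers<R: BufRead, W: Write>(
    reader: R,
    mut writer: W,
) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        let line = line?; // propagate read errors (e.g. invalid UTF-8)
        count += 1;
        writeln!(writer, "{:>4} | {}", count, line)?;
    }
    Ok(count)
}
```

Wrapping a `File` in `std::io::BufReader` and passing a `std::io::BufWriter` as the writer gives the same line-by-line behaviour for files on disk.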

Installation

From crates.io (Recommended)

cargo install context-builder

From source

git clone https://github.com/igorls/context-builder.git
cd context-builder
cargo install --path .

Usage

Basic Usage

# Process current directory and create output.md
context-builder

# Process a specific directory
context-builder -d /path/to/project

# Specify an output file
context-builder -d /path/to/project -o documentation.md

Advanced Options

# Filter by file extensions (e.g., only Rust and TOML files)
context-builder -f rs -f toml

# Ignore specific folders/files by name
context-builder -i target -i node_modules -i .git

# Preview mode (shows the file tree without generating output)
context-builder --preview

# Token count mode (count the tokens in the final document using a real tokenizer)
context-builder --token-count

# Add line numbers to all code blocks
context-builder --line-numbers

# Combine multiple options for a powerful workflow
context-builder -d ./src -f rs -f toml -i tests --line-numbers -o rust_context.md
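To make the interaction between -f and -i concrete, here is a minimal Rust sketch of a per-path decision. It rests on assumptions not confirmed by this README: that an ignore entry matches any path component by exact name, and that an empty filter list accepts every extension. The function `should_include` is hypothetical and is not part of the tool's API.

```rust
use std::path::Path;

/// Decides whether `path` would be kept, given extension filters (-f)
/// and ignored names (-i).
/// ASSUMPTIONS: ignore entries match whole path components by exact name;
/// an empty `filters` slice means "accept all extensions".
pub fn should_include(path: &Path, filters: &[&str], ignores: &[&str]) -> bool {
    // Reject the path if any component (folder or file name) is ignored.
    let ignored = path.components().any(|component| {
        component
            .as_os_str()
            .to_str()
            .map_or(false, |name| ignores.iter().any(|ignore| *ignore == name))
    });
    if ignored {
        return false;
    }
    // No filters: everything that was not ignored is kept.
    if filters.is_empty() {
        return true;
    }
    // Otherwise the file extension must be one of the requested ones.
    path.extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| filters.iter().any(|filter| *filter == ext))
}
```

Under these assumptions, `src/main.rs` passes `-f rs -f toml -i tests`, while `tests/cli.rs` is rejected because of its `tests` component.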

Configuration

For more complex projects, you can use a .context-builder.toml file in your project's root directory to store your preferences. This is great for ensuring consistent outputs and avoiding repetitive command-line flags.

Example `.context-builder.toml`

# Default output file name
output = "context.md"

# Default output folder
output_folder = "docs/context"

# Create timestamped versions of the output file (e.g., context_20250912123000.md)
timestamped_output = true

# Automatically compute per-file diffs against the previous timestamped snapshot
auto_diff = true

# Emit only change summary + modified file diffs (omit full file bodies)
# Set to true to greatly reduce token usage when you just need what's changed.
diff_only = false

# File extensions to include
filter = ["rs", "toml", "md"]

# Folders or file names to ignore
ignore = ["target", "node_modules", ".git"]

# Add line numbers to code blocks
line_numbers = true
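The timestamped file name shown in the comment above (context_20250912123000.md) follows a stem_timestamp.extension pattern. A small Rust sketch of that naming rule is given below; the helper `timestamped_name` is hypothetical, and the timestamp is passed in as a ready-made string so the example needs no date library.

```rust
use std::path::Path;

/// Builds a timestamped variant of `output`, e.g.
/// ("context.md", "20250912123000") -> "context_20250912123000.md".
/// This mirrors the pattern in the README example; it is not the tool's code.
pub fn timestamped_name(output: &str, timestamp: &str) -> String {
    let path = Path::new(output);
    // Fall back to the whole name if the stem cannot be extracted.
    let stem = path.file_stem().and_then(|s| s.to_str()).unwrap_or(output);
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{stem}_{timestamp}.{ext}"),
        None => format!("{stem}_{timestamp}"),
    }
}
```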

Auto-diff

When using timestamped_output = true together with auto_diff = true, Context Builder compares the previous canonical snapshot to the newly generated one and produces:

A Change Summary (Added / Removed / Modified files)
A File Differences section containing only modified files (added & removed are summarized but not diffed)

If you also set diff_only = true (or pass --diff-only), the full “## Files” section is omitted to conserve tokens: you get just the header + tree, the Change Summary, and per-file diffs for modified files.
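The Added / Removed / Modified classification described above can be sketched as a comparison of two snapshots. The sketch assumes each snapshot is a map from file path to file content; that representation, the `ChangeSummary` struct, and the `summarize_changes` function are all illustrative and may differ from how Context Builder stores snapshots internally.

```rust
use std::collections::BTreeMap;

/// Paths grouped by how they changed between two snapshots.
#[derive(Debug, Default, PartialEq)]
pub struct ChangeSummary {
    pub added: Vec<String>,
    pub removed: Vec<String>,
    pub modified: Vec<String>,
}

/// Compares two snapshots (path -> content) and classifies every path.
/// Unchanged files appear in none of the three lists.
pub fn summarize_changes(
    previous: &BTreeMap<String, String>,
    current: &BTreeMap<String, String>,
) -> ChangeSummary {
    let mut summary = ChangeSummary::default();
    for (path, new_body) in current {
        match previous.get(path) {
            // Present now but not before: added.
            None => summary.added.push(path.clone()),
            // Present in both with different content: modified.
            Some(old_body) if old_body != new_body => summary.modified.push(path.clone()),
            // Present in both with identical content: unchanged.
            Some(_) => {}
        }
    }
    // Present before but not now: removed.
    for path in previous.keys() {
        if !current.contains_key(path) {
            summary.removed.push(path.clone());
        }
    }
    summary
}
```

Only the paths in `modified` would then need a per-file diff, which matches the behaviour described above, where added and removed files are summarized but not diffed.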

Note: Command-line arguments will always override the settings in the configuration file.
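One common way to implement this kind of precedence is to model each setting as an optional value and let the command-line value win when it is present. The Rust sketch below shows the idea for two settings; the `Settings` struct and `resolve` function are hypothetical, the output.md default comes from the option list in this README, and the `false` default for line numbers is an assumption.

```rust
/// Settings as they might arrive from one source (CLI or config file).
/// `None` means "not specified by this source".
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Settings {
    pub output: Option<String>,
    pub line_numbers: Option<bool>,
}

/// Resolves the effective values: CLI first, then config file, then default.
pub fn resolve(cli: &Settings, file: &Settings) -> (String, bool) {
    let output = cli
        .output
        .clone()
        .or_else(|| file.output.clone())
        .unwrap_or_else(|| "output.md".to_string()); // documented default
    let line_numbers = cli
        .line_numbers
        .or(file.line_numbers)
        .unwrap_or(false); // assumed default for this sketch
    (output, line_numbers)
}
```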

Command Line Options

-d, --input <PATH> - Directory path to process (default: current directory).
-o, --output <FILE> - Output file path (default: output.md).
-f, --filter <EXT> - File extensions to include (can be used multiple times).
-i, --ignore <NAME> - Folder or file names to ignore (can be used multiple times).
--preview - Preview mode: only show the file tree, don't generate output.
--token-count - Token count mode: count the tokens in the final document using a real tokenizer.
--line-numbers - Add line numbers to code blocks in the output.
--diff-only - With --auto-diff + --timestamped-output, output only change summary + modified file diffs (omit full file bodies).
-h, --help - Show help information.
-V, --version - Show version information.

Token Counting

Context Builder uses the tiktoken-rs library, which implements the tokenizers used by OpenAI models. For those models the reported count should closely match the number of tokens the model actually consumes; for models from other providers, which use different tokenizers, treat it as an estimate.

Documentation

DEVELOPMENT.md: For contributors. Covers setup, testing, linting, and release process.
BENCHMARKS.md: For performance enthusiasts. Details on running benchmarks and generating datasets.
CHANGELOG.md: A complete history of releases and changes.

Contributing

Contributions are welcome! Please see DEVELOPMENT.md for setup instructions and guidelines. For major changes, please open an issue first to discuss what you would like to change.

Changelog

See CHANGELOG.md for a complete history of releases and changes.

License

This project is licensed under the MIT License. See the LICENSE file for details.

context-builder 0.4.1