flat
Pack an entire codebase into a single file, ready to paste into any AI.
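A single command does the packing; the path and output filename below are placeholders:

$ flat src/ -o context.xml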
That's it. .gitignore respected, secrets stripped, binaries skipped — automatically.
But the real power is fitting more code into a context window:
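# illustrative: compress to signatures and cap to a budget counted with the Claude tokenizer (values are placeholders)
$ flat src/ --compress --tokens 100k --tokenizer claude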
This compresses source code to its signatures (stripping function bodies, keeping structure) and packs files by priority until the token budget is full. README and entry points go in first. Test fixtures get cut first. Real tokenizer support (Claude, GPT-4, GPT-3.5) gives you accurate counts instead of heuristics.
Choose your output format:
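$ flat src/                      # XML (default)
$ flat src/ --format markdown    # Markdown with fenced code blocks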
Install
Build from source with Cargo, or grab a prebuilt binary from Releases (macOS, Linux, Windows).
What You Get
$ flat src/ --include rs
<file path="src/tokens.rs">
pub fn estimate_tokens(content: &str, is_prose: bool) -> usize {
    let byte_count = content.len();
    if is_prose {
        byte_count / 4
    } else {
        byte_count / 3
    }
}

pub fn is_prose_extension(ext: &str) -> bool {
    matches!(ext.to_lowercase().as_str(), "md" | "txt" | "rst" ...)
}
</file>
$ flat src/ --compress --include rs
<file path="src/tokens.rs" mode="compressed">
pub fn estimate_tokens(content: &str, is_prose: bool) -> usize { ... }
pub fn is_prose_extension(ext: &str) -> bool { ... }
</file>
Same file. Same API surface. 60% fewer tokens.
The Three Powers
flat has three features that compose together. Each is useful alone. Combined, they let you fit any codebase into any context window.
1. --compress — structural compression
Uses tree-sitter to parse source files across 12 languages, keep the structure, strip the implementation:
| Kept | Stripped |
|---|---|
| imports, require(), use | function/method bodies |
| type definitions, interfaces | loop contents |
| struct/class declarations | if/else branches |
| function signatures | variable assignments inside functions |
| decorators, attributes | |
| docstrings, comments | |
| module-level constants | |
| enums, preprocessor directives | |
Supported languages: Rust, TypeScript/JavaScript (JSX/TSX), Python, Go, Java, C#, C, C++, Ruby, PHP, Solidity, Elixir.
| Language | Keeps | Body placeholder |
|---|---|---|
| Rust | use/mod/extern crate, attributes, macros, structs, enums, trait/impl signatures | { ... } |
| TS/JS (JSX/TSX) | imports, interfaces, type aliases, enums, class member signatures, exports | { ... } |
| Python | imports, docstrings, decorators, class variables, module constants | ... |
| Go | package, imports, type/const/var declarations | { ... } |
| Java | package, imports, class/interface/enum declarations, fields, constants | { ... } |
| C# | using, namespaces, class/struct/record/interface, properties, events | { ... } |
| C | #include/#define/preprocessor, typedefs, struct/enum/union | { ... } |
| C++ | preprocessor, templates, namespaces, classes with members, using/aliases | { ... } |
| Ruby | require, assignments, class/module structure | ...\nend |
| PHP | <?php, use/namespace, class/interface/trait/enum, properties | { ... } |
| Solidity | pragma, imports, contract/interface/library, event/error/struct/enum declarations | { ... } |
| Elixir | defmodule, use/import/alias/require, module attributes, typespecs | ...\nend |
Files in other languages pass through in full — nothing is silently dropped. If tree-sitter can't parse a file (syntax errors, unsupported features), the original is included with a stderr warning.
Real-world results:
| Codebase | Files | Full | Compressed | Reduction |
|---|---|---|---|---|
| Express | 6 | 61 KB | 28 KB | 54% |
| Flask | 24 | 339 KB | 214 KB | 37% |
| Next.js packages/next/src | 1,605 | 8.0 MB | 5.6 MB | 31% |
2. --tokens N — token budget
Caps output to fit a context window. Files are scored by importance and packed greedily — high-value files first, low-value files dropped:
| Priority | Score | Examples |
|---|---|---|
| README | 100 | README.md, README.rst |
| Entry points | 90 | main.rs, index.ts, app.py |
| Config | 80 | Cargo.toml, package.json, tsconfig.json |
| Source | 70* | handler.rs, utils.ts (decreases with nesting depth) |
| Tests | 30 | *_test.go, test_*.py |
| Fixtures | 5 | tests/fixtures/*, __snapshots__/* |
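Preview which files make the cut before generating output; the budget value here is a placeholder:

$ flat src/ --tokens 50k --dry-run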
--tokenizer NAME — accurate token counting
By default, token counts use a fast heuristic (bytes/3 for code, bytes/4 for prose). For precise budget allocation, use a real tokenizer:
| Tokenizer | Speed | Accuracy | When to use |
|---|---|---|---|
| heuristic | Instant | Conservative (~20-30% overestimate) | Quick previews, --stats |
| claude | ~1s | Exact for Claude models | Pasting into Claude |
| gpt-4 | ~1s | Exact for GPT-4 | OpenAI API calls |
| gpt-3.5 | ~1s | Exact for GPT-3.5 | OpenAI API calls |
The heuristic intentionally overestimates so you stay within context windows. Real tokenizers give exact counts when precision matters.
3. --full-match GLOB — selective full content
When compressing, keep specific files in full:
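# illustrative: compress everything except app.py
$ flat src/ --compress --full-match 'app.py'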
app.py gets mode="full" with complete source. Everything else gets mode="compressed" with signatures only. Useful when you want a project overview but need complete implementation detail in the file you're debugging.
Composing Flags
Every combination works. Flags operate in a pipeline — filters narrow the file set, transforms shape the content, output controls the format:
| Filters (narrow files) | Transforms (shape content) | Output |
|---|---|---|
| --include / --exclude | --compress | (stdout) |
| --match | --full-match | -o FILE |
| --max-size | --tokens | --dry-run |
| --gitignore | --tokenizer | --stats |
| | | --format |
All filters compose with all transforms and all output modes. Here's what each transform combination does:
| Command | Result |
|---|---|
| flat | Full content |
| flat --compress | Signatures only |
| flat --tokens 8000 | Full content, capped to budget |
| flat --compress --tokens 8000 | Signatures, capped to budget |
| flat --compress --full-match '*.rs' | Matched files full, rest compressed |
| flat --compress --full-match '*.rs' --tokens 8000 | The full pipeline (see below) |
The full pipeline
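Combine compression, a token budget, and --full-match in one invocation (the same command previewed with --dry-run below):

$ flat src/ --include py --compress --full-match 'app.py' --tokens 30000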
Here's what happens:
- Filter — walk src/, keep only .py files
- Score — rank every file by importance (README=100, entry points=90, ...)
- Allocate — app.py matches --full-match, so reserve its full content first
- Fill — pack remaining files in priority order, compressing each to save space
- Cut — when the 30k token budget is full, exclude the rest
Preview the result without generating output:
$ flat src/ --include py --compress --full-match 'app.py' --tokens 30000 --dry-run
flask/app.py [FULL]
flask/config.py [COMPRESSED]
flask/__init__.py [COMPRESSED]
flask/blueprints.py [COMPRESSED]
flask/cli.py [EXCLUDED]
flask/ctx.py [EXCLUDED]
...
Token budget: 29.8k / 30.0k used
Excluded by budget: 16 files
app.py is in full (you can debug it). The most important modules are compressed (you can see the API surface). Low-priority files are cut. Everything fits in 30k tokens.
What --full-match does NOT do
--full-match does not override the token budget. If app.py is 20k tokens and your budget is 10k, app.py gets excluded — the budget is a hard ceiling. This is intentional: if flat silently overran the budget, you'd overflow context windows.
Filtering
Numeric arguments accept single-letter suffixes: k/K (thousands), M (millions/mebibytes), G (billions/gibibytes).
Filters compose: --include/--exclude operate on extensions, --match operates on filenames. They all apply before compression and budget allocation.
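For example, keep only Python files under 200 KB and cap the output (values are placeholders):

$ flat src/ --include py --max-size 200k --tokens 50k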
Output Modes
| Flag | Output |
|---|---|
| (none) | XML-wrapped file contents to stdout |
| --format markdown | Markdown with fenced code blocks |
| -o FILE | Write to a file instead of stdout |
| --dry-run | File list only, no content |
| --stats | Summary statistics only |
| --dry-run + --tokens | File list annotated [FULL] / [COMPRESSED] / [EXCLUDED] |
--format — output format
XML (default) wraps each file in <file path="..."> tags. Best for programmatic parsing and when tools expect structured XML.
Markdown renders each file with a heading and a fenced code block with syntax highlighting. Best for pasting into chat interfaces (Claude, ChatGPT) or human reading.
Performance
The entire Next.js monorepo — 25,000+ files — processes in under 3 seconds:
$ time flat /path/to/nextjs --compress --stats
Included: 24,327
Compressed: 19,771 files
Skipped: 894
real 0m2.883s
Without --tokens, compression streams file-by-file (constant memory). With --tokens, all candidate files are buffered for scoring — but even that is fast.
Safety
Secrets are always excluded — no flag needed:
| Pattern | Examples |
|---|---|
| Environment | .env, .env.local, .env.production |
| Keys | *.key, *.pem, *.p12, *.pfx |
| SSH | id_rsa, id_dsa, id_ecdsa, id_ed25519 |
| Credentials | credentials.json, serviceAccount.json |
Binary files are always excluded (images, media, archives, executables, compiled artifacts). All .gitignore patterns are respected via ripgrep's parser.
Use --dry-run to preview before sharing code with any external service.
Recipes
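The recipes below are illustrative sketches composed only from the flags documented above; paths, globs, and budget values are placeholders.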
# The basics
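$ flat src/
$ flat src/ --include rs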
# Output formats
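$ flat src/ --format markdown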
# Compression
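$ flat src/ --compress
$ flat src/ --compress --stats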
# Token budgets
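$ flat src/ --tokens 50k
$ flat src/ --tokens 50k --tokenizer claude
$ flat src/ --compress --tokens 8000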
# Targeted
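$ flat src/ --compress --full-match '*.rs'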
# The full pipeline
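$ flat src/ --include py --compress --full-match 'app.py' --tokens 30000 --dry-run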
# Save to file
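$ flat src/ -o context.xml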
Project
src/
├── main.rs CLI entry point
├── walker.rs Directory traversal, two-pass budget allocation
├── compress.rs Tree-sitter compression engine (12 languages)
├── priority.rs File importance scoring
├── tokens.rs Token estimation and real tokenizer support
├── filters.rs Secret and binary detection
├── output.rs Output formatting and statistics
├── parse.rs Number parsing (k/M/G suffixes)
├── config.rs Configuration
└── lib.rs Public API
184+ tests, validated against Flask, FastAPI, Express, and Next.js.
License
MIT — see LICENSE.