<div align="center">
```
████████╗███████╗██████╗ ███████╗██╗███████╗██╗ ██╗
╚══██╔══╝██╔════╝██╔══██╗██╔════╝██║██╔════╝╚██╗ ██╔╝
██║ █████╗ ██████╔╝███████╗██║█████╗ ╚████╔╝
██║ ██╔══╝ ██╔══██╗╚════██║██║██╔══╝ ╚██╔╝
██║ ███████╗██║ ██║███████║██║██║ ██║
╚═╝ ╚══════╝╚═╝ ╚═╝╚══════╝╚═╝╚═╝ ╚═╝
```
**Token compression for LLM context windows**
[](https://crates.io/crates/tersify)
[](LICENSE)
[](https://github.com/rustkit-ai/tersify/actions)
</div>
---
Every file you send to an LLM is **30–50% noise**: comments, blank lines, `null` JSON fields, duplicate log lines. tersify strips all of it before the token counter starts.
```
$ tersify src/ --verbose
[tersify] 5 439 → 3 559 tokens (35% saved, 1 880 tokens freed)
```
**35% fewer tokens** in standard mode. **54% fewer** with `--ast` (signatures only).
No network calls. No configuration. Deterministic output.
---
## Install
**One-liner — installs the binary and hooks into all your AI editors automatically:**
```bash
**cargo**
```bash
cargo install tersify
tersify install --all # hook into Claude Code, Cursor, Windsurf (auto-detects)
```
**Pre-built binaries** — [github.com/rustkit-ai/tersify/releases](https://github.com/rustkit-ai/tersify/releases)
---
## How it works
### Standard mode — remove the noise
tersify applies language-aware rules to strip everything that carries no information for an LLM:
```rust
// Before — 384 tokens
// Authentication middleware for the REST API.
// Validates JWT tokens issued by our identity provider.
use anyhow::{Context, Result};
/// Claims embedded in the JWT token.
#[derive(Debug, Serialize, Deserialize)]
pub struct Claims {
pub sub: String, // subject — user id
pub exp: usize, // expiration timestamp
pub roles: Vec<String>, // authorisation roles
}
// Validates a bearer token and returns the embedded claims.
// Returns an error if the token is expired, malformed, or signed with the wrong key.
pub fn validate_token(token: &str, secret: &[u8]) -> Result<Claims> {
// Decode the header first to get the algorithm
let header = decode_header(token)
.context("failed to decode JWT header")?;
// Build a validation config matching the issuer requirements
let mut validation = Validation::new(header.alg);
validation.validate_exp = true; // always enforce expiry
let key = DecodingKey::from_secret(secret);
let data = decode::<Claims>(token, &key, &validation)
.context("JWT validation failed")?;
// Extra check: ensure the subject is non-empty
if data.claims.sub.is_empty() {
anyhow::bail!("token subject is empty");
}
Ok(data.claims)
}
```
```rust
// After — 228 tokens (41% saved)
use anyhow::{Context, Result};
/// Claims embedded in the JWT token.
#[derive(Debug, Serialize, Deserialize)]
pub struct Claims {
pub sub: String,
pub exp: usize,
pub roles: Vec<String>,
}
pub fn validate_token(token: &str, secret: &[u8]) -> Result<Claims> {
let header = decode_header(token)
.context("failed to decode JWT header")?;
let mut validation = Validation::new(header.alg);
validation.validate_exp = true;
let key = DecodingKey::from_secret(secret);
let data = decode::<Claims>(token, &key, &validation)
.context("JWT validation failed")?;
if data.claims.sub.is_empty() {
anyhow::bail!("token subject is empty");
}
Ok(data.claims)
}
```
### AST mode — signatures only
Pass `--ast` to go further: powered by [tree-sitter](https://tree-sitter.github.io/), tersify parses the full syntax tree and replaces every function body with a stub. The result is a precise API surface.
```rust
// tersify src/auth.rs --ast → 209 tokens (46% saved)
use anyhow::{Context, Result};
#[derive(Debug, Serialize, Deserialize)]
pub struct Claims {
pub sub: String,
pub exp: usize,
pub roles: Vec<String>,
}
pub fn validate_token(token: &str, secret: &[u8]) -> Result<Claims> { /* ... */ }
pub fn bearer_header(token: &str) -> String { /* ... */ }
```
Use AST mode when you want Claude to understand a project's structure without reading every implementation.
---
## Quick start
```bash
tersify src/main.rs # single file → stdout
tersify src/ # entire directory
tersify src/ --ast # signatures only
tersify src/ --strip-docs # also remove doc comments (///, /** */)
tersify token-cost src/ # estimate API cost
```
---
## Integrate with your AI editor
### All editors at once (recommended)
```bash
tersify install --all
```
Auto-detects Claude Code, Cursor, and Windsurf on your machine and hooks into all of them in one go.
```bash
tersify uninstall --all # remove all hooks
```
### Individual editors
```bash
tersify install # Claude Code only
tersify install --cursor # Cursor only
tersify install --windsurf # Windsurf only
```
### Claude Code — how it works
tersify registers a `PreToolUse` hook in `~/.claude/hooks.json`. Every file Claude reads is automatically compressed before it enters the context window — no workflow changes required.
---
## Benchmark
Real numbers — run `tersify bench` to reproduce locally.
### Standard mode
| Rust | 384 | 228 | **41%** |
| Python | 524 | 289 | **45%** |
| TypeScript | 528 | 369 | **30%** |
| Ruby | 447 | 285 | **36%** |
| Java | 608 | 435 | **28%** |
| C / C++ | 579 | 342 | **41%** |
| Swift | 699 | 597 | **15%** |
| Kotlin | 604 | 336 | **44%** |
| JSON | 181 | 103 | **43%** |
| Git diff | 275 | 213 | **23%** |
| Logs | 340 | 173 | **49%** |
| Plain text | 270 | 189 | **30%** |
| **Total** | **5 439** | **3 559** | **35%** |
### AST mode (`--ast`)
| Python | 524 | 162 | **69%** |
| Java | 608 | 265 | **56%** |
| Ruby | 447 | 212 | **53%** |
| TypeScript | 528 | 265 | **50%** |
| C / C++ | 579 | 304 | **47%** |
| Rust | 384 | 209 | **46%** |
| **Total** | **3 070** | **1 417** | **54%** |
---
## What gets removed
| **Code** | Inline and block comments, consecutive blank lines |
| **Code + `--strip-docs`** | Also removes `///`, `//!`, `/** */`, Python docstrings |
| **JSON** | `null` fields, empty `[]` and `{}` |
| **Logs** | Repeated identical lines → `[×47]` on the first |
| **Diffs** | Context lines — keeps only `+` / `-` and headers |
| **AST** | Function bodies → `{ /* ... */ }` (exact parse tree, tree-sitter) |
Doc comments (`///`, `//!`, `/** */`) are preserved by default — pass `--strip-docs` to remove them too.
---
## Supported languages
| Rust | ✓ | ✓ | `.rs` |
| Python | ✓ | ✓ | `.py` |
| TypeScript | ✓ | ✓ | `.ts` |
| TSX | ✓ | ✓ | `.tsx` |
| JavaScript | ✓ | ✓ | `.js` `.jsx` `.mjs` |
| Go | ✓ | ✓ | `.go` |
| Java | ✓ | ✓ | `.java` |
| Ruby | ✓ | ✓ | `.rb` |
| C / C++ | ✓ | ✓ | `.c` `.cpp` `.h` `.hpp` |
| C# | ✓ | ✓ | `.cs` |
| PHP | ✓ | ✓ | `.php` |
| Swift | ✓ | — | `.swift` |
| Kotlin | ✓ | — | `.kt` |
| JSON | ✓ | — | `.json` |
| Logs | ✓ | — | `.log` |
| Git diffs | ✓ | — | `.diff` `.patch` |
---
## Cost estimator
```bash
tersify token-cost src/
tersify token-cost src/ --model claude
```
```
5 439 → 3 559 tokens (35% saved, 1 880 tokens freed)
Model Provider $/M tokens Raw cost Compressed Saved/call
────────────────────────────────────────────────────────────────────────────────────────────
claude-opus-4.6 Anthropic $15.00 $0.0816 $0.0534 -$0.0282
claude-sonnet-4.6 Anthropic $3.00 $0.0163 $0.0107 -$0.0056
claude-haiku-4.5 Anthropic $0.80 $0.0044 $0.0028 -$0.0014
gpt-4o OpenAI $5.00 $0.0272 $0.0178 -$0.0094
gemini-2.5-pro Google $1.25 $0.0068 $0.0044 -$0.0023
deepseek-v3 DeepSeek $0.27 $0.0015 $0.0010 -$0.0005
────────────────────────────────────────────────────────────────────────────────────────────
At 100 calls/day with claude-opus-4.6: saves $2.82/day → $84.60/month
```
---
## MCP server
tersify ships a built-in [MCP](https://modelcontextprotocol.io/) server for agent pipelines.
```bash
# Add to Claude Code
claude mcp add tersify -- tersify mcp
```
Three tools exposed: `compress`, `count_tokens`, `estimate_cost`.
---
## Use as a library
```toml
[dependencies]
tersify = "0.3"
```
```rust
use tersify::{compress, detect};
use std::path::Path;
// Auto-detect and compress
let src = std::fs::read_to_string("src/main.rs")?;
let ct = detect::detect_for_path(Path::new("src/main.rs"), &src);
let out = compress::compress(&src, &ct, None)?;
// AST mode
use tersify::compress::{compress_with, CompressOptions};
let out = compress_with(&src, &ct, &CompressOptions { ast: true, ..Default::default() })?;
```
---
## CLI reference
```
tersify [FILES|DIRS] Compress to stdout
--ast Signatures only (tree-sitter)
--strip-docs Also remove doc comments (///, //!, /** */)
--type <lang> Force content type
--verbose Show token savings
tersify bench Benchmark all content types
tersify token-cost [FILES] Estimate LLM API cost
--model <filter> Filter models by name
tersify mcp Start MCP server
tersify install Hook into Claude Code
tersify install --cursor Hook into Cursor
tersify install --windsurf Hook into Windsurf
tersify uninstall Remove the hook
tersify completions <shell> Generate shell completions
```
`--type` values: `rust` `python` `javascript` `typescript` `tsx` `go` `ruby` `java` `c` `cpp` `csharp` `php` `swift` `kotlin` `json` `logs` `diff` `text`
---
## Development
```bash
git clone https://github.com/rustkit-ai/tersify
cd tersify
cargo test # 90 tests
cargo run -- bench # live benchmark
```
---
<div align="center">
MIT — made by [rustkit-ai](https://github.com/rustkit-ai)
[Contributing](CONTRIBUTING.md)
</div>