tokencount 0.1.0

Parallel CLI that counts GPT-style tokens across your codebase
# tokencount

`tokencount` is a parallel CLI for counting GPT-style tokens across your project. It walks your directory tree on multiple threads, respects `.gitignore`, and reports per-file counts along with totals and percentiles so you can plan prompts, costs, and chunking strategies.

## Features

- Parallel file scanning using Rayon
- Ignore handling powered by the `ignore` crate with `.gitignore` respected by default
- UTF-8 safe token counting via [`tiktoken-rs`](https://crates.io/crates/tiktoken-rs)
- Flexible filtering: include extensions, exclude globs, follow symlinks, size limits
- Multiple output formats: table, JSON, NDJSON streaming
- Summary statistics with totals, averages, and P50/P90/P99 percentiles
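
To make the summary statistics concrete, here is a minimal sketch of how totals, averages, and P50/P90/P99 could be derived from per-file token counts. It assumes the nearest-rank percentile method; the `Summary` struct, its field names, and the `summarize` and `percentile` functions are hypothetical and are not taken from the `tokencount` source, which may compute percentiles differently.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// `p` is a whole-number percentile in 0..=100; returns `None` when empty.
/// (Assumed method: tokencount itself may interpolate or round differently.)
fn percentile(sorted: &[u64], p: usize) -> Option<u64> {
    if sorted.is_empty() {
        return None;
    }
    // rank = ceil(p * n / 100), computed in integers to avoid float rounding.
    let rank = (p * sorted.len() + 99) / 100;
    let idx = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[idx])
}

/// Hypothetical summary record; field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Summary {
    files: usize,
    total: u64,
    average: f64,
    p50: u64,
    p90: u64,
    p99: u64,
}

/// Builds a summary from unsorted per-file token counts.
fn summarize(counts: &[u64]) -> Option<Summary> {
    if counts.is_empty() {
        return None;
    }
    let mut sorted = counts.to_vec();
    sorted.sort_unstable();
    let total: u64 = sorted.iter().sum();
    Some(Summary {
        files: sorted.len(),
        total,
        average: total as f64 / sorted.len() as f64,
        p50: percentile(&sorted, 50)?,
        p90: percentile(&sorted, 90)?,
        p99: percentile(&sorted, 99)?,
    })
}

fn main() {
    // Made-up token counts for ten files.
    let counts = [120, 45, 980, 310, 77, 2048, 15, 600, 410, 52];
    if let Some(summary) = summarize(&counts) {
        println!("{summary:?}");
    }
}
```

With nearest-rank, every reported percentile is an actual file's token count, which keeps the numbers easy to trace back to a specific file when planning chunk sizes.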

## Installation

```bash
cargo install tokencount
```

Or install from source:

```bash
cargo install --path .
```

## Usage

```bash
# default: scan current directory, include only *.elm
tokencount

# scan a project and include Elm + TypeScript files
tokencount ./frontend --include-ext elm --include-ext ts

# return the top 10 files by token count
tokencount --top 10

# emit JSON summary (great for CI)
tokencount --format json > tokens.json

# stream NDJSON for downstream processing
tokencount --format ndjson

# sort by token count descending
tokencount --sort tokens
```
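
The ordering behind `--sort` and `--top` can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the `FileCount` struct, the `SortKey` enum, and both functions are hypothetical, the tie-break on path is a choice made for this sketch, and how `--top` interacts with `--sort` in the real CLI is not documented here.

```rust
/// Hypothetical per-file record; not tokencount's real type.
#[derive(Debug, Clone, PartialEq)]
struct FileCount {
    path: String,
    tokens: u64,
}

/// Mirrors the two documented `--sort` values.
#[derive(Clone, Copy)]
enum SortKey {
    Path,
    Tokens,
}

/// Sorts in place: paths ascending, or token counts descending.
/// Ties on token count fall back to path order (an assumption of this sketch).
fn sort_files(files: &mut [FileCount], key: SortKey) {
    match key {
        SortKey::Path => files.sort_by(|a, b| a.path.cmp(&b.path)),
        SortKey::Tokens => files.sort_by(|a, b| {
            b.tokens.cmp(&a.tokens).then_with(|| a.path.cmp(&b.path))
        }),
    }
}

/// Keeps the `n` files with the highest token counts.
fn top_n(mut files: Vec<FileCount>, n: usize) -> Vec<FileCount> {
    sort_files(&mut files, SortKey::Tokens);
    files.truncate(n);
    files
}

fn main() {
    // Made-up paths and counts.
    let files = vec![
        FileCount { path: "src/Main.elm".into(), tokens: 900 },
        FileCount { path: "src/Api.elm".into(), tokens: 2400 },
        FileCount { path: "src/View.elm".into(), tokens: 900 },
    ];
    for file in top_n(files, 2) {
        println!("{:>8}  {}", file.tokens, file.path);
    }
}
```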

### CLI Options

Run `tokencount --help` for the full list of flags, including:

- `--include-ext` / `--exclude`
- `--max-bytes`
- `--encoding cl100k_base|o200k_base`
- `--format table|json|ndjson`
- `--top N`
- `--sort path|tokens`
- `--threads N`
- `--follow-symlinks`
- `--no-respect-gitignore`
- `-v/--verbose`, `-q/--quiet`
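
As a rough mental model for `--include-ext` and `--max-bytes`, the sketch below shows one way such a filter could decide whether a file is counted. The `Filter` struct and its `accepts` method are hypothetical. The case-sensitive extension match and the inclusive size limit are assumptions of this sketch; check `tokencount --help` for the real semantics.

```rust
use std::path::Path;

/// Hypothetical filter built from `--include-ext` and `--max-bytes` values.
struct Filter {
    include_exts: Vec<String>,
    max_bytes: Option<u64>,
}

impl Filter {
    /// Returns true when the file has an allowed extension and fits the limit.
    fn accepts(&self, path: &Path, size_bytes: u64) -> bool {
        // Files without a (UTF-8) extension are rejected in this sketch.
        let ext_ok = path
            .extension()
            .and_then(|ext| ext.to_str())
            .map(|ext| self.include_exts.iter().any(|want| want.as_str() == ext))
            .unwrap_or(false);
        // No limit means every size passes; the limit itself is inclusive here.
        let size_ok = self.max_bytes.map_or(true, |limit| size_bytes <= limit);
        ext_ok && size_ok
    }
}

fn main() {
    let filter = Filter {
        include_exts: vec!["elm".to_string(), "ts".to_string()],
        max_bytes: Some(1_000_000),
    };
    // Made-up paths and sizes.
    let candidates = [
        ("frontend/src/Main.elm", 4_096_u64),
        ("frontend/src/index.ts", 2_000_000),
        ("frontend/README.md", 512),
    ];
    for (path, size) in candidates {
        println!("{path}: counted = {}", filter.accepts(Path::new(path), size));
    }
}
```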

## Development

```bash
cargo fmt
cargo clippy --all-targets --all-features -- -D warnings
cargo test
```

The project is dual-licensed under MIT or Apache-2.0.

## License

Licensed under either of

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or <http://www.apache.org/licenses/LICENSE-2.0>)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or <http://opensource.org/licenses/MIT>)

at your option.

## Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in `tokencount` by you shall be dual licensed as above, without any additional terms or conditions.