token-count 0.1.0

A fast, accurate CLI tool for counting tokens in LLM model inputs using exact tokenization.

Overview

token-count is a POSIX-style command-line tool that counts tokens for various LLM models using exact tokenization. Pipe any text in, get accurate token counts out—no browser, no API calls, just a fast offline binary.

# Quick token count
echo "Hello world" | token-count --model gpt-4
2

# From file
token-count --model gpt-4 < document.txt
1842

# With context info
cat prompt.txt | token-count --model gpt-4 -v
Model: gpt-4 (cl100k_base)
Tokens: 142
Context window: 128000 tokens (0.1109% used)

Features

Accurate - Exact tokenization using OpenAI's tiktoken library
Fast - ~2.7µs for small inputs (3,700x faster than 10ms target)
Efficient - 57MB memory for 12MB files (8.8x under 500MB limit)
Compact - 9.2MB binary with all tokenizers embedded
Offline - Zero runtime dependencies, all tokenizers built-in
Simple - POSIX-style interface, works like wc or grep

Installation

Quick Install (Recommended)

Linux / macOS:

curl -sSfL https://raw.githubusercontent.com/shaunburdick/token-count/main/install.sh | bash

Homebrew (macOS / Linux):

brew install shaunburdick/tap/token-count

Cargo (All Platforms):

cargo install token-count

Manual Download:
Download pre-built binaries from GitHub Releases.

For detailed installation instructions, troubleshooting, and platform-specific guidance, see INSTALL.md.

System Requirements

  • Platform: Linux x86_64, macOS (Intel/Apple Silicon), Windows x86_64
  • Runtime: No dependencies (static binary)
  • Build from source: Rust 1.85.0 or later

Usage

Basic Usage

# Default model (gpt-3.5-turbo)
echo "Hello world" | token-count
2

# Specific model
echo "Hello world" | token-count --model gpt-4
2

# From file
token-count --model gpt-4 < input.txt
1842

# Piped from another command
cat README.md | token-count --model gpt-4o
3521
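The per-file pattern above extends naturally to batches. A minimal sketch, assuming a POSIX shell; the word-count stub is only a stand-in so the loop runs where token-count is not installed (a word count is NOT a token count):

```shell
# Count tokens for every Markdown file in the current directory.
# If token-count is not on PATH, install a stub so the sketch is runnable.
if ! command -v token-count >/dev/null 2>&1; then
  stub=$(mktemp -d)
  printf '#!/bin/sh\nexec wc -w\n' > "$stub/token-count"
  chmod +x "$stub/token-count" && PATH="$stub:$PATH"
fi

for f in *.md; do
  [ -e "$f" ] || continue          # no matches: skip the literal '*.md'
  printf '%s\t' "$f"
  token-count --model gpt-4 < "$f"
done
```

The output is one tab-separated `file<TAB>count` line per file, convenient for sorting or totaling with awk.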

Model Selection

# Use canonical name
token-count --model gpt-4 < input.txt

# Use alias (case-insensitive)
token-count --model gpt4 < input.txt
token-count --model GPT-4 < input.txt

# With provider prefix
token-count --model openai/gpt-4 < input.txt

Verbosity Levels

# Simple output (default) - just the number
echo "test" | token-count
1

# Verbose (-v) - model info and context usage
echo "test" | token-count -v
Model: gpt-3.5-turbo (cl100k_base)
Tokens: 1
Context window: 16385 tokens (0.0061% used)

# Debug (-vvv) - for troubleshooting
echo "test" | token-count -vvv
Model: gpt-3.5-turbo (cl100k_base)
Tokens: 1
Context window: 16385 tokens

[Debug mode: Token IDs and decoding require tokenizer access]
[Full implementation in Phase 6]

Model Information

# List all supported models
token-count --list-models

# Output:
# Supported models:
#
#   gpt-3.5-turbo
#     Encoding: cl100k_base
#     Context window: 16385 tokens
#     Aliases: gpt-3.5, gpt35, gpt-35-turbo, openai/gpt-3.5-turbo
#
#   gpt-4
#     Encoding: cl100k_base
#     Context window: 128000 tokens
#     Aliases: gpt4, openai/gpt-4
# ...

Help and Version

# Show help
token-count --help

# Show version
token-count --version

Supported Models

OpenAI Models (Exact Tokenization)

Model           Encoding      Context Window   Aliases
gpt-3.5-turbo   cl100k_base   16,385           gpt-3.5, gpt35, gpt-35-turbo
gpt-4           cl100k_base   128,000          gpt4
gpt-4-turbo     cl100k_base   128,000          gpt4-turbo, gpt-4turbo
gpt-4o          o200k_base    128,000          gpt4o

All models support:

  • Case-insensitive names (e.g., GPT-4, gpt-4, Gpt-4)
  • Provider prefix (e.g., openai/gpt-4)

Error Handling

token-count provides helpful error messages with suggestions:

# Unknown model with fuzzy suggestions
$ echo "test" | token-count --model gpt5
Error: Unknown model: 'gpt5'. Did you mean: gpt-4, gpt-4o?

# Typo correction
$ echo "test" | token-count --model gpt4-tubro
Error: Unknown model: 'gpt4-tubro'. Did you mean: gpt-4-turbo?

# Invalid UTF-8
$ token-count < invalid.bin
Error: Input contains invalid UTF-8 at byte 0

Exit Codes

  • 0 - Success
  • 1 - I/O error or invalid UTF-8
  • 2 - Unknown model name
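A sketch of a wrapper that branches on these exit codes; the `count_tokens` helper name is hypothetical, and the word-count stub only keeps the sketch runnable where token-count is absent:

```shell
# Branch on token-count's documented exit codes (0 / 1 / 2).
# If token-count is not on PATH, install a stub so the sketch is runnable.
if ! command -v token-count >/dev/null 2>&1; then
  stub=$(mktemp -d)
  printf '#!/bin/sh\nexec wc -w\n' > "$stub/token-count"
  chmod +x "$stub/token-count" && PATH="$stub:$PATH"
fi

count_tokens() {
  # Usage: count_tokens MODEL FILE
  n=$(token-count --model "$1" < "$2")
  case $? in
    0) printf '%s\n' "$n" ;;
    1) echo "I/O error or invalid UTF-8: $2" >&2; return 1 ;;
    2) echo "unknown model: $1" >&2; return 2 ;;
    *) echo "unexpected failure" >&2; return 1 ;;
  esac
}
```

`count_tokens gpt-4 prompt.txt` prints the count on success and propagates the documented exit codes on failure.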

Performance

Benchmarks

Measured on Ubuntu 22.04 with Rust 1.85.0:

Input Size   Time     Target    Result
100 bytes    2.7µs    <10ms     3,700x faster ⚡
1 KB         54µs     <100ms    1,850x faster ⚡
10 KB        534µs    N/A       Excellent

Memory Usage

  • 12MB file: 57 MB resident memory (8.8x under 500MB limit)
  • Processing time: 0.76 seconds for 12MB
  • No memory leaks: Validated with valgrind

Binary Size

  • Release binary: 9.2 MB (5.4x under 50MB target)
  • Includes: All 4 OpenAI tokenizers embedded
  • Optimizations: Stripped, LTO enabled

Development

Building from Source

# Clone repository
git clone https://github.com/shaunburdick/token-count
cd token-count

# Run tests
cargo test

# Run benchmarks
cargo bench

# Build release binary
cargo build --release

# Check code quality
cargo clippy -- -D warnings
cargo fmt --check

# Security audit
cargo audit

Running Tests

# All tests (100 tests)
cargo test

# Specific test suite
cargo test --test model_aliases
cargo test --test verbosity
cargo test --test performance

# With output
cargo test -- --nocapture

Project Structure

token-count/
├── src/
│   ├── lib.rs              # Public library API
│   ├── main.rs             # Binary entry point
│   ├── cli/                # CLI argument parsing
│   │   ├── args.rs         # Clap definitions
│   │   ├── input.rs        # Stdin reading
│   │   └── mod.rs
│   ├── tokenizers/         # Tokenization engine
│   │   ├── openai.rs       # OpenAI tokenizer
│   │   ├── registry.rs     # Model registry
│   │   └── mod.rs
│   ├── output/             # Output formatters
│   │   ├── simple.rs       # Simple formatter
│   │   ├── verbose.rs      # Verbose formatter
│   │   ├── debug.rs        # Debug formatter
│   │   └── mod.rs
│   └── error.rs            # Error types
├── tests/                  # Integration tests
│   ├── fixtures/           # Test data
│   ├── model_aliases.rs
│   ├── verbosity.rs
│   ├── performance.rs
│   ├── error_handling.rs
│   ├── end_to_end.rs
│   └── ...
├── benches/                # Performance benchmarks
│   └── tokenization.rs
└── .github/
    └── workflows/
        └── ci.yml          # CI configuration

Security

Resource Limits

  • Maximum input size: 100MB per invocation
  • Memory usage: Typically <100MB, peaks at ~2x input size
  • CPU usage: Single-threaded, 100% of one core during processing

Known Limitations

Stack Overflow with Highly Repetitive Inputs: The underlying tiktoken-rs library can experience stack overflow when processing highly repetitive single-character inputs (e.g., 1MB+ of the same character). This is due to regex backtracking in the tokenization engine. Real-world text with varied content works fine at large sizes.

  • Workaround: Break extremely large repetitive inputs into smaller chunks
  • Impact: Minimal - real documents rarely exhibit this pathological pattern
  • Status: Tracked upstream in tiktoken-rs
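The chunking workaround can be sketched as below. One caveat: a token can straddle a chunk boundary, so the summed count may differ slightly from a single-pass count. The `chunked_count` name and 1 MB chunk size are illustrative, and the word-count stub only keeps the sketch runnable where token-count is absent:

```shell
# Split a large repetitive input into 1 MB chunks, count each, and sum.
# If token-count is not on PATH, install a stub so the sketch is runnable.
if ! command -v token-count >/dev/null 2>&1; then
  stub=$(mktemp -d)
  printf '#!/bin/sh\nexec wc -w\n' > "$stub/token-count"
  chmod +x "$stub/token-count" && PATH="$stub:$PATH"
fi

chunked_count() {
  # Usage: chunked_count MODEL FILE
  dir=$(mktemp -d) || return 1
  split -b 1M "$2" "$dir/chunk." || return 1
  total=0
  for f in "$dir"/chunk.*; do
    n=$(token-count --model "$1" < "$f") || return 1
    total=$((total + n))
  done
  rm -rf "$dir"
  printf '%s\n' "$total"
}
```

`chunked_count gpt-4 big-repetitive.txt` prints an approximate total while keeping each tokenizer invocation below the pathological size.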

Best Practices

For CI/CD Pipelines:

# Cap per-process resources to avoid exhaustion
ulimit -n 1024                    # Limit open file descriptors
ulimit -v $((500 * 1024))        # Limit virtual memory to 500MB (ulimit -v takes KB)
echo "text" | token-count --model gpt-4

For Untrusted Input:

# Use timeout to prevent hangs
timeout 30s token-count --model gpt-4 < input.txt
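GNU timeout exits with status 124 when the limit fires, so a script can distinguish a hang from a tokenization failure. A sketch; the word-count stub only keeps it runnable where token-count is absent:

```shell
# Distinguish a timeout (exit 124) from token-count's own failures.
# If token-count is not on PATH, install a stub so the sketch is runnable.
if ! command -v token-count >/dev/null 2>&1; then
  stub=$(mktemp -d)
  printf '#!/bin/sh\nexec wc -w\n' > "$stub/token-count"
  chmod +x "$stub/token-count" && PATH="$stub:$PATH"
fi

tmp=$(mktemp)
printf 'some untrusted input\n' > "$tmp"
timeout 30s token-count --model gpt-4 < "$tmp"
status=$?
if [ "$status" -eq 124 ]; then
  echo "token-count timed out; input may be pathological" >&2
elif [ "$status" -ne 0 ]; then
  echo "token-count failed with exit status $status" >&2
fi
```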

For Large Files:

# Monitor memory usage
/usr/bin/time -v token-count --model gpt-4 < large-file.txt

Security Audit

  • Last audit: 2026-03-13
  • Findings: 0 critical, 0 high, 0 medium vulnerabilities
  • Dependencies: 5 direct, all audited with cargo audit
  • Binary: Stripped, no debug symbols, 9.2MB

Run security checks:

cargo audit                      # Check for known vulnerabilities
cargo clippy -- -D warnings     # Strict linting

Reporting Security Issues

If you discover a security vulnerability, please email hello@burdick.dev (or open a private security advisory on GitHub). Do not open public issues for security concerns.

Architecture

Design Principles

From our Constitution:

  1. POSIX Simplicity - Behaves like standard Unix utilities
  2. Accuracy Over Speed - Exact tokenization for supported models
  3. Zero Runtime Dependencies - Single offline binary
  4. Fail Fast with Clear Errors - No silent failures
  5. Semantic Versioning - Predictable upgrade paths

Technical Stack

  • Language: Rust 1.85.0+ (stable)
  • CLI Parsing: clap 4.6.0+ (derive API)
  • Tokenization: tiktoken-rs 0.9.1+ (OpenAI models)
  • Error Handling: anyhow 1.0.102+, thiserror 1.0+
  • Fuzzy Matching: strsim 0.11+ (Levenshtein distance)
  • Testing: 100 tests with criterion benchmarks

Key Features

  • Library-first design: Core logic in lib.rs, thin binary wrapper
  • Trait-based abstractions: Extensible for future tokenizers
  • Strategy pattern: Multiple output formatters
  • Registry pattern: Model configuration with lazy initialization
  • Streaming support: 64KB chunks for large inputs

Roadmap

v0.1.0 (Current Release) ✅

  • OpenAI model support (4 models)
  • CLI with model selection and verbosity
  • Fuzzy model suggestions
  • UTF-8 validation with error reporting
  • Comprehensive test suite (100 tests)
  • Performance benchmarks
  • Cross-platform support (Linux, macOS, Windows)
  • Multiple installation methods (install.sh, Homebrew, cargo, manual)
  • GitHub release binaries with checksums
  • Automated release pipeline

v0.2.0 (Future - More Models)

  • Anthropic Claude support
  • Google Gemini support
  • Meta Llama support
  • Mistral support

v0.3.0 (Future - Stable API)

  • Stable library API for embedding
  • Token ID output (debug mode)
  • Batch processing mode
  • Configuration file support

Contributing

Contributions are welcome! This project follows specification-driven development.

Development Setup

See CONTRIBUTING.md for detailed instructions.

Quick start:

git clone https://github.com/shaunburdick/token-count
cd token-count
cargo test
cargo clippy

Code Quality Standards

  • No disabled lint rules - Fix code to comply, don't silence warnings
  • 100% type safety - No unchecked casts or lint suppressions
  • All public APIs documented - With examples
  • Test coverage - All user stories covered
  • Zero clippy warnings - Strict linting enforced

License

MIT License - see LICENSE for details.

Acknowledgments

Built with:

  • tiktoken-rs - Rust tiktoken implementation
  • clap - Command line argument parser
  • spec-kit - Specification-driven development

Special thanks to:

  • OpenAI for open-sourcing tiktoken
  • The Rust community for excellent tooling

Status: ✅ MVP Complete (Linux) | Version: 0.1.0
Author: Shaun Burdick