char-token-est 0.1.0

Tokenless byte/char-based token-count estimator for LLM prompts. Per-model-family calibration for Claude, GPT, Gemini, Llama, and Cohere. Zero deps.

Homepage
MukundaKatta/char-token-est
Versions
- 0.1.0 (2026-05-16)

char-token-est

Tokenless token-count estimator for LLM prompts. On typical prompts, estimates land within roughly 10% of the real token count. It is fast and has zero dependencies. Use it when a real BPE tokenizer is too heavy: routing, budget gates, log lines, progress bars.
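As an illustration of the budget-gate use case, here is a minimal sketch that pads a rough estimate by 10% before comparing it with a limit. It does not call the crate: `rough_tokens` is a stand-in that assumes 4.0 characters per token and rounds up, and `fits_budget` is a hypothetical helper name.

```rust
/// Stand-in for a character-based estimator (assumption: 4.0 chars per token,
/// rounded up). This is not the crate's implementation.
fn rough_tokens(text: &str) -> usize {
    (text.chars().count() as f64 / 4.0).ceil() as usize
}

/// Hypothetical budget gate: pads the estimate by 10% to leave headroom for
/// estimator error, then checks the padded value against the limit.
fn fits_budget(text: &str, max_tokens: usize) -> bool {
    let padded = (rough_tokens(text) as f64 * 1.10).ceil() as usize;
    padded <= max_tokens
}

fn main() {
    let prompt = "The quick brown fox jumps over the lazy dog.";
    if fits_budget(prompt, 4096) {
        println!("prompt fits the budget");
    } else {
        println!("prompt is likely too long");
    }
}
```

Padding on the conservative side means a borderline prompt is rejected rather than accepted, which is usually the safer failure mode for a gate.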

Usage

use char_token_est::{estimate, Family};

fn main() {
    // Estimate the token count using the GPT-family calibration.
    let n = estimate("The quick brown fox jumps over the lazy dog.", Family::Gpt);
    println!("~{n} tokens");
}

Or supply your own ratio:

use char_token_est::estimate_with_ratio;
// The second argument is the chars-per-token ratio (see Calibration).
let n = estimate_with_ratio("...", 4.0);
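One way to arrive at a custom ratio is to measure it on your own data. The sketch below assumes you already have a few samples whose true token counts came from a real tokenizer; `calibrate_ratio` is a hypothetical helper, not part of the crate, and the token counts in `main` are placeholders rather than measurements.

```rust
/// Hypothetical helper: derives a chars-per-token ratio from samples whose
/// true token counts were measured elsewhere with a real tokenizer.
fn calibrate_ratio(samples: &[(&str, usize)]) -> Option<f64> {
    let total_chars: usize = samples.iter().map(|(text, _)| text.chars().count()).sum();
    let total_tokens: usize = samples.iter().map(|(_, tokens)| *tokens).sum();
    if total_tokens == 0 {
        return None; // no usable measurements
    }
    Some(total_chars as f64 / total_tokens as f64)
}

fn main() {
    // Placeholder token counts for illustration only, not real measurements.
    let samples = [("hello world", 2), ("fn main() {}", 4)];
    if let Some(ratio) = calibrate_ratio(&samples) {
        println!("chars/token = {ratio:.2}");
    }
}
```

The resulting value could then be passed as the second argument to `estimate_with_ratio`.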

Calibration

Family	chars/token
`Gpt`	4.0
`Claude`	3.5
`Gemini`	4.0
`Llama`	3.7
`Cohere`	3.8

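The ratios are best-effort calibrations against English prose, source code, and JSON. Input that is mostly non-Latin script deviates further from them. Use a real tokenizer whenever exact counts matter, for example for billing.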
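To show how the table shifts an estimate between families, the sketch below mirrors it in a local enum and applies each ratio to the same sentence. `FamilySketch` and `estimate_sketch` are stand-ins, not the crate's `Family` or `estimate`, and the formula (character count divided by the ratio, rounded up) is an assumption about how such a ratio is applied.

```rust
/// Local stand-in mirroring the calibration table; not the crate's `Family` type.
#[derive(Clone, Copy, Debug)]
enum FamilySketch {
    Gpt,
    Claude,
    Gemini,
    Llama,
    Cohere,
}

impl FamilySketch {
    /// Chars-per-token ratio copied from the table above.
    fn chars_per_token(self) -> f64 {
        match self {
            FamilySketch::Gpt => 4.0,
            FamilySketch::Claude => 3.5,
            FamilySketch::Gemini => 4.0,
            FamilySketch::Llama => 3.7,
            FamilySketch::Cohere => 3.8,
        }
    }
}

/// Assumed formula: character count divided by the ratio, rounded up.
fn estimate_sketch(text: &str, family: FamilySketch) -> usize {
    (text.chars().count() as f64 / family.chars_per_token()).ceil() as usize
}

fn main() {
    let text = "The quick brown fox jumps over the lazy dog."; // 44 characters
    let families = [
        FamilySketch::Gpt,
        FamilySketch::Claude,
        FamilySketch::Gemini,
        FamilySketch::Llama,
        FamilySketch::Cohere,
    ];
    for family in families {
        println!("{family:?}: ~{} tokens", estimate_sketch(text, family));
    }
}
```

A lower ratio yields a higher estimate for the same text, so picking the wrong family biases every count in the same direction.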

License

MIT or Apache-2.0.