§tiktoken-stream
Streaming token counter for partial LLM responses.
A streaming LLM response arrives as a sequence of small text chunks
(stream=true SSE deltas). For UX (progress bars, soft caps) you
often want a running token count without holding the full text in
memory and without re-tokenizing the entire prefix on every chunk.
This crate is a tiny counter:
- Construct with a tokenizer function (fn(&str) -> u64).
- Call TokenStream::push with each delta. The stream forwards the chunk through your tokenizer and bumps the running total.
- Read TokenStream::count at any time.
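The three steps above can be sketched as a small self-contained type. This is only an illustration of the pattern, not the crate's source: SketchStream and its field names are hypothetical, and boxing the estimator is an assumption made here for simplicity.

```rust
/// Hypothetical stand-in for the crate's counter; not its actual source.
pub struct SketchStream {
    // Any per-chunk estimator with the shape fn(&str) -> u64.
    estimator: Box<dyn Fn(&str) -> u64>,
    // Running total across all chunks pushed so far.
    total: u64,
}

impl SketchStream {
    /// Build a counter around a per-chunk estimator.
    pub fn with_estimator(estimator: impl Fn(&str) -> u64 + 'static) -> Self {
        Self {
            estimator: Box::new(estimator),
            total: 0,
        }
    }

    /// Estimate this chunk on its own and add the result to the total.
    pub fn push(&mut self, chunk: &str) {
        let n = (self.estimator)(chunk);
        self.total = self.total.saturating_add(n);
    }

    /// Read the running total; the chunk text itself is never stored.
    pub fn count(&self) -> u64 {
        self.total
    }
}
```

Because each chunk is estimated in isolation, a counter built this way matches a whole-text count only when the estimator is additive across chunk boundaries. A real BPE tokenizer can merge characters that straddle two deltas, so the running total is best treated as an estimate.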
The default estimator (4 chars per token, ceiling) is what
char-token-est’s Family::Gpt uses; swap in tiktoken via the
constructor when accuracy matters.
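As a rough sketch of what "4 chars per token, ceiling" could mean, the function below computes ceil(chars / 4) per chunk. It assumes "chars" means Unicode scalar values; the crate may count bytes instead. Both function names are hypothetical.

```rust
/// Assumed reading of "4 chars per token, ceiling": ceil(chars / 4),
/// counting Unicode scalar values. The crate may count bytes instead.
pub fn four_chars_per_token(chunk: &str) -> u64 {
    let chars = chunk.chars().count() as u64;
    // Integer ceiling division by 4; an empty chunk yields 0.
    (chars + 3) / 4
}

/// Sum of per-chunk estimates, as a streaming counter would accumulate them.
pub fn streamed_estimate(chunks: &[&str]) -> u64 {
    chunks.iter().map(|c| four_chars_per_token(c)).sum()
}
```

Under this reading, "Hello, " (7 chars) and "world!" (6 chars) each round up to 2, for a streamed total of 4, which happens to equal the estimate for the joined 13-character string. Rounding up per chunk can only overcount relative to the joined text: four one-character deltas give 4, while the same text in one chunk gives 1.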
§Example
use tiktoken_stream::TokenStream;
let mut s = TokenStream::new();
s.push("Hello, ");
s.push("world!");
assert!(s.count() >= 1);
§Custom estimator
use tiktoken_stream::TokenStream;
// One token per whitespace-separated word.
let mut s = TokenStream::with_estimator(|chunk: &str| {
chunk.split_whitespace().count() as u64
});
s.push("the quick brown");
s.push(" fox jumps");
assert_eq!(s.count(), 5);
§Structs
- TokenStream - Streaming token counter.