Crate tiktoken_stream

§tiktoken-stream

Streaming token counter for partial LLM responses.

A streaming LLM response arrives as a sequence of small text chunks (the SSE deltas you get with stream=true). For UX features such as progress bars and soft caps, you often want a running token count without holding the full text in memory and without re-tokenizing the entire prefix on every chunk.

This crate is a tiny counter:

Construct with TokenStream::new for the default estimator, or with TokenStream::with_estimator and a tokenizer function (fn(&str) -> u64).
Call TokenStream::push with each delta. The stream forwards the chunk through your tokenizer and bumps the running total.
Read TokenStream::count at any time.
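
The three steps above can be sketched as a struct that holds an estimator and a running total. This is a minimal illustration, not the crate's source: SketchStream and count_deltas are hypothetical names, and only the constructor/push/count shape is taken from the description above.

```rust
/// Hypothetical stand-in for the crate's counter; not its actual source.
struct SketchStream {
    /// Assumed estimator shape, taken from the `fn(&str) -> u64` in the docs.
    estimator: fn(&str) -> u64,
    total: u64,
}

impl SketchStream {
    fn with_estimator(estimator: fn(&str) -> u64) -> Self {
        Self { estimator, total: 0 }
    }

    /// Estimate the tokens in one delta and add them to the running total.
    /// The chunk itself is never stored.
    fn push(&mut self, chunk: &str) {
        self.total += (self.estimator)(chunk);
    }

    fn count(&self) -> u64 {
        self.total
    }
}

/// Feed a sequence of deltas through the sketch and return the final count.
fn count_deltas(deltas: &[&str], estimator: fn(&str) -> u64) -> u64 {
    let mut s = SketchStream::with_estimator(estimator);
    for d in deltas {
        s.push(d);
    }
    s.count()
}
```

Because only a u64 and a function pointer are kept, memory use stays constant no matter how long the response grows, and each chunk is tokenized exactly once.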

The default estimator (4 chars per token, ceiling) is what char-token-est’s Family::Gpt uses; swap in tiktoken via TokenStream::with_estimator when accuracy matters.
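
For reference, a "4 chars per token, ceiling" rule could look like the sketch below. estimate_tokens is a hypothetical helper, and it assumes "chars" means Unicode scalar values; the crate may count bytes instead, which gives different results for non-ASCII text.

```rust
/// Hypothetical helper: estimate tokens as ceil(chars / 4).
/// Assumption: "chars" are Unicode scalar values, not bytes.
fn estimate_tokens(chunk: &str) -> u64 {
    let chars = chunk.chars().count() as u64;
    // Integer ceiling division: 0 -> 0, 1..=4 -> 1, 5..=8 -> 2, and so on.
    (chars + 3) / 4
}
```

Under this assumption, "Hello, " (7 chars) and "world!" (6 chars) each estimate to 2 tokens, and an empty chunk estimates to 0.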

§Example

use tiktoken_stream::TokenStream;

// Default estimator: 4 chars per token, rounded up.
let mut s = TokenStream::new();
s.push("Hello, ");
s.push("world!");
assert!(s.count() >= 1);

§Custom estimator

use tiktoken_stream::TokenStream;

// One token per whitespace-separated word.
let mut s = TokenStream::with_estimator(|chunk: &str| {
    chunk.split_whitespace().count() as u64
});
s.push("the quick brown");
s.push(" fox jumps");
assert_eq!(s.count(), 5);
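
One thing to keep in mind with any estimator: since each chunk goes through the tokenizer on its own, a token that straddles a chunk boundary can be counted twice. The sketch below reproduces the word-count rule from the example as a standalone function (words and streamed_total are hypothetical names) so the effect can be checked without the crate.

```rust
/// Same rule as the closure above: one token per whitespace-separated word.
fn words(chunk: &str) -> u64 {
    chunk.split_whitespace().count() as u64
}

/// Sum the per-chunk estimates, the way a push-based counter accumulates them.
fn streamed_total(chunks: &[&str]) -> u64 {
    chunks.iter().map(|c| words(c)).sum()
}
```

With these definitions, streamed_total(&["the quick brown", " fox jumps"]) is 5, matching the whole sentence, while streamed_total(&["the quick br", "own fox jumps"]) is 6 because "brown" is split across two chunks. Treat the running count as an estimate suited to progress bars and soft caps, not as an exact total.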

Structs§

TokenStream: Streaming token counter.