Trait Tokenizer 

pub trait Tokenizer: Send + Sync {
    // Required method
    fn count_text(&self, text: &str) -> usize;

    // Provided methods
    fn count_message(&self, message: &Message) -> usize { ... }
    fn count_messages(&self, messages: &[Message]) -> usize { ... }
}
Counts tokens in text and full messages.

Implementations only have to provide Self::count_text; the message-level helpers walk every block kind that contributes to what the model actually sees on the wire (text, tool calls and their JSON args, tool results, reasoning text and signatures, redacted reasoning payloads).

Implementations must be Send + Sync because consumers wrap them behind Arc<dyn Tokenizer> and use them across await boundaries (the engine, conversation history, and middlewares are all multi-task by design).
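The sketch below illustrates why the Send + Sync bound matters. It declares a local stand-in for the trait (required method only) so that it compiles on its own, and it uses plain std threads instead of an async runtime to stay within the standard library. WordTokenizer and count_in_parallel are hypothetical names introduced for illustration, not part of this crate.

```rust
use std::sync::Arc;
use std::thread;

// Local stand-in for the trait documented here, reduced to the required
// method so the sketch is self-contained.
pub trait Tokenizer: Send + Sync {
    fn count_text(&self, text: &str) -> usize;
}

// Hypothetical implementor: one token per whitespace-separated word.
struct WordTokenizer;

impl Tokenizer for WordTokenizer {
    fn count_text(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

// Counts each input on its own thread while sharing a single tokenizer.
// Moving a clone of the `Arc<dyn Tokenizer>` into `thread::spawn` compiles
// only because the trait requires `Send + Sync`.
fn count_in_parallel(tokenizer: Arc<dyn Tokenizer>, inputs: Vec<String>) -> Vec<usize> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|text| {
            let tokenizer = Arc::clone(&tokenizer);
            thread::spawn(move || tokenizer.count_text(&text))
        })
        .collect();

    // Joining in spawn order keeps the results aligned with the inputs.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Without the supertrait bound, Arc<dyn Tokenizer> would not be Send, and the call to thread::spawn (or a spawn on a multi-threaded async runtime) would be rejected at compile time.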

Required Methods

fn count_text(&self, text: &str) -> usize

Count tokens in a flat string. The only required method.
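As a rough illustration of how small an implementation can be, the sketch below estimates tokens from character count. The trait is redeclared locally (required method only) so the block compiles by itself. CharEstimateTokenizer and its chars_per_token parameter are hypothetical, and the fixed characters-per-token ratio is a heuristic assumption, not how any particular model tokenizes.

```rust
// Local stand-in for the trait documented here, reduced to the required
// method so the sketch is self-contained.
pub trait Tokenizer: Send + Sync {
    fn count_text(&self, text: &str) -> usize;
}

/// Hypothetical estimator that assumes a fixed average number of
/// characters per token.
pub struct CharEstimateTokenizer {
    chars_per_token: usize,
}

impl CharEstimateTokenizer {
    pub fn new(chars_per_token: usize) -> Self {
        // Clamp to at least 1 so the division below never divides by zero.
        Self {
            chars_per_token: chars_per_token.max(1),
        }
    }
}

impl Tokenizer for CharEstimateTokenizer {
    fn count_text(&self, text: &str) -> usize {
        // Count Unicode scalar values rather than bytes, then round up so
        // that any non-empty string costs at least one token.
        let chars = text.chars().count();
        (chars + self.chars_per_token - 1) / self.chars_per_token
    }
}
```

An estimator like this holds no mutable state, so it satisfies Send + Sync automatically. A tokenizer backed by a real vocabulary would replace the body of count_text and leave the rest unchanged.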

Provided Methods

fn count_message(&self, message: &Message) -> usize

Count tokens in a single Message, walking every block kind the provider sees. Defaults to summing block-level Self::count_text calls; override if your tokenizer has a cheaper batch path.
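The default-versus-override relationship described above might look roughly like the following. Everything here is a simplified stand-in: the Block and Message types are hypothetical and cover only two block kinds, the default body is an assumption about the "sum of block-level count_text calls" behaviour, and counting the tool name alongside its JSON args is a guess. WordTokenizer and BatchTokenizer are illustrative names.

```rust
// Hypothetical, simplified stand-ins. The real `Message` type carries more
// block kinds (tool results, reasoning text, signatures, and so on).
pub enum Block {
    Text(String),
    ToolCall { name: String, args_json: String },
}

pub struct Message {
    pub blocks: Vec<Block>,
}

pub trait Tokenizer: Send + Sync {
    fn count_text(&self, text: &str) -> usize;

    // Assumed shape of the default: one `count_text` call per textual piece
    // of each block, summed.
    fn count_message(&self, message: &Message) -> usize {
        message
            .blocks
            .iter()
            .map(|block| match block {
                Block::Text(text) => self.count_text(text),
                Block::ToolCall { name, args_json } => {
                    self.count_text(name) + self.count_text(args_json)
                }
            })
            .sum()
    }
}

// Relies on the default `count_message`.
struct WordTokenizer;

impl Tokenizer for WordTokenizer {
    fn count_text(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

// Overrides `count_message` with a batch path: gather every piece into one
// buffer and tokenize once instead of once per block.
struct BatchTokenizer;

impl Tokenizer for BatchTokenizer {
    fn count_text(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }

    fn count_message(&self, message: &Message) -> usize {
        let mut buffer = String::new();
        for block in &message.blocks {
            match block {
                Block::Text(text) => {
                    buffer.push_str(text);
                    buffer.push(' ');
                }
                Block::ToolCall { name, args_json } => {
                    buffer.push_str(name);
                    buffer.push(' ');
                    buffer.push_str(args_json);
                    buffer.push(' ');
                }
            }
        }
        self.count_text(&buffer)
    }
}
```

For the whitespace counter used here, both paths return the same total. With a real subword tokenizer, a joined buffer may tokenize differently from the sum of its parts, so an override of this kind should be checked against the per-block result.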

fn count_messages(&self, messages: &[Message]) -> usize

Count tokens in a slice of messages. The total is the budget unit that History::compact_if_needed measures against.

Implementors