Trait Tokenizer 

pub trait Tokenizer: Send + Sync {
    // Required methods
    fn count_tokens(&self, text: &str) -> Result<usize>;
    fn get_model_info(&self) -> ModelInfo;

    // Provided method
    fn encode_with_details(
        &self,
        _text: &str,
    ) -> Result<Option<Vec<TokenDetail>>> { ... }
}
A trait for tokenizing text with a specific model.

Required Methods

fn count_tokens(&self, text: &str) -> Result<usize>

Count the number of tokens in the given text.

fn get_model_info(&self) -> ModelInfo

Get information about the model.
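
A minimal sketch of an implementor that supplies only the two required methods. The definitions of ModelInfo and Result are not shown on this page, so the versions below are hypothetical stand-ins (a single assumed name field and a boxed error type). The trait is mirrored locally with just its required methods so the example compiles on its own. WhitespaceTokenizer and describe are illustrative names, not part of the crate.

```rust
// Hypothetical stand-in: the real ModelInfo fields are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelInfo {
    pub name: String, // assumed field
}

// Assumed alias: the crate's actual error type is not shown on this page.
pub type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

// Local mirror of the trait's required methods only.
pub trait Tokenizer: Send + Sync {
    fn count_tokens(&self, text: &str) -> Result<usize>;
    fn get_model_info(&self) -> ModelInfo;
}

/// Toy tokenizer: one token per whitespace-separated word.
pub struct WhitespaceTokenizer;

impl Tokenizer for WhitespaceTokenizer {
    fn count_tokens(&self, text: &str) -> Result<usize> {
        Ok(text.split_whitespace().count())
    }

    fn get_model_info(&self) -> ModelInfo {
        ModelInfo {
            name: "whitespace-demo".to_string(),
        }
    }
}

/// Builds a summary line for any implementor, accessed through a trait object.
pub fn describe(tokenizer: &dyn Tokenizer, text: &str) -> Result<String> {
    let count = tokenizer.count_tokens(text)?;
    Ok(format!("{}: {} tokens", tokenizer.get_model_info().name, count))
}
```

Because the trait requires Send + Sync, an implementor like this can also be stored behind an Arc and shared across threads.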

Provided Methods

fn encode_with_details(&self, _text: &str) -> Result<Option<Vec<TokenDetail>>>

Encode text and return detailed token information (optional, for debug mode).

Returns None if the tokenizer doesn’t support detailed tokenization. The result is used for debug output (the -vvv flag).
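
The following sketch shows how a caller might handle both outcomes of encode_with_details, and how an implementor can override it. Several things here are assumptions: the default body is taken to be Ok(None) (suggested by the unused _text parameter, but not shown on this page), and the TokenDetail fields (id, text), the ModelInfo field, and the error type behind Result are hypothetical stand-ins. PlainCounter, WordTokenizer, and debug_lines are illustrative names only.

```rust
// Assumed alias and hypothetical stand-in types; the real ones are not shown here.
pub type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;

#[derive(Debug, Clone, PartialEq)]
pub struct ModelInfo {
    pub name: String, // assumed field
}

#[derive(Debug, Clone, PartialEq)]
pub struct TokenDetail {
    pub id: usize,    // assumed field
    pub text: String, // assumed field
}

// Local mirror of the trait, with an assumed default body for the provided method.
pub trait Tokenizer: Send + Sync {
    fn count_tokens(&self, text: &str) -> Result<usize>;
    fn get_model_info(&self) -> ModelInfo;

    fn encode_with_details(&self, _text: &str) -> Result<Option<Vec<TokenDetail>>> {
        Ok(None) // assumption: "detailed tokenization not supported"
    }
}

/// Relies on the default: counts words but offers no per-token details.
pub struct PlainCounter;

impl Tokenizer for PlainCounter {
    fn count_tokens(&self, text: &str) -> Result<usize> {
        Ok(text.split_whitespace().count())
    }
    fn get_model_info(&self) -> ModelInfo {
        ModelInfo { name: "plain".to_string() }
    }
}

/// Overrides the provided method to return one detail per word.
pub struct WordTokenizer;

impl Tokenizer for WordTokenizer {
    fn count_tokens(&self, text: &str) -> Result<usize> {
        Ok(text.split_whitespace().count())
    }
    fn get_model_info(&self) -> ModelInfo {
        ModelInfo { name: "words".to_string() }
    }
    fn encode_with_details(&self, text: &str) -> Result<Option<Vec<TokenDetail>>> {
        let details = text
            .split_whitespace()
            .enumerate()
            .map(|(id, word)| TokenDetail { id, text: word.to_string() })
            .collect();
        Ok(Some(details))
    }
}

/// Produces debug lines, falling back to a plain count when no details exist.
pub fn debug_lines(tokenizer: &dyn Tokenizer, text: &str) -> Result<Vec<String>> {
    match tokenizer.encode_with_details(text)? {
        Some(details) => Ok(details
            .iter()
            .map(|d| format!("{} {}", d.id, d.text))
            .collect()),
        None => Ok(vec![format!(
            "{} tokens (no details)",
            tokenizer.count_tokens(text)?
        )]),
    }
}
```

Returning Result<Option<...>> keeps "unsupported" (None) distinct from "failed" (Err), so a debug printer can degrade gracefully without treating a missing feature as an error.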

Implementors