
Module tokenizer


Tokenizer interface for text encoding/decoding

This module provides tokenizer abstractions that are independent of any model implementation. They support incremental (streaming) decoding and multiple tokenization strategies.
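Incremental decoding matters because a single character can span several tokens, so a streaming consumer cannot simply decode each token in isolation. The sketch below illustrates the idea with a toy byte-level vocabulary. `ToyVocab` and `StreamDecoder` are hypothetical names invented for this example and are not part of this module's API; the crate's `IncrementalTokenizer` trait may work differently.

```rust
use std::collections::HashMap;

/// Hypothetical byte-level vocabulary: token id -> raw bytes.
/// (Illustration only; not a type from this module.)
pub struct ToyVocab {
    pieces: HashMap<u32, Vec<u8>>,
}

impl ToyVocab {
    pub fn new(entries: &[(u32, &[u8])]) -> Self {
        let pieces = entries.iter().map(|(id, b)| (*id, b.to_vec())).collect();
        Self { pieces }
    }
}

/// Hypothetical streaming decoder. It buffers bytes that do not yet
/// form complete UTF-8 and emits text only once it is valid.
pub struct StreamDecoder<'a> {
    vocab: &'a ToyVocab,
    pending: Vec<u8>,
}

impl<'a> StreamDecoder<'a> {
    pub fn new(vocab: &'a ToyVocab) -> Self {
        Self { vocab, pending: Vec::new() }
    }

    /// Feed one token id and return the text that became complete.
    /// Unknown ids are ignored. Bytes that are invalid (rather than
    /// merely incomplete) would stay buffered; a real implementation
    /// should handle that case explicitly.
    pub fn push(&mut self, id: u32) -> String {
        if let Some(bytes) = self.vocab.pieces.get(&id) {
            self.pending.extend_from_slice(bytes);
        }
        match std::str::from_utf8(&self.pending) {
            Ok(s) => {
                let out = s.to_owned();
                self.pending.clear();
                out
            }
            Err(e) => {
                // Emit the valid prefix and keep the incomplete tail.
                let valid = e.valid_up_to();
                let out = String::from_utf8_lossy(&self.pending[..valid]).into_owned();
                self.pending.drain(..valid);
                out
            }
        }
    }
}
```

With a vocabulary in which `é` is split across two single-byte tokens, the first of those tokens yields an empty string and the second yields the full character.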

Structs

ChatMessage
Chat message for template application
PaddingConfig
Padding configuration
TokenizerConfig
Tokenizer configuration
TokenizerInfo
Tokenizer information and metadata
TokenizerStats
Tokenizer performance statistics
TruncationConfig
Truncation configuration

Enums

PaddingDirection
Padding direction
PaddingStrategy
Padding strategies
TokenType
Token types for classification
TokenizerType
Tokenizer types/algorithms
TruncationStrategy
Truncation strategies
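To make the padding and truncation concepts concrete, here is a minimal sketch of how a direction setting might be applied to a sequence of token ids. The `Side` enum and the `truncate` and `pad` functions are assumptions made for illustration; the actual variants of `PaddingDirection`, `PaddingStrategy`, and `TruncationStrategy` are not shown on this page and may differ.

```rust
/// Hypothetical direction type (stands in for a padding or
/// truncation direction; not this module's actual enum).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Side {
    Left,
    Right,
}

/// Cut `ids` down to at most `max_len` tokens, removing tokens
/// from the given side.
pub fn truncate(ids: &[u32], max_len: usize, remove_from: Side) -> Vec<u32> {
    if ids.len() <= max_len {
        return ids.to_vec();
    }
    match remove_from {
        Side::Right => ids[..max_len].to_vec(),
        Side::Left => ids[ids.len() - max_len..].to_vec(),
    }
}

/// Extend `ids` to `target_len` tokens by adding `pad_id` on the
/// given side. Sequences that are already long enough are unchanged.
pub fn pad(ids: &[u32], target_len: usize, pad_id: u32, side: Side) -> Vec<u32> {
    if ids.len() >= target_len {
        return ids.to_vec();
    }
    let fill = vec![pad_id; target_len - ids.len()];
    match side {
        Side::Right => [ids, fill.as_slice()].concat(),
        Side::Left => [fill.as_slice(), ids].concat(),
    }
}
```

Left padding is commonly used for batched generation with decoder-only models, while right padding is the more common default for encoders; which one applies here depends on the configuration types listed above.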

Traits

AsyncTokenizer
Asynchronous tokenizer operations for I/O-bound tokenization
IncrementalTokenizer
Incremental tokenizer state for streaming
TextProcessor
Text processing utilities
Tokenizer
Core tokenizer trait for encoding/decoding operations
TokenizerCapabilities
Advanced tokenizer capabilities
TokenizerFactory
Tokenizer factory for creating tokenizer instances
TokenizerRegistry
Tokenizer registry for managing multiple tokenizers
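The relationship between a core tokenizer trait and a registry can be sketched as follows. Everything here is a stand-in written under assumptions: `SimpleTokenizer`, `ByteTokenizer`, and `Registry` are hypothetical names, and the real `Tokenizer` and `TokenizerRegistry` traits in this module almost certainly expose different and richer method sets.

```rust
use std::collections::HashMap;

/// Hypothetical minimal tokenizer trait (not this module's `Tokenizer`).
pub trait SimpleTokenizer {
    fn encode(&self, text: &str) -> Vec<u32>;
    fn decode(&self, ids: &[u32]) -> String;
}

/// Toy tokenizer that maps each UTF-8 byte to one token id.
pub struct ByteTokenizer;

impl SimpleTokenizer for ByteTokenizer {
    fn encode(&self, text: &str) -> Vec<u32> {
        text.bytes().map(u32::from).collect()
    }

    fn decode(&self, ids: &[u32]) -> String {
        // Ids above 255 are truncated to their low byte in this toy version.
        let bytes: Vec<u8> = ids.iter().map(|&id| id as u8).collect();
        String::from_utf8_lossy(&bytes).into_owned()
    }
}

/// Hypothetical registry that stores tokenizers by name behind
/// trait objects, so callers do not depend on concrete types.
#[derive(Default)]
pub struct Registry {
    entries: HashMap<String, Box<dyn SimpleTokenizer>>,
}

impl Registry {
    pub fn register(&mut self, name: &str, tokenizer: Box<dyn SimpleTokenizer>) {
        self.entries.insert(name.to_owned(), tokenizer);
    }

    pub fn get(&self, name: &str) -> Option<&dyn SimpleTokenizer> {
        // `&**b` turns `&Box<dyn SimpleTokenizer>` into `&dyn SimpleTokenizer`.
        self.entries.get(name).map(|b| &**b)
    }
}
```

Storing boxed trait objects keeps the registry decoupled from any particular tokenization algorithm, which matches the module's stated goal of separating tokenizers from model implementations.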