§Ferrum Tokenizer
MVP tokenizer implementation for the Ferrum inference stack.
This crate integrates the HuggingFace tokenizers library and implements
the tokenizer interfaces defined in ferrum-interfaces.
§Features
- HuggingFace Integration: Load tokenizers from HF Hub or local files
- Incremental Decoding: Efficient token-by-token decoding for streaming
- Chat Templates: Support for conversation formatting (basic)
- Special Tokens: Proper handling of BOS, EOS, PAD tokens
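The encode/decode round trip described above can be sketched as follows. The trait shape here is illustrative only (the real signatures live in ferrum-interfaces and may differ), and `WhitespaceTokenizer` is a toy stand-in for the HuggingFace backend:

```rust
// Hypothetical sketch of the core Tokenizer trait; actual signatures
// are defined in ferrum-interfaces and may differ.
type TokenId = u32;

trait Tokenizer {
    fn encode(&self, text: &str) -> Vec<TokenId>;
    fn decode(&self, ids: &[TokenId]) -> String;
}

// Toy whitespace tokenizer standing in for the HuggingFace backend.
struct WhitespaceTokenizer {
    vocab: Vec<String>,
}

impl Tokenizer for WhitespaceTokenizer {
    fn encode(&self, text: &str) -> Vec<TokenId> {
        // Map each whitespace-separated word to its vocabulary index;
        // out-of-vocabulary words are silently dropped in this sketch.
        text.split_whitespace()
            .filter_map(|w| self.vocab.iter().position(|v| v.as_str() == w))
            .map(|i| i as TokenId)
            .collect()
    }

    fn decode(&self, ids: &[TokenId]) -> String {
        ids.iter()
            .filter_map(|&i| self.vocab.get(i as usize).cloned())
            .collect::<Vec<_>>()
            .join(" ")
    }
}

fn main() {
    let tok = WhitespaceTokenizer {
        vocab: vec!["hello".into(), "world".into()],
    };
    let ids = tok.encode("hello world");
    assert_eq!(tok.decode(&ids), "hello world");
    println!("{:?}", ids); // [0, 1]
}
```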
Re-exports§
pub use implementations::*;
Modules§
- implementations
Structs§
- SpecialTokens - Special tokens configuration
- TokenId - Token identifier used across the inference pipeline.
- TokenizerInfo - Tokenizer information and metadata
Enums§
- TokenizerType - Tokenizer types/algorithms
Traits§
- IncrementalTokenizer - Incremental tokenizer state for streaming
- Tokenizer - Core tokenizer trait for encoding/decoding operations
- TokenizerFactory - Tokenizer factory for creating tokenizer instances
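A common way to realize incremental decoding for streaming is to decode the full token prefix at each step and emit only the newly completed text, so that a token which finishes a partial word (or multi-byte sequence) surfaces correctly. The struct and `push` method below are a hypothetical sketch, not the crate's actual `IncrementalTokenizer` API; `decode` stands in for a real tokenizer's decode function:

```rust
// Hypothetical sketch of incremental (streaming) decoding: decode the
// accumulated prefix each step and return only the newly produced text.
struct IncrementalState {
    ids: Vec<u32>,
    emitted: usize, // bytes of decoded text already sent to the caller
}

impl IncrementalState {
    fn new() -> Self {
        Self { ids: Vec::new(), emitted: 0 }
    }

    // `decode` stands in for the real tokenizer's decode function.
    fn push(&mut self, id: u32, decode: impl Fn(&[u32]) -> String) -> String {
        self.ids.push(id);
        let full = decode(&self.ids);
        // Emit only the suffix the caller has not seen yet.
        let new = full[self.emitted..].to_string();
        self.emitted = full.len();
        new
    }
}

fn main() {
    // Toy vocabulary whose tokens split words across boundaries.
    let vocab = ["He", "llo", ", wor", "ld"];
    let decode =
        |ids: &[u32]| ids.iter().map(|&i| vocab[i as usize]).collect::<String>();

    let mut st = IncrementalState::new();
    let mut out = String::new();
    for id in 0..4 {
        out.push_str(&st.push(id, &decode));
    }
    assert_eq!(out, "Hello, world");
    println!("{}", out); // Hello, world
}
```

A production implementation would additionally hold back bytes that do not yet form valid UTF-8, since a single token can end mid-character.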
Functions§
- default_factory - Default tokenizer factory using HuggingFace backend
- load_from_file - Load tokenizer from file
- load_from_hub - Load tokenizer from HuggingFace Hub
Type Aliases§
- Result
- Result type used throughout Ferrum