Expand description
sapient-tokenizers — HuggingFace-compatible tokenization.
Wraps the official HuggingFace tokenizers Rust crate, which supports:
- BPE (GPT-2, Llama, Falcon, Phi, Qwen)
- WordPiece (BERT, RoBERTa, DistilBERT)
- SentencePiece (T5, Gemma, Llama)
Also provides Jinja2 chat template rendering for chat models.
Re-exports§
pub use chat::ChatMessage;pub use chat::ChatRole;pub use chat::ChatTemplate;pub use tokenizer::SapientTokenizer;pub use tokenizer::TokenizerOptions;