Skip to main content

Crate sapient_tokenizers

Crate sapient_tokenizers 

Source
Expand description

sapient-tokenizers — HuggingFace-compatible tokenization.

Wraps the official HuggingFace tokenizers Rust crate, which supports:

  • BPE (GPT-2, Llama, Falcon, Phi, Qwen)
  • WordPiece (BERT, RoBERTa, DistilBERT)
  • SentencePiece (T5, Gemma, Llama)

Also provides Jinja2 chat template rendering for chat models.

Re-exports§

pub use chat::ChatMessage;
pub use chat::ChatRole;
pub use chat::ChatTemplate;
pub use tokenizer::SapientTokenizer;
pub use tokenizer::TokenizerOptions;

Modules§

chat
Chat template rendering using Jinja2 via minijinja.
tokenizer
SapientTokenizer — wraps the HuggingFace tokenizers crate.