sapient-tokenizers 0.2.9

HuggingFace-compatible tokenizers for SAPIENT — BPE, WordPiece, SentencePiece, chat templates
Documentation

sapient-tokenizers — HuggingFace-compatible tokenization.

Wraps the official HuggingFace tokenizers Rust crate, which supports:

  • BPE (GPT-2, Llama, Falcon, Phi, Qwen)
  • WordPiece (BERT, RoBERTa, DistilBERT)
  • SentencePiece (T5, Gemma, Llama)

Also provides Jinja2 chat template rendering for chat models.