Module tokenize

HuggingFace tokenizer wrapper.

Downloads and caches the tokenizer.json from a HuggingFace model repository using hf-hub, then loads it for fast encoding.

Functions

load_tokenizer
Load a tokenizer from a HuggingFace model repository.
tokenize_query
Tokenize a query string for embedding, truncating to model_max_tokens.
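As a rough illustration of the truncation step described for tokenize_query, the sketch below caps a list of token IDs at model_max_tokens. It uses only the standard library. The names truncate_ids, toy_encode, and tokenize_query_sketch are hypothetical and are not part of this module; toy_encode is a stand-in for the real tokenizer and simply maps each whitespace-separated word to its byte length. The actual signature and behavior of tokenize_query may differ.

```rust
/// Hypothetical helper: keep at most `model_max_tokens` token IDs,
/// dropping anything past the limit.
fn truncate_ids(mut ids: Vec<u32>, model_max_tokens: usize) -> Vec<u32> {
    // Vec::truncate is a no-op when the vector is already short enough.
    ids.truncate(model_max_tokens);
    ids
}

/// Stand-in encoder (NOT a real tokenizer): one "token" per
/// whitespace-separated word, using the word's byte length as its ID.
fn toy_encode(query: &str) -> Vec<u32> {
    query.split_whitespace().map(|w| w.len() as u32).collect()
}

/// Hypothetical sketch of the encode-then-truncate flow.
fn tokenize_query_sketch(query: &str, model_max_tokens: usize) -> Vec<u32> {
    truncate_ids(toy_encode(query), model_max_tokens)
}
```

Calling tokenize_query_sketch("how do tokenizers work", 2) keeps only the IDs for the first two words. A real tokenizer may append special tokens at the end of the sequence, which plain tail truncation like this could drop; the sketch ignores that case.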