pub fn load_tokenizer(
tokenizer_path: Option<&Path>,
vocab_size: usize,
) -> Result<BoxedTokenizer>Expand description
Load the best available tokenizer for Hydra.
Attempts to load in order:
- Llama 3 tokenizer from the specified path
- Fallback tokenizer with specified vocab size
§Arguments
tokenizer_path- Optional path totokenizer.jsonvocab_size- Fallback vocab size if no tokenizer found
§Example
ⓘ
let tokenizer = load_tokenizer(Some("./models/hydra/tokenizer.json"), 128000)?;