pub fn count_tokens(text: &str) -> usize
Returns the exact cl100k_base (OpenAI tiktoken) token count of text.
This is a deliberately conservative proxy for the
qwen/qwen3-embedding-8b tokenizer used by the OpenRouter embedding
backend: cl100k_base generally emits at least as many tokens as Qwen’s
BPE for the same input, so a count comfortably under the model’s
~32K-token effective ceiling makes it very likely that the input fits
Qwen’s window.
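
A caller might turn the "comfortably under the ceiling" rule into an explicit check. The sketch below is illustrative only: the constant names, the 32,000-token figure standing in for "~32K", the 2,000-token margin, and the `fits_embedding_window` helper are all assumptions, not part of this crate.

```rust
/// Assumed stand-in for the model's ~32K-token effective ceiling.
const QWEN_WINDOW_TOKENS: usize = 32_000;

/// Hypothetical headroom kept because cl100k_base is only a proxy for Qwen's BPE.
const SAFETY_MARGIN_TOKENS: usize = 2_000;

/// Returns true when a cl100k_base count (for example, the value returned by
/// `count_tokens`) leaves at least the chosen headroom below the assumed ceiling.
pub fn fits_embedding_window(cl100k_count: usize) -> bool {
    cl100k_count <= QWEN_WINDOW_TOKENS - SAFETY_MARGIN_TOKENS
}
```

With these assumed values, a count of 30,000 passes and 30,001 is rejected; a real caller would pick the margin to match how much divergence between the two tokenizers it is willing to tolerate.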
Unlike approx_tokens, this is exact for arbitrary input. It uses the
process-wide cached BPE singleton, so repeated calls do not re-initialise
the tokenizer.
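
The caching behaviour described above can be sketched with `std::sync::OnceLock`. This is not the crate's actual implementation: `StandInTokenizer`, `INIT_COUNT`, `tokenizer`, and `count_tokens_sketch` are hypothetical names, and the stand-in splits on whitespace rather than applying cl100k_base BPE. It only shows how a process-wide singleton is built once and then reused.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the stand-in tokenizer has been constructed.
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the real BPE. It splits on whitespace,
/// which is NOT how cl100k_base tokenizes.
struct StandInTokenizer;

impl StandInTokenizer {
    fn new() -> Self {
        // Record each construction so the reuse is observable.
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        StandInTokenizer
    }

    fn count(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

/// Returns the process-wide instance, building it on first use only.
fn tokenizer() -> &'static StandInTokenizer {
    static TOKENIZER: OnceLock<StandInTokenizer> = OnceLock::new();
    TOKENIZER.get_or_init(StandInTokenizer::new)
}

/// Same shape as `count_tokens`, backed by the cached stand-in.
pub fn count_tokens_sketch(text: &str) -> usize {
    tokenizer().count(text)
}
```

However many times `count_tokens_sketch` is called, `INIT_COUNT` stays at 1, because `OnceLock::get_or_init` runs the constructor only for the first caller.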