High-performance pure-Rust BPE tokenizer compatible with OpenAI’s tiktoken.
Supports all tiktoken encodings (cl100k_base, o200k_base, p50k_base, p50k_edit,
r50k_base) with token encoding, decoding, counting, and multi-provider pricing.
§Quick Start
// by encoding name
let enc = tiktoken::get_encoding("cl100k_base").unwrap();
let tokens = enc.encode("hello world");
let text = enc.decode_to_string(&tokens).unwrap();
assert_eq!(text, "hello world");
// by model name
let enc = tiktoken::encoding_for_model("gpt-4o").unwrap();
let count = enc.count("hello world");
assert_eq!(count, 2);
Modules§
- pricing - Per-model pricing data and cost estimation for OpenAI, Anthropic Claude, and Google Gemini.
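To make the cost-estimation idea concrete, here is a minimal sketch of the arithmetic involved. It does not use the pricing module's API, which is not shown on this page. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes rates are quoted in USD per one million tokens. Any rate passed in is a placeholder, not a real price.

```rust
/// Hypothetical helper (not part of this crate): estimate the cost in USD of
/// `tokens` tokens, assuming the rate is quoted per one million tokens.
fn estimate_cost_usd(tokens: u64, usd_per_million: f64) -> f64 {
    // Scale the per-million rate down to the actual token count.
    tokens as f64 * usd_per_million / 1_000_000.0
}
```

In practice the token count would come from something like `enc.count(text)`, as in the Quick Start above, and the rate from whatever pricing table is in use.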
Structs§
- CoreBpe - A Byte Pair Encoding tokenizer engine.
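For readers new to Byte Pair Encoding, the following toy sketch shows the core merge loop. It is not `CoreBpe`'s implementation, which is presumably far more optimized. The function `toy_bpe` and the `ranks` table are hypothetical. The sketch assumes a vocabulary that maps byte sequences to merge ranks, where a lower rank merges first.

```rust
use std::collections::HashMap;

/// Toy byte-level BPE (illustrative only): repeatedly merge the adjacent pair
/// whose concatenation has the lowest rank, until no pair is in `ranks`.
fn toy_bpe(input: &[u8], ranks: &HashMap<Vec<u8>, u32>) -> Vec<Vec<u8>> {
    // Start with one part per input byte.
    let mut parts: Vec<Vec<u8>> = input.iter().map(|&b| vec![b]).collect();
    loop {
        // Scan adjacent pairs for the mergeable one with the lowest rank.
        let mut best: Option<(usize, u32)> = None;
        for i in 0..parts.len().saturating_sub(1) {
            let mut joined = parts[i].clone();
            joined.extend_from_slice(&parts[i + 1]);
            if let Some(&rank) = ranks.get(&joined) {
                if best.map_or(true, |(_, r)| rank < r) {
                    best = Some((i, rank));
                }
            }
        }
        match best {
            // Merge the winning pair in place and rescan.
            Some((i, _)) => {
                let right = parts.remove(i + 1);
                parts[i].extend_from_slice(&right);
            }
            // No mergeable pair is left: the parts are the final tokens.
            None => return parts,
        }
    }
}
```

A real tokenizer would then map each resulting byte sequence to its integer token id. This sketch rescans every pair after each merge, which is quadratic in the input length and only suitable for illustration.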
Functions§
- encoding_for_model - Get a cached tokenizer by OpenAI model name.
- get_encoding - Get a cached tokenizer by encoding name.
- model_to_encoding - Map a model name to its encoding name.