Pure-Rust core for toklab. Thin wrapper around tiktoken-rs that adds:
- Bulk APIs (count_many, optional rayon parallelism). The win over the Python tiktoken package is in long lists, where Python interpreter overhead dominates.
- Length-budgeting helpers (fits, truncate_to) so the common patterns are one call instead of three.
- Model-name lookup that maps OpenAI model names to encodings via tiktoken_rs::get_bpe_from_model.
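The bulk and length-budgeting helpers listed above can be illustrated with a minimal, std-only sketch. It does not reproduce this crate's actual signatures: WordTokenizer is a hypothetical stand-in that treats each whitespace-separated word as one token, whereas the real crate counts BPE tokens. The method names count_many, fits, and truncate_to come from the list above; their parameters and return types here are assumptions.

```rust
/// Hypothetical stand-in tokenizer: one token per whitespace-separated word.
/// (Illustration only; the real crate wraps a BPE encoding.)
pub struct WordTokenizer;

impl WordTokenizer {
    /// Split text into tokens.
    pub fn encode<'a>(&self, text: &'a str) -> Vec<&'a str> {
        text.split_whitespace().collect()
    }

    /// Number of tokens in `text`.
    pub fn count(&self, text: &str) -> usize {
        self.encode(text).len()
    }

    /// Bulk counting: one call for a whole slice of inputs.
    /// (Assumed signature; a parallel variant could swap in rayon's `par_iter`.)
    pub fn count_many(&self, texts: &[&str]) -> Vec<usize> {
        texts.iter().map(|t| self.count(t)).collect()
    }

    /// True if `text` needs at most `budget` tokens.
    pub fn fits(&self, text: &str, budget: usize) -> bool {
        self.count(text) <= budget
    }

    /// Encode, keep at most `budget` tokens, and decode again:
    /// the three steps folded into one call.
    pub fn truncate_to(&self, text: &str, budget: usize) -> String {
        let tokens = self.encode(text);
        let keep = tokens.len().min(budget);
        tokens[..keep].join(" ")
    }
}
```

With this stand-in, WordTokenizer.truncate_to("a b c d", 2) yields "a b", and a text that is already within budget comes back with its tokens unchanged.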
Encodings supported out of the box: cl100k_base (GPT-3.5, GPT-4,
text-embedding-3-*) and o200k_base (GPT-4o family).
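The model-to-encoding mapping stated above can be sketched as a plain prefix lookup. This is an illustration only: the crate delegates the lookup to tiktoken_rs::get_bpe_from_model, and the function name encoding_for_model and its prefix rules are assumptions covering just the model families named here.

```rust
/// Map an OpenAI model name to the name of its encoding.
/// Hypothetical helper covering only the families listed in this page.
pub fn encoding_for_model(model: &str) -> Option<&'static str> {
    // Check the GPT-4o prefix first: "gpt-4o" also starts with "gpt-4".
    if model.starts_with("gpt-4o") {
        Some("o200k_base")
    } else if model.starts_with("gpt-4")
        || model.starts_with("gpt-3.5")
        || model.starts_with("text-embedding-3-")
    {
        Some("cl100k_base")
    } else {
        // Unknown model: the caller decides how to report it.
        None
    }
}
```

The order of the checks matters; testing "gpt-4" before "gpt-4o" would send the GPT-4o family to cl100k_base.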
Structs§
- Tokenizer
- Wraps a CoreBPE for one specific encoding.
Enums§
- TokenizerError
- All errors surfaced by toklab-core.
Type Aliases§
- Result
- Crate-wide result alias.
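A crate-wide error enum paired with a result alias is a common Rust pattern, and a minimal sketch of it may help when reading the items above. The variant UnknownModel, the module name errors, and the helper require_known are all hypothetical; this page does not list the real variants of TokenizerError.

```rust
pub mod errors {
    use std::fmt;

    /// Illustrative error enum; the single variant is an assumption.
    #[derive(Debug, PartialEq)]
    pub enum TokenizerError {
        /// Hypothetical: the model name has no known encoding.
        UnknownModel(String),
    }

    impl fmt::Display for TokenizerError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                TokenizerError::UnknownModel(m) => write!(f, "unknown model: {m}"),
            }
        }
    }

    impl std::error::Error for TokenizerError {}

    /// Result alias: callers write `Result<T>` instead of
    /// `std::result::Result<T, TokenizerError>`.
    pub type Result<T> = std::result::Result<T, TokenizerError>;

    /// Hypothetical helper showing the alias in a signature.
    pub fn require_known(model: &str) -> Result<&'static str> {
        match model {
            "gpt-4o" => Ok("o200k_base"),
            "gpt-4" => Ok("cl100k_base"),
            other => Err(TokenizerError::UnknownModel(other.to_string())),
        }
    }
}
```

Implementing std::error::Error lets callers propagate the error with the ? operator into a boxed error type.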