Skip to main content

Crate toklab_core

Crate toklab_core

Expand description

Pure-Rust core for toklab. Thin wrapper around tiktoken-rs that adds:

Bulk APIs (count_many, optional rayon parallelism) — the win over pure-Python tiktoken is in long lists where Python interpreter overhead dominates.
Length-budgeting helpers (fits, truncate_to) so the common patterns are one call instead of three.
Model-name lookup that maps OpenAI model names to encodings via tiktoken_rs::get_bpe_from_model.

Encodings supported out of the box: cl100k_base (GPT-3.5, GPT-4, text-embedding-3-*) and o200k_base (GPT-4o family).

Structs§

Tokenizer: Wraps a CoreBPE for one specific encoding.

Enums§

TokenizerError: All errors surfaced by toklab-core.

Type Aliases§

Result: Crate-wide result alias.