Skip to main content

Crate toklab_core

Crate toklab_core 

Source
Expand description

Pure-Rust core for toklab. Thin wrapper around tiktoken-rs that adds:

  • Bulk APIs (count_many, optional rayon parallelism) — the win over pure-Python tiktoken is in long lists where Python interpreter overhead dominates.
  • Length-budgeting helpers (fits, truncate_to) so the common patterns are one call instead of three.
  • Model-name lookup that maps OpenAI model names to encodings via tiktoken_rs::get_bpe_from_model.

Encodings supported out of the box: cl100k_base (GPT-3.5, GPT-4, text-embedding-3-*) and o200k_base (GPT-4o family).

Structs§

Tokenizer
Wraps a CoreBPE for one specific encoding.

Enums§

TokenizerError
All errors surfaced by toklab-core.

Type Aliases§

Result
Crate-wide result alias.