
Crate tiktoken_rs


§tiktoken-rs


Rust library for tokenizing text with OpenAI models using tiktoken.

This library provides ready-made tokenizers for working with GPT, tiktoken, and related OpenAI models. Use cases include tokenizing text and counting the tokens in text inputs.

This library is built on top of the tiktoken library and includes some additional features and enhancements for ease of use with Rust code.

Supports all current OpenAI models including GPT-5.4, GPT-5, GPT-4.1, GPT-4o, o1, o3, o4-mini, and gpt-oss models.

Scope: This crate is focused on OpenAI tokenizers (tiktoken). For non-OpenAI models (Llama, Gemini, Mistral, etc.), use the HuggingFace tokenizers crate.

§Examples

For full working examples for all supported features, see the examples directory in the repository.

§Usage

  1. Install this crate locally with cargo
cargo add tiktoken-rs

Then call the API from your Rust code.

§Counting token length

use tiktoken_rs::o200k_base;

let bpe = o200k_base().unwrap();
let tokens = bpe.encode_with_special_tokens(
  "This is a sentence   with spaces"
);
println!("Token count: {}", tokens.len());

For repeated calls, use the singleton to avoid re-initializing the tokenizer:

use tiktoken_rs::o200k_base_singleton;

let bpe = o200k_base_singleton();
let tokens = bpe.encode_with_special_tokens(
  "This is a sentence   with spaces"
);
println!("Token count: {}", tokens.len());

§Counting max_tokens parameter for a chat completion request

use tiktoken_rs::{get_chat_completion_max_tokens, ChatCompletionRequestMessage};

let messages = vec![
    ChatCompletionRequestMessage {
        content: Some("You are a helpful assistant that only speaks French.".to_string()),
        role: "system".to_string(),
        ..Default::default()
    },
    ChatCompletionRequestMessage {
        content: Some("Hello, how are you?".to_string()),
        role: "user".to_string(),
        ..Default::default()
    },
    ChatCompletionRequestMessage {
        content: Some("Parlez-vous francais?".to_string()),
        role: "system".to_string(),
        ..Default::default()
    },
];
let max_tokens = get_chat_completion_max_tokens("o1-mini", &messages).unwrap();
println!("max_tokens: {}", max_tokens);

§Counting max_tokens parameter for a chat completion request with async-openai

This requires enabling the async-openai feature in your Cargo.toml file.

use tiktoken_rs::async_openai::get_chat_completion_max_tokens;
use async_openai::types::chat::{
    ChatCompletionRequestMessage, ChatCompletionRequestSystemMessage,
    ChatCompletionRequestSystemMessageContent, ChatCompletionRequestUserMessage,
    ChatCompletionRequestUserMessageContent,
};

let messages = vec![
    ChatCompletionRequestMessage::System(ChatCompletionRequestSystemMessage {
        content: ChatCompletionRequestSystemMessageContent::Text(
            "You are a helpful assistant that only speaks French.".to_string(),
        ),
        name: None,
    }),
    ChatCompletionRequestMessage::User(ChatCompletionRequestUserMessage {
        content: ChatCompletionRequestUserMessageContent::Text(
            "Hello, how are you?".to_string(),
        ),
        name: None,
    }),
];
let max_tokens = get_chat_completion_max_tokens("o1-mini", &messages).unwrap();
println!("max_tokens: {}", max_tokens);

tiktoken supports these encodings used by OpenAI models:

Encoding name        OpenAI models
o200k_harmony        gpt-oss-20b, gpt-oss-120b
o200k_base           GPT-5 series, o1/o3/o4 series, gpt-4o, gpt-4.5, gpt-4.1, codex-*
cl100k_base          gpt-4, gpt-3.5-turbo, text-embedding-ada-002, text-embedding-3-*
p50k_base            Code models, text-davinci-002, text-davinci-003
p50k_edit            Edit models like text-davinci-edit-001, code-davinci-edit-001
r50k_base (or gpt2)  GPT-3 models like davinci

§Context sizes

Model                                                        Context window
gpt-5.4, gpt-5.4-pro                                         1,050,000
gpt-4.1, gpt-4.1-mini, gpt-4.1-nano                          1,047,576
gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.4-mini, gpt-5.4-nano    400,000
o1, o3, o3-mini, o3-pro, o4-mini                             200,000
codex-mini                                                   200,000
gpt-oss                                                      131,072
gpt-4o, gpt-4o-mini                                          128,000
o1-mini, gpt-5.3-codex-spark                                 128,000
gpt-3.5-turbo                                                16,385
gpt-4                                                        8,192

See the examples in the repository for more use cases. For more context on the different tokenizers, see the OpenAI Cookbook.

§Encountered any bugs?

If you encounter any bugs or have any suggestions for improvements, please open an issue on the repository.

§Acknowledgements

Thanks to @spolu for the original code and the .tiktoken files.

§License

This project is licensed under the MIT License.

Modules§

model
Contains information about OpenAI models.
tokenizer
Lists the available tokenizers for different OpenAI models.

Structs§

ChatCompletionRequestMessage
CoreBPE
DecodeKeyError
FunctionCall
The name and arguments of a function that should be called, as generated by the model.

Constants§

ENDOFPROMPT
ENDOFTEXT
FIM_MIDDLE
FIM_PREFIX
FIM_SUFFIX
O200K_BASE_PAT_STR

Traits§

FromRank
Lossless conversion from Rank (u32) to a wider integer type.

Functions§

bpe_for_model
Returns a cached reference to the BPE tokenizer for the given model name.
bpe_for_tokenizer
Returns a cached reference to the BPE tokenizer for the given tokenizer type.
byte_pair_split
cl100k_base
Use for ChatGPT models and text-embedding-ada-002. Initializes and returns a new instance of the cl100k_base tokenizer.
cl100k_base_singleton
Returns a singleton instance of the cl100k_base tokenizer. Use for ChatGPT models and text-embedding-ada-002.
get_bpe_from_modelDeprecated
Use bpe_for_model instead.
get_bpe_from_tokenizerDeprecated
Use bpe_for_tokenizer instead.
get_chat_completion_max_tokens
Calculates the maximum number of tokens available for chat completion based on the model and messages provided.
get_completion_max_tokensDeprecated
Use get_text_completion_max_tokens instead.
get_text_completion_max_tokens
Returns the maximum number of tokens available for a text completion, given a model and prompt.
num_tokens_from_messages
Based on https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
o200k_base
Use for GPT-5, GPT-4.1, GPT-4o, and o-series models like o1, o3, and o4. Initializes and returns a new instance of the o200k_base tokenizer.
o200k_base_singleton
Returns a singleton instance of the o200k_base tokenizer. Use for GPT-5, GPT-4.1, GPT-4o, and o-series models like o1, o3, and o4.
o200k_harmony
Use for gpt-oss models like gpt-oss-20b and gpt-oss-120b. Initializes and returns a new instance of the o200k_harmony tokenizer.
o200k_harmony_singleton
Returns a singleton instance of the o200k_harmony tokenizer. Use for gpt-oss models like gpt-oss-20b and gpt-oss-120b.
p50k_base
Use for code models, text-davinci-002, and text-davinci-003. Initializes and returns a new instance of the p50k_base tokenizer.
p50k_base_singleton
Returns a singleton instance of the p50k_base tokenizer. Use for code models, text-davinci-002, and text-davinci-003.
p50k_edit
Use for edit models like text-davinci-edit-001 and code-davinci-edit-001. Initializes and returns a new instance of the p50k_edit tokenizer.
p50k_edit_singleton
Returns a singleton instance of the p50k_edit tokenizer. Use for edit models like text-davinci-edit-001 and code-davinci-edit-001.
r50k_base
Use for GPT-3 models like davinci. Initializes and returns a new instance of the r50k_base tokenizer (also known as gpt2).
r50k_base_singleton
Returns a singleton instance of the r50k_base tokenizer (also known as gpt2). Use for GPT-3 models like davinci.

Type Aliases§

Rank