Function estimate_tokens 

pub fn estimate_tokens(text: &str) -> usize

Estimate the number of tokens in a text using tiktoken.

This uses the cl100k_base encoding, which is used by:

  • GPT-4
  • GPT-3.5-turbo
  • text-embedding-ada-002
  • text-embedding-3-small/large

GPT-4o and GPT-4o-mini use the o200k_base encoding instead, so counts for those models are only approximate.

Example

use vectorless::domain::estimate_tokens;

assert_eq!(estimate_tokens(""), 0);
assert!(estimate_tokens("hello world") > 0);
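
A token estimate is typically used to keep text within a model's context budget. The following is a minimal sketch of that pattern, not part of the crate's API: `fits_budget` and `pack_lines` are hypothetical helpers, and `estimate_tokens_stub` is a stand-in that counts whitespace-separated words so the example compiles without `vectorless`. In real code you would presumably pass `vectorless::domain::estimate_tokens` as the estimator instead.

```rust
/// Hypothetical stand-in for `vectorless::domain::estimate_tokens`.
/// It counts whitespace-separated words, which is NOT how tiktoken tokenizes.
fn estimate_tokens_stub(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Returns true when `text` fits within `budget` tokens according to `estimate`.
fn fits_budget(text: &str, budget: usize, estimate: impl Fn(&str) -> usize) -> bool {
    estimate(text) <= budget
}

/// Greedily groups lines into chunks whose summed estimates stay within `budget`.
/// A single line that exceeds the budget still gets a chunk of its own.
fn pack_lines<'a>(
    lines: &[&'a str],
    budget: usize,
    estimate: impl Fn(&str) -> usize,
) -> Vec<Vec<&'a str>> {
    let mut chunks = Vec::new();
    let mut current: Vec<&'a str> = Vec::new();
    let mut used = 0usize;

    for &line in lines {
        let cost = estimate(line);
        // Start a new chunk when adding this line would overflow the budget.
        if !current.is_empty() && used + cost > budget {
            chunks.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(line);
        used += cost;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    assert!(fits_budget("hello world", 2, estimate_tokens_stub));

    let chunks = pack_lines(&["a b", "c d e", "f"], 4, estimate_tokens_stub);
    println!("{} chunks: {:?}", chunks.len(), chunks);
}
```

Summing per-line estimates is itself an approximation: a real tokenizer may produce a slightly different count for joined text than for the pieces counted separately, so leaving some headroom in the budget is a reasonable precaution.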