kitoken 0.11.0

Fast tokenizer for language models, supporting BPE, Unigram and WordPiece tokenization

Very little structured metadata is currently available to build this page from. Check the main library docs, the README, or Cargo.toml in case the author documented the features there.

This version has 20 feature flags, 13 of them enabled by default.

default

convert (default)

multiversion (default)

normalization (default)

regex-perf (default)

serialization (default)

std (default)

convert-detect (default)

convert-sentencepiece (default)

convert-tekken (default)

convert-tiktoken (default)

convert-tokenizers (default)

normalization-charsmap (default)

normalization-unicode (default)

all

regex-onig

regex-unicode

split

split-unicode-script

unstable

This feature flag does not enable additional features.

web
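
As a rough illustration of how these flags are usually selected, the Cargo.toml fragment below turns off the default feature set and opts back into a small subset. The feature names come from the list above. The particular combination is an assumption chosen for illustration: this page does not say what each flag does or which combinations the crate supports.

```toml
[dependencies]
# Illustrative selection only: disable the default features, then enable a
# few flags by name. Whether this exact combination builds is an assumption.
kitoken = { version = "0.11.0", default-features = false, features = ["std", "serialization", "convert-tiktoken"] }
```

Leaving out `default-features = false` keeps every default flag enabled and adds the listed ones on top.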