Module mako::tokenization

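Handles all tokenization (splitting strings into tokens) and untokenization (joining tokens back into strings), with alphabet, whitespace, BPE, and WordPiece variants.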

Structs

Functions

Tokenizes the string into a vector of its individual letters

Tokenizes strings using BPE tokenization

Tokenizes strings by splitting at whitespace

Tokenizes strings using WordPiece tokenization

Untokenize alphabet tokens

Untokenize BPE tokens

Untokenize space-separated tokens

Untokenize WordPiece tokens

Loads the BPE tokenizer

Loads the WordPiece tokenizer

Tokenizes the string into a vector of its individual letters

Tokenizes string using BPE tokenization

Tokenizes strings by splitting at whitespace

Tokenizes string using WordPiece tokenization

Untokenize alphabet tokens

Untokenize BPE tokens

Untokenize space-separated tokens

Untokenize WordPiece tokens
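
Examples

The listing above does not show item names or signatures, so the following is only a minimal sketch of what the alphabet and whitespace variants describe. Every function name here (each ending in `_sketch`) is hypothetical and is not part of mako's API; only standard-library calls are used. The BPE and WordPiece variants are omitted because they depend on a loaded vocabulary.

```rust
/// Hypothetical helper (not mako's API): one token per Unicode scalar value.
fn tokenize_alphabet_sketch(text: &str) -> Vec<String> {
    text.chars().map(|c| c.to_string()).collect()
}

/// Hypothetical helper: concatenate letter tokens back into a string.
fn untokenize_alphabet_sketch(tokens: &[String]) -> String {
    tokens.concat()
}

/// Hypothetical helper: split at any run of Unicode whitespace.
fn tokenize_spaces_sketch(text: &str) -> Vec<String> {
    text.split_whitespace().map(str::to_string).collect()
}

/// Hypothetical helper: join tokens with a single space.
/// The original spacing (tabs, repeated spaces) is not recovered.
fn untokenize_spaces_sketch(tokens: &[String]) -> String {
    tokens.join(" ")
}

/// Hypothetical helper: apply the whitespace tokenizer to several strings.
fn batch_tokenize_spaces_sketch(texts: &[&str]) -> Vec<Vec<String>> {
    texts.iter().map(|t| tokenize_spaces_sketch(t)).collect()
}
```

Under these assumptions the alphabet pair round-trips exactly, while the whitespace pair is lossy: `"hello   big world"` comes back as `"hello big world"` because runs of whitespace collapse to a single space.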