Available on crate feature tokenize only.
Splits input into an array of strings, separated by opinionated TokenType. tokenize_detailed returns an array containing { TokenType, String } instead of String.
Example
use wana_kana::tokenize::*;
let empty: Vec<String> = vec![];
assert_eq!(tokenize(""), empty);
assert_eq!(tokenize("ふふフフ"), vec!["ふふ", "フフ"]);
assert_eq!(tokenize("感じ"), vec!["感", "じ"]);
assert_eq!(tokenize("私は悲しい"), vec!["私", "は", "悲", "しい"] );
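The splitting behaviour in the example above can be approximated with a standard-library-only sketch that starts a new token whenever the script of the current character changes. This is an illustration of the idea, not the crate's implementation: the names `Script`, `classify`, and `split_by_script` are hypothetical, the Unicode ranges are a simplification, and the real TokenType has its own, more opinionated set of categories.

```rust
/// Hypothetical script classes for this sketch; NOT the crate's `TokenType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Script {
    Hiragana,
    Katakana,
    Kanji,
    Other,
}

/// Classifies a character by a simplified set of Unicode ranges (an assumption).
fn classify(c: char) -> Script {
    match c {
        '\u{3041}'..='\u{3096}' => Script::Hiragana,
        '\u{30A1}'..='\u{30FC}' => Script::Katakana,
        '\u{4E00}'..='\u{9FFF}' => Script::Kanji,
        _ => Script::Other,
    }
}

/// Groups consecutive characters of the same script into one token.
fn split_by_script(input: &str) -> Vec<(Script, String)> {
    let mut tokens: Vec<(Script, String)> = Vec::new();
    for c in input.chars() {
        let script = classify(c);
        // Extend the previous token while the script stays the same.
        if let Some((last, text)) = tokens.last_mut() {
            if *last == script {
                text.push(c);
                continue;
            }
        }
        // Otherwise start a new token.
        tokens.push((script, c.to_string()));
    }
    tokens
}
```

Under these assumptions, `split_by_script("私は悲しい")` yields the four pieces 私, は, 悲, しい, each paired with its script class, which mirrors the { TokenType, String } shape described for tokenize_detailed.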
Enums
The tokenizer assigns each token a TokenType.
Functions
Tokenizes the text and returns the token for each type.
Tokenizes the text. Splits input into an array of strings, separated by opinionated TokenType.