Module wana_kana::tokenize
Splits the input into a vector of Kanji, Hiragana, Katakana, and Romaji tokens.
Tokens are split by script only; the input is not split into parts of speech!
Example
use wana_kana::tokenize::*;
use wana_kana::Options;

let empty: Vec<String> = vec![];
assert_eq!(tokenize(""), empty);
assert_eq!(tokenize("ふふフフ"), vec!["ふふ", "フフ"]);
assert_eq!(tokenize("感じ"), vec!["感", "じ"]);
assert_eq!(tokenize("私は悲しい"), vec!["私", "は", "悲", "しい"]);
assert_eq!(
    tokenize("what the...私は「悲しい」。"),
    vec!["what the...", "私", "は", "「", "悲", "しい", "」。"]
);
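To illustrate what splitting by script (rather than by part of speech) means, here is a minimal standalone sketch that groups consecutive characters of the same script into one token. It is not the crate's implementation: the `Script` enum, `classify`, and `tokenize_by_script` are hypothetical names introduced for this illustration, and the Unicode ranges are a simplifying assumption that may not match the ranges wana_kana actually uses.

```rust
/// Hypothetical script classes for this sketch (not part of wana_kana).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Script {
    Hiragana,
    Katakana,
    Kanji,
    JapanesePunctuation,
    Other, // Latin letters, spaces, ASCII punctuation, and anything else
}

/// Classifies one character using simplified Unicode ranges (an assumption).
fn classify(c: char) -> Script {
    match c {
        '\u{3041}'..='\u{3096}' => Script::Hiragana,
        '\u{30A1}'..='\u{30FC}' => Script::Katakana,
        '\u{4E00}'..='\u{9FAF}' => Script::Kanji,
        '\u{3000}'..='\u{303F}' => Script::JapanesePunctuation,
        _ => Script::Other,
    }
}

/// Groups consecutive characters that share a script class into one token.
fn tokenize_by_script(input: &str) -> Vec<String> {
    let mut tokens: Vec<String> = Vec::new();
    let mut previous: Option<Script> = None;
    for c in input.chars() {
        let current = classify(c);
        if previous == Some(current) {
            // Same script as the previous character: extend the last token.
            if let Some(last) = tokens.last_mut() {
                last.push(c);
            }
        } else {
            // Script changed (or first character): start a new token.
            tokens.push(c.to_string());
        }
        previous = Some(current);
    }
    tokens
}
```

Under these assumptions, 悲しい becomes the two tokens 悲 and しい because the script changes from Kanji to Hiragana, even though it is a single word grammatically.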
Functions
tokenize