Function wana_kana::tokenize::tokenize_detailed
Available on crate feature tokenize only.
Tokenizes the text and returns each token paired with its type.
Example
use wana_kana::tokenize::*;
assert_eq!(
    tokenize_detailed(
        "5romaji here...!?漢字ひらがなカタ　カナ４「ＳＨＩＯ」。！ لنذهب",
        true
    ),
    vec![
        (TokenType::Other, "5".to_string()),
        (TokenType::En, "romaji here".to_string()),
        (TokenType::Other, "...!?".to_string()),
        (TokenType::Ja, "漢字ひらがなカタ　カナ".to_string()),
        (TokenType::Other, "４「".to_string()),
        (TokenType::Ja, "ＳＨＩＯ".to_string()),
        (TokenType::Other, "」。！".to_string()),
        (TokenType::En, " ".to_string()),
        (TokenType::Other, "لنذهب".to_string()),
    ]
);
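Because the result is a list of (type, text) pairs, a common follow-up is to pick out the tokens of one type. The sketch below shows one way to do that. To stay self-contained it does not call the crate: TokenType here is a local stand-in limited to the three variants that appear in the example, and values_of_type is a hypothetical helper, not part of wana_kana. In real code the input would presumably be the value returned by tokenize_detailed.

```rust
// Local stand-in for wana_kana::tokenize::TokenType (assumption: only the
// three variants shown in the example above are modelled here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenType {
    En,
    Ja,
    Other,
}

// Hypothetical helper: collect the text of every token whose type matches
// `wanted`, preserving the original order.
fn values_of_type(tokens: &[(TokenType, String)], wanted: TokenType) -> Vec<&str> {
    tokens
        .iter()
        .filter(|(kind, _)| *kind == wanted)
        .map(|(_, text)| text.as_str())
        .collect()
}

fn main() {
    // Hand-written data in the same shape as the example's expected output.
    let tokens = vec![
        (TokenType::Other, "5".to_string()),
        (TokenType::En, "romaji here".to_string()),
        (TokenType::Other, "...!?".to_string()),
        (TokenType::Ja, "漢字ひらがな".to_string()),
    ];

    // Keep only the Japanese tokens.
    let japanese = values_of_type(&tokens, TokenType::Ja);
    println!("{:?}", japanese);
}
```

Borrowing the text as &str avoids cloning each token; the returned slices stay valid for as long as the token list does.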