Function zoea::nlp::text_tokens
pub fn text_tokens(text: &str) -> Vec<String>
text_tokens
This function takes a string slice and produces a vector of stemmed words. Here is what happens under the hood:
- "funky" non-alphanumeric characters are removed
- everything is converted to lower case
- the string slice is split into words (split on whitespace)
- the "stem" of each word is taken using rust_stemmers
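The steps above can be sketched roughly as follows. This is an illustrative outline, not zoea's actual source: the function name `tokenize_with` is hypothetical, the rule for "funky" characters (keep only alphanumerics and whitespace) is an assumption, and the stemmer is passed in as a closure so the sketch compiles without the rust_stemmers crate.

```rust
/// Hypothetical sketch of the pipeline described above (not zoea's real code).
/// `stem` stands in for the rust_stemmers call that the crate uses.
fn tokenize_with<F>(text: &str, stem: F) -> Vec<String>
where
    F: Fn(&str) -> String,
{
    // Step 1 (assumed rule): drop anything that is neither alphanumeric nor whitespace.
    let cleaned: String = text
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect();

    // Step 2: convert everything to lower case.
    let lowered = cleaned.to_lowercase();

    // Steps 3 and 4: split on whitespace, then stem each word.
    lowered.split_whitespace().map(|w| stem(w)).collect()
}

fn main() {
    // Placeholder stemmer that returns each word unchanged.
    let tokens = tokenize_with("I walked to San Diego slowly today!", |w| w.to_string());
    println!("{:?}", tokens);
}
```

With the placeholder stemmer, only the cleaning, lower-casing, and splitting steps are visible in the output. Swapping in a real stemmer would additionally reduce each word to its stem.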
EXAMPLE:
use zoea::nlp::text_tokens;

let string_2 = String::from("I walked to San Diego slowly today!");
let tokens = text_tokens(&string_2);
println!("Sentence = {}", string_2);
// Print each stemmed token on its own line.
for token in tokens {
    println!("token = {}", token)
}