Function zoea::nlp::text_token_bigrams
pub fn text_token_bigrams(text: &str) -> Vec<String>
text_token_bigrams
This function takes a string slice and returns a vector of bigrams, each as a String. Here is what happens under the hood:
- “funky” non-alphanumeric characters are removed
- everything is converted to lower case
- the string slice is split into words (split on whitespace)
- the “stem” of each word is taken using rust_stemmers
- a window of two “stems” moves along the list producing bigrams
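The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the zoea source: the names `bigrams_sketch` and `toy_stem` are hypothetical, the stemmer is passed in as a closure so the block compiles without external crates, and joining the two stems with a single space is a guess at the output format. Treating "funky" characters as anything that is neither alphanumeric nor whitespace is likewise an assumption.

```rust
// Hypothetical sketch of the pipeline described above; not the zoea implementation.

/// Toy stand-in for a real stemmer (assumption): strips a few common
/// English suffixes. A real stemmer such as rust_stemmers behaves differently.
fn toy_stem(word: &str) -> String {
    for suffix in ["ing", "ed", "ly", "s"].iter() {
        if let Some(base) = word.strip_suffix(*suffix) {
            // Only strip when a reasonably long base remains.
            if base.len() >= 3 {
                return base.to_string();
            }
        }
    }
    word.to_string()
}

/// Produces bigrams of stemmed, lower-cased words.
/// `stem` is any function mapping a word to its stem.
fn bigrams_sketch<F>(text: &str, stem: F) -> Vec<String>
where
    F: Fn(&str) -> String,
{
    // Steps 1 and 2: drop characters that are neither alphanumeric nor
    // whitespace (assumed meaning of "funky"), then lower-case the rest.
    let cleaned: String = text
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .flat_map(|c| c.to_lowercase())
        .collect();

    // Steps 3 and 4: split on whitespace and stem each word.
    let stems: Vec<String> = cleaned.split_whitespace().map(|w| stem(w)).collect();

    // Step 5: slide a window of two stems along the list.
    // The space separator is an assumption about the output format.
    stems
        .windows(2)
        .map(|pair| format!("{} {}", pair[0], pair[1]))
        .collect()
}
```

Calling `bigrams_sketch("I walked to San Diego slowly today!", toy_stem)` yields six bigrams, starting with "i walk" and ending with "slow today". To use a real stemmer, the closure could wrap rust_stemmers, for example `let s = Stemmer::create(Algorithm::English);` followed by `|w| s.stem(w).into_owned()`; which algorithm zoea itself selects is not stated here.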
EXAMPLE:
use zoea::nlp::text_token_bigrams;

let string_2 = String::from("I walked to San Diego slowly today!");
let bigrams_2 = text_token_bigrams(&string_2);
println!("Sentence = {}", string_2);
for gram in bigrams_2 {
    println!("bigram= {}", gram)
}