Function zoea::nlp::text_token_bigrams

pub fn text_token_bigrams(text: &str) -> Vec<String>

text_token_bigrams

This function takes a string slice and returns a vector of bigrams, each as a String. Here is what happens under the hood:

  1. “funky” non-alphanumeric characters are removed
  2. everything is converted to lower case
  3. the string slice is split into words (split on whitespace)
  4. the “stem” of each word is taken using rust_stemmers
  5. a window of two “stems” moves along the list producing bigrams
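The steps above can be sketched with the standard library alone. This is a rough illustration, not zoea's actual implementation: `bigrams_sketch` and `stem_placeholder` are hypothetical names, the stemming step is stubbed out as an identity function because rust_stemmers is an external crate, and joining the two stems with a single space is an assumption about the output format. The sketch also assumes whitespace survives step 1, since step 3 needs it to split on.

```rust
/// Hypothetical stand-in for step 4. The real function is documented as
/// using rust_stemmers; this placeholder returns the word unchanged.
fn stem_placeholder(word: &str) -> String {
    word.to_string()
}

/// Sketch of the documented pipeline (assumed details are marked below).
fn bigrams_sketch(text: &str) -> Vec<String> {
    // Steps 1 and 2: drop non-alphanumeric characters (keeping whitespace
    // so the text can still be split), then lower-case what remains.
    let cleaned: String = text
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .flat_map(|c| c.to_lowercase())
        .collect();

    // Steps 3 and 4: split on whitespace and "stem" each word.
    let stems: Vec<String> = cleaned
        .split_whitespace()
        .map(stem_placeholder)
        .collect();

    // Step 5: slide a window of two stems along the list.
    // Assumption: the pair is joined with a single space.
    stems
        .windows(2)
        .map(|pair| format!("{} {}", pair[0], pair[1]))
        .collect()
}
```

With stemming stubbed out, `bigrams_sketch("I walked to San Diego slowly today!")` yields six bigrams, the first being "i walked". An input with fewer than two words yields an empty vector, because `windows(2)` produces no pairs.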

EXAMPLE:

use zoea::nlp::text_token_bigrams;

// Build the input sentence and compute its stemmed bigrams.
let string_2 = String::from("I walked to San Diego slowly today!");
let bigrams_2 = text_token_bigrams(&string_2);
println!("Sentence = {}", string_2);
// Print one bigram per line.
for gram in bigrams_2 {
    println!("bigram = {}", gram);
}