Function zoea::nlp::text_tokens

pub fn text_tokens(text: &str) -> Vec<String>

text_tokens

This function takes a string slice and produces a vector of stemmed words. Here is what happens under the hood:

  1. "funky" non-alphanumeric characters are removed
  2. everything is converted to lower case
  3. the string slice is split into words (split on whitespace)
  4. the "stem" of each word is taken using rust_stemmers
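The four steps above can be sketched with the standard library alone. This is an illustrative approximation, not zoea's actual implementation: the helper name `sketch_tokens` is hypothetical, "funky" is assumed to mean any character that is neither alphanumeric nor whitespace, and the stemmer is passed in as a closure so the sketch does not depend on the rust_stemmers API.

```rust
/// Hypothetical helper that mirrors the four documented steps.
/// `stem` stands in for the rust_stemmers call used by the real function.
fn sketch_tokens<F>(text: &str, stem: F) -> Vec<String>
where
    F: Fn(&str) -> String,
{
    // Step 1: drop characters that are neither alphanumeric nor whitespace
    // (an assumption about what "funky" means).
    let cleaned: String = text
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect();

    // Step 2: convert everything to lower case.
    let lowered = cleaned.to_lowercase();

    // Steps 3 and 4: split on whitespace, then stem each word.
    lowered.split_whitespace().map(|word| stem(word)).collect()
}
```

Calling `sketch_tokens("I walked to San Diego slowly today!", |w| w.to_string())` with an identity closure in place of a real stemmer yields `i`, `walked`, `to`, `san`, `diego`, `slowly`, `today`. A real stemmer would further reduce words such as `walked`.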

EXAMPLE:

use zoea::nlp::text_tokens;
let string_2 = String::from("I walked to San Diego slowly today!");
let tokens = text_tokens(&string_2);
println!("Sentence = {}", string_2);
for token in tokens {
    println!("token = {}", token);
}