Crate stopwords
This library provides stopwords datasets from popular text processing engines.
This can help reproduce the results of text analysis pipelines written in different languages and with different tools.
Usage
[dependencies]
stopwords = "0.1.0"
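extern crate stopwords;

use std::collections::HashSet;
use stopwords::{Spark, Language, Stopwords};

fn main() {
    // Load the English stopword list shipped with Apache Spark.
    // `unwrap()` panics if the provider has no list for the requested language.
    let stops: HashSet<_> = Spark::stopwords(Language::English).unwrap().iter().collect();

    // Drop every token that appears in the stopword set.
    let mut tokens = vec!["broccoli", "is", "good", "to", "eat"];
    tokens.retain(|s| !stops.contains(s));

    assert_eq!(tokens, vec!["broccoli", "good", "eat"]);
}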
Structs
LanguageError | Language parse error. |
NLTK | Data from NLTK - Python natural language toolkit. |
SkLearn | Data from scikit-learn - Python machine learning library. |
Spark | Data from Apache Spark - Scala engine for large-scale data processing. |
Enums
Language | Supported languages. Each provider supports only a subset of this list. |
Traits
Stopwords | Interface for getting stopwords from different providers. |
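Because each provider supports only a subset of the languages, calling `unwrap()` on the lookup can panic. The following is a minimal, self-contained sketch of handling that case gracefully. It does not use the crate itself: `Language`, `Stopwords`, `DemoProvider`, `ENGLISH`, and `filter_tokens` are hypothetical stand-ins defined locally, and the `Option<&'static [&'static str]>` return type is an assumption inferred from the usage example above, not a confirmed signature.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the crate's `Language` enum (hypothetical subset).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Language {
    English,
    French,
}

/// Stand-in for the crate's `Stopwords` trait. The return type is an
/// assumption based on the `unwrap()` call in the usage example.
trait Stopwords {
    fn stopwords(language: Language) -> Option<&'static [&'static str]>;
}

/// Tiny illustrative list; the real providers ship much larger ones.
const ENGLISH: &[&str] = &["is", "to", "the"];

/// Hypothetical provider that only ships an English list.
struct DemoProvider;

impl Stopwords for DemoProvider {
    fn stopwords(language: Language) -> Option<&'static [&'static str]> {
        match language {
            Language::English => Some(ENGLISH),
            Language::French => None, // not supported by this provider
        }
    }
}

/// Removes stopwords from `tokens`. If the provider has no list for
/// `language`, the tokens are returned unchanged instead of panicking.
fn filter_tokens<'a, P: Stopwords>(tokens: &[&'a str], language: Language) -> Vec<&'a str> {
    match P::stopwords(language) {
        Some(list) => {
            let stops: HashSet<&str> = list.iter().copied().collect();
            tokens
                .iter()
                .copied()
                .filter(|t| !stops.contains(*t))
                .collect()
        }
        None => tokens.to_vec(),
    }
}

fn main() {
    let tokens = ["broccoli", "is", "good", "to", "eat"];
    let english = filter_tokens::<DemoProvider>(&tokens, Language::English);
    let french = filter_tokens::<DemoProvider>(&tokens, Language::French);
    println!("{:?}", english);
    println!("{:?}", french);
}
```

With the real crate, the same pattern would presumably apply by swapping `DemoProvider` for `Spark`, `NLTK`, or `SkLearn`. Whether a missing list should leave the tokens untouched or be reported as an error depends on the pipeline being reproduced.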