Crate stopwords
This library provides stopword datasets from popular text-processing engines.
This can help reproduce the results of text analysis pipelines written in different languages and with different tools.
Usage
[dependencies]
stopwords = "0.1.0"
extern crate stopwords;

use std::collections::HashSet;
use stopwords::{Spark, Language, Stopwords};

fn main() {
    // Collect the provider's English stopwords into a set for fast lookup.
    let stops: HashSet<_> = Spark::stopwords(Language::English).unwrap().iter().collect();
    let mut tokens = vec!["broccoli", "is", "good", "to", "eat"];
    // Keep only the tokens that are not stopwords.
    tokens.retain(|s| !stops.contains(s));
    assert_eq!(tokens, vec!["broccoli", "good", "eat"]);
}
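The filtering step in the example above can be factored into a reusable function. The following is a minimal sketch that uses only the standard library: `remove_stopwords` is a hypothetical helper and not part of this crate, and the small hard-coded set stands in for a provider's list. It also lowercases each token before the lookup, on the assumption that the stopword list itself is stored in lowercase.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of the stopwords crate): returns the tokens
/// that are not in `stops`, comparing case-insensitively by lowercasing each
/// token before the lookup. Assumes the entries in `stops` are lowercase.
fn remove_stopwords<'a>(tokens: &[&'a str], stops: &HashSet<&str>) -> Vec<&'a str> {
    tokens
        .iter()
        .copied()
        .filter(|t| !stops.contains(t.to_lowercase().as_str()))
        .collect()
}

fn main() {
    // Stand-in list for illustration; a real pipeline would build this set
    // from a provider's data instead.
    let stops: HashSet<&str> = ["is", "to", "the"].iter().copied().collect();
    let tokens = ["Broccoli", "IS", "good", "to", "eat"];

    let kept = remove_stopwords(&tokens, &stops);
    assert_eq!(kept, vec!["Broccoli", "good", "eat"]);
}
```

Returning a new `Vec` rather than mutating in place keeps the original token sequence available, which can be convenient when comparing the output of several providers on the same input.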
Structs
LanguageError | Language parse error.
NLTK | Data from NLTK - Python natural language toolkit.
SkLearn | Data from scikit-learn - Python machine learning library.
Spark | Data from Apache Spark - Scala engine for large-scale data processing.
Enums
Language | Supported languages. Each provider supports only a subset of this list.
Traits
Stopwords | Interface for getting stopwords from different providers.
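Because each provider supports only a subset of the languages, a lookup can come back empty. The following self-contained sketch illustrates that pattern with stand-in types. `DemoLanguage`, `DemoStopwords`, `ProviderA`, `ProviderB`, and `supported_count` are all hypothetical names, and the `Option` return type is an assumption; the crate's actual definitions may differ, so check its item documentation before relying on this shape.

```rust
/// Hypothetical stand-in for a language enum (not the crate's `Language`).
#[derive(Clone, Copy, Debug, PartialEq)]
enum DemoLanguage {
    English,
    French,
}

/// Hypothetical provider interface: `None` means the provider has no list
/// for the requested language. The return type is an assumption.
trait DemoStopwords {
    fn stopwords(language: DemoLanguage) -> Option<&'static [&'static str]>;
}

// Tiny illustrative word lists, not real provider data.
const A_ENGLISH: &[&str] = &["a", "the", "is"];
const A_FRENCH: &[&str] = &["le", "la"];
const B_ENGLISH: &[&str] = &["a", "an", "the", "is"];

struct ProviderA;
struct ProviderB;

impl DemoStopwords for ProviderA {
    fn stopwords(language: DemoLanguage) -> Option<&'static [&'static str]> {
        match language {
            DemoLanguage::English => Some(A_ENGLISH),
            DemoLanguage::French => Some(A_FRENCH),
        }
    }
}

impl DemoStopwords for ProviderB {
    fn stopwords(language: DemoLanguage) -> Option<&'static [&'static str]> {
        match language {
            DemoLanguage::English => Some(B_ENGLISH),
            // This provider ships no French list.
            DemoLanguage::French => None,
        }
    }
}

/// Number of stopwords a provider offers for a language, or 0 if unsupported.
fn supported_count<P: DemoStopwords>(language: DemoLanguage) -> usize {
    P::stopwords(language).map_or(0, |words| words.len())
}

fn main() {
    assert_eq!(supported_count::<ProviderA>(DemoLanguage::French), 2);
    assert_eq!(supported_count::<ProviderB>(DemoLanguage::French), 0);
    // Handle the unsupported case explicitly instead of calling unwrap().
    match ProviderB::stopwords(DemoLanguage::French) {
        Some(words) => println!("{} stopwords", words.len()),
        None => println!("language not supported by this provider"),
    }
}
```

Writing code that is generic over the provider, as `supported_count` does here, makes it easy to swap datasets when trying to match the output of a pipeline built on a different tool.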