Crate stop_words
About
Stop words are words that carry little meaning on their own and are typically removed as a preprocessing step before text analysis or natural language processing. This crate contains common stop words for a variety of languages. All stop word lists are from this resource.
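To make the preprocessing step concrete, here is a minimal sketch of stop word removal using only the standard library. The function name `remove_stop_words` and the caller-supplied word set are hypothetical and for illustration only; they are not part of this crate's API, and the real lists would come from the crate rather than being written by hand.

```rust
use std::collections::HashSet;

/// Removes stop words from `text`, comparing case-insensitively.
/// `stop_words` is a hypothetical, caller-supplied set, not this crate's data.
fn remove_stop_words(text: &str, stop_words: &HashSet<&str>) -> Vec<String> {
    text.split_whitespace()
        // Strip surrounding punctuation and normalize case before the lookup.
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        // Drop tokens that became empty and tokens found in the stop word set.
        .filter(|w| !w.is_empty() && !stop_words.contains(w.as_str()))
        .collect()
}
```

Called with a set containing "the", "is", "on" and "a", the sentence "The cat is on a mat." would be reduced to the tokens "cat" and "mat".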
This crate currently includes the following languages:
- Arabic
- Bulgarian
- Catalan
- Czech
- Danish
- Dutch
- English
- Finnish
- French
- German
- Hebrew
- Hindi
- Hungarian
- Indonesian
- Italian
- Norwegian
- Polish
- Portuguese
- Romanian
- Russian
- Slovak
- Spanish
- Swedish
- Turkish
- Ukrainian
- Vietnamese
Constants
LANGUAGES | Constant containing an array of available language names, spelled out |
LANGUAGES_ISO_693_1 | Constant containing an array of available languages, given as ISO 639-1 codes |
LANGUAGES_ISO_693_2T | Constant containing an array of available languages, given as ISO 639-2/T codes |
Functions
get | The only function you'll ever need! Given a language code or name it returns common stop words as a |
vec_to_set | This function converts the standard |
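The descriptions above are cut off, so the exact input and output types are not shown here. Going only by the name `vec_to_set`, a plausible reading is a conversion from a word list to a set, which makes membership checks faster when filtering many tokens. The sketch below shows that idea with the standard library alone; `to_set` is a hypothetical stand-in, and the `Vec<String>` and `HashSet<String>` types are assumptions rather than the crate's confirmed signature.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in, not the crate's function: collects a word list
/// into a set so lookups take constant time on average. Duplicates collapse.
fn to_set(words: Vec<String>) -> HashSet<String> {
    words.into_iter().collect()
}
```

A set built this way answers `contains` without scanning the whole list, which matters when every token of a large corpus is checked against it.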