Module text


Python bindings for scirs2-text

This module provides Python bindings for text processing operations, including tokenization, vectorization, sentiment analysis, stemming, string similarity metrics, and text cleaning.
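To make the tokenization and vectorization steps concrete, here is a minimal pure-Python sketch of whitespace tokenization followed by bag-of-words counting. It does not call the bindings: the function names `whitespace_tokenize` and `count_vectorize`, the lowercasing step, and the sorted vocabulary order are assumptions made for illustration and may differ from what the classes in this module actually do.

```python
from collections import Counter


def whitespace_tokenize(text):
    """Split on runs of whitespace (hypothetical stand-in, not the binding's API)."""
    return text.split()


def count_vectorize(docs):
    """Return a sorted vocabulary and one row of token counts per document."""
    # Lowercasing is an assumption of this sketch, not a documented default.
    tokenized = [whitespace_tokenize(doc.lower()) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}

    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        rows.append(row)
    return vocab, rows


docs = ["the cat sat", "the cat saw the dog"]
vocab, rows = count_vectorize(docs)
print(vocab)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(rows)   # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each row has one column per vocabulary entry, so the second document's count of 2 for "the" appears in the last column.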

Structs§

PyCharacterTokenizer
Character tokenizer
PyCountVectorizer
Count vectorizer (bag-of-words)
PyLancasterStemmer
Lancaster stemmer
PyLexiconSentimentAnalyzer
Lexicon-based sentiment analyzer
PyNgramTokenizer
N-gram tokenizer
PyPorterStemmer
Porter stemmer
PyRegexTokenizer
Regex tokenizer
PySentenceTokenizer
Sentence tokenizer
PySnowballStemmer
Snowball stemmer
PyTfidfVectorizer
TF-IDF vectorizer
PyWhitespaceTokenizer
Whitespace tokenizer
PyWordTokenizer
Word tokenizer
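The N-gram tokenizer and TF-IDF vectorizer listed above correspond to well-known techniques, sketched here in plain Python. This is an illustration under stated assumptions, not the bindings' interface: `ngrams` and `tfidf` are hypothetical helper names, and the weighting `tf * (ln(N / df) + 1)` is one common variant chosen for the sketch. The smoothing and normalization used by `PyTfidfVectorizer` are not stated on this page.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(tokenized_docs):
    """Weight each token by raw term frequency times (ln(N / df) + 1).

    The formula is an assumption of this sketch; libraries differ in
    smoothing and normalization.
    """
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized_docs for tok in set(toks))

    weights = []
    for toks in tokenized_docs:
        tf = Counter(toks)
        weights.append({
            tok: count * (math.log(n_docs / df[tok]) + 1.0)
            for tok, count in tf.items()
        })
    return weights


# Character bigrams of a short string.
char_bigrams = ngrams(list("text"), 2)

# A token that appears in every document gets the minimum weight per occurrence.
scores = tfidf([["cat", "sat"], ["cat"]])
```

In this example "cat" occurs in both documents, so its inverse-document-frequency term reduces to 1, while "sat" occurs in only one document and is weighted more heavily.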

Functions§

register_module
Python module registration