Text processing module for SciRS2
This module provides text preprocessing, tokenization, vectorization, word embeddings, distance and similarity measures, vocabulary management, and other NLP-related operations.
Re-exports
pub use distance::cosine_similarity;
pub use distance::jaccard_similarity;
pub use distance::levenshtein_distance;
pub use embeddings::Word2Vec;
pub use embeddings::Word2VecAlgorithm;
pub use embeddings::Word2VecConfig;
pub use error::Result;
pub use error::TextError;
pub use preprocess::BasicNormalizer;
pub use preprocess::BasicTextCleaner;
pub use preprocess::TextCleaner;
pub use preprocess::TextNormalizer;
pub use tokenize::CharacterTokenizer;
pub use tokenize::SentenceTokenizer;
pub use tokenize::Tokenizer;
pub use tokenize::WordTokenizer;
pub use vectorize::CountVectorizer;
pub use vectorize::TfidfVectorizer;
pub use vocabulary::Vocabulary;
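The three measures re-exported from `distance` can be illustrated with a minimal, standard-library-only sketch. The function names below (`levenshtein_sketch`, `jaccard_sketch`, `cosine_sketch`) and their signatures are hypothetical and chosen for illustration; they are not the signatures of `levenshtein_distance`, `jaccard_similarity`, or `cosine_similarity` in this crate, which may take different argument types and return `Result`.

```rust
use std::collections::HashSet;

/// Levenshtein edit distance, counted in Unicode scalar values.
/// Hypothetical helper, not the crate's `levenshtein_distance`.
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        // curr[0] = i + 1: deleting the first i + 1 chars of `a` yields the empty string.
        let mut curr = vec![i + 1; b.len() + 1];
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Jaccard similarity over whitespace-separated token sets.
/// Assumption of this sketch: two empty inputs are treated as identical (1.0).
fn jaccard_sketch(a: &str, b: &str) -> f64 {
    let set_a: HashSet<&str> = a.split_whitespace().collect();
    let set_b: HashSet<&str> = b.split_whitespace().collect();
    if set_a.is_empty() && set_b.is_empty() {
        return 1.0;
    }
    let intersection = set_a.intersection(&set_b).count() as f64;
    let union = set_a.union(&set_b).count() as f64;
    intersection / union
}

/// Cosine similarity between two equal-length dense vectors.
/// Assumption of this sketch: a zero vector yields 0.0 instead of an error.
fn cosine_sketch(x: &[f64], y: &[f64]) -> f64 {
    let dot: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    let norm_x = x.iter().map(|v| v * v).sum::<f64>().sqrt();
    let norm_y = y.iter().map(|v| v * v).sum::<f64>().sqrt();
    if norm_x == 0.0 || norm_y == 0.0 {
        return 0.0;
    }
    dot / (norm_x * norm_y)
}
```

For example, `levenshtein_sketch("kitten", "sitting")` evaluates to 3, and `jaccard_sketch("the cat sat", "the cat ran")` evaluates to 0.5 because the two token sets share two of four distinct tokens.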
Modules
- distance: Text distance and similarity measures
- embeddings: Word embedding implementations
- error: Error types for the text processing module
- preprocess: Text preprocessing utilities
- tokenize: Text tokenization utilities
- utils: Utility functions for text processing
- vectorize: Text vectorization utilities
- vocabulary: Vocabulary management for text processing
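How the tokenize, vocabulary, and vectorize stages typically fit together can be sketched with the standard library alone. The code below is a conceptual illustration under stated assumptions, not this crate's implementation: the helper names (`tokenize_sketch`, `build_vocabulary_sketch`, `count_vector_sketch`) are hypothetical, and the actual `WordTokenizer`, `Vocabulary`, and `CountVectorizer` types may tokenize differently, order indices differently, and return array types rather than `Vec<usize>`.

```rust
use std::collections::BTreeMap;

/// Lowercase the text and split on any non-alphanumeric character.
/// Hypothetical helper; the crate's tokenizers may use other rules.
fn tokenize_sketch(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .map(str::to_string)
        .collect()
}

/// Map every distinct token in the corpus to a column index.
/// Assumption of this sketch: indices follow the sorted order of the tokens.
fn build_vocabulary_sketch(docs: &[&str]) -> BTreeMap<String, usize> {
    let mut vocab: BTreeMap<String, usize> = BTreeMap::new();
    for doc in docs {
        for token in tokenize_sketch(doc) {
            vocab.entry(token).or_insert(0);
        }
    }
    // BTreeMap iterates in key order, so enumerate() assigns sorted indices.
    for (index, slot) in vocab.values_mut().enumerate() {
        *slot = index;
    }
    vocab
}

/// Count how often each vocabulary token occurs in one document.
/// Tokens missing from the vocabulary are ignored.
fn count_vector_sketch(doc: &str, vocab: &BTreeMap<String, usize>) -> Vec<usize> {
    let mut counts = vec![0; vocab.len()];
    for token in tokenize_sketch(doc) {
        if let Some(&index) = vocab.get(&token) {
            counts[index] += 1;
        }
    }
    counts
}
```

With the corpus `["The cat sat.", "The dog sat, the dog ran!"]`, the sketch builds the vocabulary `cat, dog, ran, sat, the` (indices 0 to 4), and the second document maps to the count vector `[0, 2, 1, 1, 2]`. A TF-IDF vectorizer would start from the same counts and reweight each column by how rare its token is across documents.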