DefaultTokenizer

Type Alias DefaultTokenizer 

pub type DefaultTokenizer = DefaultTokenizer;

The default tokenizer, available via the default_tokenizer feature, should fit most use cases. It splits text on whitespace and punctuation, removes stop words, and stems the remaining words. With the language_detection feature enabled, it can also detect the language of the input. This crate uses DefaultTokenizer as the default concrete type wherever an API is generic over a Tokenizer.
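
The pipeline described above (split, drop stop words, stem) and the idea of a default concrete type for a generic parameter can be sketched with the standard library alone. This is an illustrative sketch, not this crate's implementation: the Tokenizer trait shape, SketchTokenizer, Index, the tiny stop-word list, and the naive suffix-stripping stemmer are all hypothetical stand-ins chosen for the example, and language detection is left out.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a tokenizer trait; not the crate's actual definition.
trait Tokenizer {
    fn tokenize(&self, input: &str) -> Vec<String>;
}

/// Toy tokenizer that mirrors the three steps described above.
struct SketchTokenizer {
    stop_words: HashSet<&'static str>,
}

impl SketchTokenizer {
    fn new() -> Self {
        // A deliberately tiny stop-word list, for illustration only.
        let stop_words = ["the", "a", "an", "and", "of", "is", "are"]
            .iter()
            .copied()
            .collect();
        Self { stop_words }
    }

    /// Naive suffix stripping; a real stemmer is far more careful.
    fn stem(word: &str) -> String {
        for suffix in ["ing", "ed", "s"] {
            if let Some(stripped) = word.strip_suffix(suffix) {
                // Keep at least three characters so short words survive intact.
                if stripped.len() >= 3 {
                    return stripped.to_string();
                }
            }
        }
        word.to_string()
    }
}

impl Tokenizer for SketchTokenizer {
    fn tokenize(&self, input: &str) -> Vec<String> {
        input
            // 1. Split on whitespace and (ASCII) punctuation.
            .split(|c: char| c.is_whitespace() || c.is_ascii_punctuation())
            .filter(|t| !t.is_empty())
            .map(|t| t.to_lowercase())
            // 2. Remove stop words.
            .filter(|t| !self.stop_words.contains(t.as_str()))
            // 3. Stem whatever remains.
            .map(|t| Self::stem(&t))
            .collect()
    }
}

/// Hypothetical type that is generic over a tokenizer and falls back to a
/// default concrete type when none is named.
struct Index<T: Tokenizer = SketchTokenizer> {
    tokenizer: T,
}

impl Index<SketchTokenizer> {
    fn new() -> Self {
        Index { tokenizer: SketchTokenizer::new() }
    }
}

impl<T: Tokenizer> Index<T> {
    fn terms(&self, text: &str) -> Vec<String> {
        self.tokenizer.tokenize(text)
    }
}

fn main() {
    // `Index` with no type argument resolves to `Index<SketchTokenizer>`.
    let index: Index = Index::new();
    println!("{:?}", index.terms("The cats are jumping, and the dog barked!"));
}
```

Writing the type as plain Index, with no type argument, selects the default tokenizer, which is the same convenience the paragraph above describes for DefaultTokenizer. A caller with different needs would supply another Tokenizer implementation as the type argument instead.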

Aliased Type

pub struct DefaultTokenizer { /* private fields */ }