Crate tantivy_tokenizer_api

Tokenizers are in charge of chopping text into a stream of tokens ready for indexing. This is a separate crate from tantivy, so implementors don’t need to update for each new tantivy version.

To add support for a custom tokenizer, implement the Tokenizer trait. Check out the tantivy repo for examples.
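As a rough illustration of what implementing the trait looks like, here is a self-contained whitespace tokenizer. The Token fields and the token_stream/advance/token method names mirror this crate’s API, but the trait definitions below are simplified local stand-ins (for instance, the real Tokenizer trait uses an associated TokenStream type and requires Clone + Send + Sync; here a Box<dyn TokenStream> is returned instead, in the spirit of BoxTokenStream):

```rust
// Simplified stand-ins for this crate's types, for illustration only.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Token {
    pub offset_from: usize, // byte offset of the token's start in the text
    pub offset_to: usize,   // byte offset one past the token's end
    pub position: usize,    // token position, starting at 0
    pub text: String,       // the token's text
}

pub trait TokenStream {
    /// Advance to the next token; returns false when the stream is exhausted.
    fn advance(&mut self) -> bool;
    /// The current token.
    fn token(&self) -> &Token;
}

pub trait Tokenizer {
    fn token_stream<'a>(&mut self, text: &'a str) -> Box<dyn TokenStream + 'a>;
}

#[derive(Clone, Default)]
pub struct WhitespaceTokenizer;

struct WhitespaceTokenStream<'a> {
    text: &'a str,
    offset: usize,   // byte offset of the unconsumed remainder
    position: usize, // position of the next token
    token: Token,
}

impl TokenStream for WhitespaceTokenStream<'_> {
    fn advance(&mut self) -> bool {
        let rest = &self.text[self.offset..];
        // Skip leading whitespace; stop if nothing is left.
        let start = match rest.find(|c: char| !c.is_whitespace()) {
            Some(i) => self.offset + i,
            None => return false,
        };
        // The token runs until the next whitespace (or end of text).
        let end = self.text[start..]
            .find(char::is_whitespace)
            .map(|i| start + i)
            .unwrap_or(self.text.len());
        self.token = Token {
            offset_from: start,
            offset_to: end,
            position: self.position,
            text: self.text[start..end].to_string(),
        };
        self.position += 1;
        self.offset = end;
        true
    }

    fn token(&self) -> &Token {
        &self.token
    }
}

impl Tokenizer for WhitespaceTokenizer {
    fn token_stream<'a>(&mut self, text: &'a str) -> Box<dyn TokenStream + 'a> {
        Box::new(WhitespaceTokenStream {
            text,
            offset: 0,
            position: 0,
            token: Token::default(),
        })
    }
}

fn main() {
    let mut tokenizer = WhitespaceTokenizer;
    let mut stream = tokenizer.token_stream("hello tantivy world");
    let mut texts = Vec::new();
    while stream.advance() {
        texts.push(stream.token().text.clone());
    }
    assert_eq!(texts, vec!["hello", "tantivy", "world"]);
    println!("{:?}", texts);
}
```

The advance/token pull model lets the indexer reuse one Token buffer per stream instead of allocating a vector of tokens up front.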

Structs§

BoxTokenStream
Simple wrapper of Box<dyn TokenStream + 'a>.
Token
A single token produced by tokenization, carrying its text, byte offsets, and position.

Traits§

TokenFilter
Trait for the pluggable components of Tokenizers.
TokenStream
TokenStream is the result of the tokenization.
Tokenizer
Tokenizers are in charge of splitting text into a stream of tokens before indexing.
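To sketch how a TokenFilter-style component fits in, here is a stream adapter that lowercases every token produced by an inner stream. The trait shapes are again simplified local stand-ins: the real TokenFilter trait in this crate wraps a whole Tokenizer rather than a single stream, and the real Token carries offsets and a position as well as its text.

```rust
// Simplified stand-ins for illustration only.
#[derive(Debug, Default, Clone)]
pub struct Token {
    pub text: String, // the real Token also carries byte offsets and a position
}

pub trait TokenStream {
    fn advance(&mut self) -> bool;
    fn token(&self) -> &Token;
    fn token_mut(&mut self) -> &mut Token;
}

/// A trivial source stream over pre-split words, for demonstration.
pub struct VecTokenStream {
    tokens: Vec<Token>,
    index: usize,
}

impl TokenStream for VecTokenStream {
    fn advance(&mut self) -> bool {
        if self.index < self.tokens.len() {
            self.index += 1;
            true
        } else {
            false
        }
    }
    fn token(&self) -> &Token {
        &self.tokens[self.index - 1]
    }
    fn token_mut(&mut self) -> &mut Token {
        &mut self.tokens[self.index - 1]
    }
}

/// The filter: forwards to the inner stream, rewriting each token in place.
pub struct LowerCaserStream<T: TokenStream> {
    inner: T,
}

impl<T: TokenStream> TokenStream for LowerCaserStream<T> {
    fn advance(&mut self) -> bool {
        if !self.inner.advance() {
            return false;
        }
        let lowered = self.inner.token().text.to_lowercase();
        self.inner.token_mut().text = lowered;
        true
    }
    fn token(&self) -> &Token {
        self.inner.token()
    }
    fn token_mut(&mut self) -> &mut Token {
        self.inner.token_mut()
    }
}

fn main() {
    let source = VecTokenStream {
        tokens: vec![
            Token { text: "Hello".into() },
            Token { text: "TANTIVY".into() },
        ],
        index: 0,
    };
    let mut filtered = LowerCaserStream { inner: source };
    let mut out = Vec::new();
    while filtered.advance() {
        out.push(filtered.token().text.clone());
    }
    assert_eq!(out, vec!["hello", "tantivy"]);
    println!("{:?}", out);
}
```

Because filters mutate the token in place via token_mut, several of them can be chained (lowercasing, stemming, stop-word removal) without extra allocation per stage.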