Tokenizers are in charge of chopping text into a stream of tokens ready for indexing. This is a separate crate from tantivy, so implementors don't need to update for each new tantivy version.
To add support for a tokenizer, implement the Tokenizer trait.
Check out the tantivy repo for some examples.
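Below is a minimal sketch of a whitespace tokenizer. It assumes the trait signatures used by recent tantivy versions (a GAT-based TokenStream<'a> associated type on Tokenizer, and advance/token/token_mut on TokenStream); the WhitespaceTokenizer and WhitespaceTokenStream names are hypothetical and not part of this crate.

```rust
use tantivy_tokenizer_api::{Token, TokenStream, Tokenizer};

// Hypothetical tokenizer that splits text on whitespace.
#[derive(Clone, Default)]
pub struct WhitespaceTokenizer {
    // Reused across token_stream calls to avoid reallocating the text buffer.
    token: Token,
}

pub struct WhitespaceTokenStream<'a> {
    text: &'a str,
    chars: std::str::CharIndices<'a>,
    token: &'a mut Token,
}

impl Tokenizer for WhitespaceTokenizer {
    type TokenStream<'a> = WhitespaceTokenStream<'a>;

    fn token_stream<'a>(&'a mut self, text: &'a str) -> Self::TokenStream<'a> {
        // Reset the reusable token; position starts at usize::MAX so the
        // first wrapping_add(1) in advance() yields position 0.
        self.token = Token::default();
        self.token.position = usize::MAX;
        WhitespaceTokenStream {
            text,
            chars: text.char_indices(),
            token: &mut self.token,
        }
    }
}

impl TokenStream for WhitespaceTokenStream<'_> {
    fn advance(&mut self) -> bool {
        self.token.text.clear();
        self.token.position = self.token.position.wrapping_add(1);
        // Skip whitespace, then consume characters until the next whitespace.
        while let Some((offset_from, c)) = self.chars.next() {
            if c.is_whitespace() {
                continue;
            }
            let offset_to = loop {
                match self.chars.next() {
                    Some((offset, c)) if c.is_whitespace() => break offset,
                    Some(_) => {}
                    None => break self.text.len(),
                }
            };
            self.token.offset_from = offset_from;
            self.token.offset_to = offset_to;
            self.token.text.push_str(&self.text[offset_from..offset_to]);
            return true;
        }
        false
    }

    fn token(&self) -> &Token {
        self.token
    }

    fn token_mut(&mut self) -> &mut Token {
        self.token
    }
}
```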
Structs§
- BoxTokenFilter - Simple wrapper of Box<dyn TokenFilter + 'a>.
- BoxTokenStream - Simple wrapper of Box<dyn TokenStream + 'a>.
- Token - Token
Traits§
- TokenFilter - Trait for the pluggable components of Tokenizers.
- TokenFilterClone
- TokenStream - TokenStream is the result of the tokenization (see the usage sketch after this list).
- Tokenizer - Tokenizers are in charge of splitting text into a stream of tokens before indexing.
- TokenizerClone
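A token stream is consumed by calling advance in a loop and reading the current token. This short usage sketch assumes the hypothetical WhitespaceTokenizer from the example above:

```rust
fn main() {
    let mut tokenizer = WhitespaceTokenizer::default();
    let mut stream = tokenizer.token_stream("hello tantivy tokenizers");
    while stream.advance() {
        let token = stream.token();
        println!("{:?} @ {}..{}", token.text, token.offset_from, token.offset_to);
    }
}
```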