Trait tantivy::tokenizer::Tokenizer
Tokenizers are in charge of splitting text into a stream of tokens before indexing.
See the module documentation for more detail.
Warning
This API may change to use associated types.
Associated Types
type TokenStreamImpl: TokenStream
Type associated to the resulting token stream.
Required methods
fn token_stream(&self, text: &'a str) -> Self::TokenStreamImpl
Creates a token stream for a given str.
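To make the associated-type shape concrete, here is a minimal, self-contained sketch of a type implementing this kind of trait. The `TokenStream` trait, `WhitespaceTokenizer`, and `WhitespaceTokenStream` below are simplified hypothetical stand-ins, not the real tantivy definitions:

```rust
// Simplified stand-in for tantivy's TokenStream trait:
// yields tokens one at a time.
trait TokenStream {
    fn next(&mut self) -> Option<&str>;
}

// Simplified stand-in for the Tokenizer trait described above.
trait Tokenizer<'a> {
    // Type associated to the resulting token stream.
    type TokenStreamImpl: TokenStream;
    // Creates a token stream for a given str.
    fn token_stream(&self, text: &'a str) -> Self::TokenStreamImpl;
}

// The simplest possible implementation: split on whitespace.
struct WhitespaceTokenizer;

// The stream borrows the input text for the lifetime 'a.
struct WhitespaceTokenStream<'a> {
    words: std::str::SplitWhitespace<'a>,
}

impl<'a> TokenStream for WhitespaceTokenStream<'a> {
    fn next(&mut self) -> Option<&str> {
        self.words.next()
    }
}

impl<'a> Tokenizer<'a> for WhitespaceTokenizer {
    type TokenStreamImpl = WhitespaceTokenStream<'a>;

    fn token_stream(&self, text: &'a str) -> Self::TokenStreamImpl {
        WhitespaceTokenStream {
            words: text.split_whitespace(),
        }
    }
}

// Helper that drains a stream into owned tokens.
fn collect_tokens(text: &str) -> Vec<String> {
    let mut stream = WhitespaceTokenizer.token_stream(text);
    let mut out = Vec::new();
    while let Some(token) = stream.next() {
        out.push(token.to_string());
    }
    out
}

fn main() {
    println!("{:?}", collect_tokens("hello happy tax payer"));
}
```

The associated type lets each tokenizer return its own concrete stream type without boxing, which is why the trait exposes `TokenStreamImpl` rather than a trait object.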
Provided methods
fn filter<NewFilter>(
self,
new_filter: NewFilter
) -> ChainTokenizer<NewFilter, Self> where
NewFilter: TokenFilter<Self::TokenStreamImpl>,
Appends a token filter to the current tokenizer. The method consumes the current tokenizer and returns a new one.
Example
```rust
use tantivy::tokenizer::*;

let en_stem = SimpleTokenizer
    .filter(RemoveLongFilter::limit(40))
    .filter(LowerCaser)
    .filter(Stemmer::default());
```
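The chaining pattern behind this provided method can be sketched without tantivy at all. Everything below (the traits, `ChainTokenizer`, `FilteredStream`, `SimpleTokenizer`, `LowerCaser`) is a simplified hypothetical stand-in, not the real library's definitions; it only illustrates how a provided `filter` method can consume a tokenizer and wrap its stream:

```rust
// Simplified stand-in traits, not the real tantivy definitions.
trait TokenStream {
    fn next(&mut self) -> Option<String>;
}

trait TokenFilter: Clone {
    // Transform a token; return None to drop it from the stream.
    fn transform(&self, token: String) -> Option<String>;
}

trait Tokenizer: Sized {
    type Stream: TokenStream;
    fn token_stream(&self, text: &str) -> Self::Stream;

    // Provided method: consumes the tokenizer and wraps it with a filter.
    fn filter<F: TokenFilter>(self, new_filter: F) -> ChainTokenizer<F, Self> {
        ChainTokenizer { filter: new_filter, tokenizer: self }
    }
}

struct ChainTokenizer<F, T> {
    filter: F,
    tokenizer: T,
}

struct FilteredStream<S, F> {
    inner: S,
    filter: F,
}

impl<S: TokenStream, F: TokenFilter> TokenStream for FilteredStream<S, F> {
    fn next(&mut self) -> Option<String> {
        // Pull tokens from the inner stream until the filter keeps one.
        while let Some(token) = self.inner.next() {
            if let Some(kept) = self.filter.transform(token) {
                return Some(kept);
            }
        }
        None
    }
}

impl<F: TokenFilter, T: Tokenizer> Tokenizer for ChainTokenizer<F, T> {
    type Stream = FilteredStream<T::Stream, F>;

    fn token_stream(&self, text: &str) -> Self::Stream {
        FilteredStream {
            inner: self.tokenizer.token_stream(text),
            filter: self.filter.clone(),
        }
    }
}

// Concrete pieces: whitespace splitting plus a lowercasing filter.
struct SimpleTokenizer;

struct SimpleStream(std::vec::IntoIter<String>);

impl TokenStream for SimpleStream {
    fn next(&mut self) -> Option<String> {
        self.0.next()
    }
}

impl Tokenizer for SimpleTokenizer {
    type Stream = SimpleStream;

    fn token_stream(&self, text: &str) -> SimpleStream {
        let words: Vec<String> = text.split_whitespace().map(str::to_string).collect();
        SimpleStream(words.into_iter())
    }
}

#[derive(Clone)]
struct LowerCaser;

impl TokenFilter for LowerCaser {
    fn transform(&self, token: String) -> Option<String> {
        Some(token.to_lowercase())
    }
}

// Tokenize text through the chained tokenizer.
fn run(text: &str) -> Vec<String> {
    let tokenizer = SimpleTokenizer.filter(LowerCaser);
    let mut stream = tokenizer.token_stream(text);
    let mut out = Vec::new();
    while let Some(token) = stream.next() {
        out.push(token);
    }
    out
}

fn main() {
    println!("{:?}", run("Hello WORLD"));
}
```

Because `filter` takes `self` by value, each call moves the previous tokenizer into the wrapper, which is why chains like the example above build up a nested type rather than mutating in place.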