Trait tantivy::tokenizer::Tokenizer

pub trait Tokenizer<'a>: Sized + Clone {
    type TokenStreamImpl: TokenStream;

    fn token_stream(&self, text: &'a str) -> Self::TokenStreamImpl;

    fn filter<NewFilter>(self, new_filter: NewFilter) -> ChainTokenizer<NewFilter, Self>
    where
        NewFilter: TokenFilter<Self::TokenStreamImpl>,
    { ... }
}

Tokenizers are in charge of splitting text into a stream of tokens before indexing.

See the module documentation for more detail.

Warning

This API may change to use associated types.

Associated Types

type TokenStreamImpl: TokenStream

Type of the token stream produced by this tokenizer.


Required methods

fn token_stream(&self, text: &'a str) -> Self::TokenStreamImpl

Creates a token stream for a given str.


Provided methods

fn filter<NewFilter>(
    self,
    new_filter: NewFilter
) -> ChainTokenizer<NewFilter, Self> where
    NewFilter: TokenFilter<Self::TokenStreamImpl>, 

Appends a token filter to the current tokenizer.

The method consumes the current tokenizer and returns a new one whose token streams pass through the given filter.

Example


use tantivy::tokenizer::*;

let en_stem = SimpleTokenizer
    .filter(RemoveLongFilter::limit(40))
    .filter(LowerCaser)
    .filter(Stemmer::default());

Implementors

impl<'a> Tokenizer<'a> for FacetTokenizer

type TokenStreamImpl = FacetTokenStream<'a>

impl<'a> Tokenizer<'a> for NgramTokenizer

type TokenStreamImpl = NgramTokenStream<'a>

impl<'a> Tokenizer<'a> for RawTokenizer

type TokenStreamImpl = RawTokenStream

impl<'a> Tokenizer<'a> for SimpleTokenizer

type TokenStreamImpl = SimpleTokenStream<'a>
