pub struct EdgeNgramTokenFilter { /* private fields */ }
Available on crate feature commons only.

Token filter that produces n-grams from the start of the token (edge n-grams). For example, Quick will generate Q, Qu, Qui, Quic, and so on.

It is configured with two parameters:

  • min edge-ngram: the minimum number of characters (e.g. with min=3, Quick will generate Qui, Quic and Quick). It must be greater than 0.
  • max edge-ngram: the maximum number of characters (e.g. with max=3, Quick will generate Q, Qu and Qui). It is optional; when no maximum is set, n-grams are generated up to the end of the token.
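
The effect of the two bounds can be sketched with the standard library alone. The helper below, `edge_ngrams`, is a hypothetical name and not part of the crate; it assumes lengths are counted in characters, as the description above suggests, and it does not go through tantivy's token stream.

```rust
use std::num::NonZeroUsize;

/// Hypothetical helper, not part of the crate: collects the edge n-grams of
/// `token`, counting length in characters rather than bytes.
fn edge_ngrams(token: &str, min: NonZeroUsize, max: Option<NonZeroUsize>) -> Vec<String> {
    let char_count = token.chars().count();
    // With no maximum, prefixes run up to the end of the token.
    let upper = max.map_or(char_count, |m| m.get().min(char_count));
    (min.get()..=upper)
        .map(|len| token.chars().take(len).collect::<String>())
        .collect()
}
```

Under these assumptions, calling it on Quick with min=3 and no maximum yields Qui, Quic and Quick, and with min=1 and max=3 it yields Q, Qu and Qui, matching the two bullets above.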

§Example

use std::num::NonZeroUsize;
use tantivy::tokenizer::{WhitespaceTokenizer, TextAnalyzer, Token};
use tantivy_analysis_contrib::commons::EdgeNgramTokenFilter;

let mut tmp = TextAnalyzer::builder(WhitespaceTokenizer::default())
   .filter(EdgeNgramTokenFilter::new(NonZeroUsize::new(2).unwrap(), NonZeroUsize::new(4), false)?)
   .build();
let mut token_stream = tmp.token_stream("Quick");

let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "Qu".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "Qui".to_string());
let token = token_stream.next().expect("A token should be present.");
assert_eq!(token.text, "Quic".to_string());

assert_eq!(None, token_stream.next());

This token filter is useful for “starts with” matching, and therefore for “search as you type”.

It is also easy to have an efficient “ends with” by adding the ReverseTokenFilter before the edge ngram filter.
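
To see why reversing first gives an “ends with” match, here is a minimal standard-library sketch. The names `reversed_edge_ngrams` and `matches_suffix` are hypothetical, and the sketch assumes the query suffix is reversed the same way before lookup; it models the idea only and does not use ReverseTokenFilter itself.

```rust
/// Hypothetical helper: edge n-grams of the reversed token, from `min`
/// characters (assumed to be at least 1) up to the full length.
fn reversed_edge_ngrams(token: &str, min: usize) -> Vec<String> {
    let reversed: Vec<char> = token.chars().rev().collect();
    (min..=reversed.len())
        .map(|len| reversed[..len].iter().collect::<String>())
        .collect()
}

/// Reverses the query suffix the same way before looking it up.
fn matches_suffix(indexed: &[String], suffix: &str) -> bool {
    let reversed: String = suffix.chars().rev().collect();
    indexed.iter().any(|term| *term == reversed)
}
```

With min=2, Quick is indexed as kc, kci, kciu and kciuQ, so the suffix ick (reversed: kci) is found with a single exact term lookup, while a prefix such as Qui is not.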

§How to use it

To use it, you should have a separate pipeline at search time that does not include the edge-ngram filter. Otherwise, you’ll get irrelevant results. Please see the example in the source repository for a way to do it.
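
The reason for the separate search-time pipeline can be illustrated with a small model that uses only the standard library. All function names here are hypothetical, and the model assumes a document matches when any query term is among its indexed terms; it is a sketch of the reasoning, not the repository's example.

```rust
use std::collections::HashSet;

/// Hypothetical helper: edge n-grams from `min` characters to the full token.
fn edge_ngrams(token: &str, min: usize) -> Vec<String> {
    let chars: Vec<char> = token.chars().collect();
    (min..=chars.len())
        .map(|len| chars[..len].iter().collect::<String>())
        .collect()
}

/// Index-time pipeline: whitespace split, then edge n-grams.
fn index_terms(text: &str, min: usize) -> HashSet<String> {
    text.split_whitespace()
        .flat_map(|token| edge_ngrams(token, min))
        .collect()
}

/// Search-time pipeline: whitespace split only, no edge n-grams.
fn query_terms(query: &str) -> Vec<String> {
    query.split_whitespace().map(str::to_string).collect()
}

/// A document matches when any query term is among its indexed terms.
fn matches(indexed: &HashSet<String>, terms: &[String]) -> bool {
    terms.iter().any(|term| indexed.contains(term))
}
```

In this model, the query Quic analyzed without the filter matches a document containing Quick but not one containing Quiet. If the query were also expanded into Qu, Qui and Quic, the term Qu would match Quiet as well, which is the kind of irrelevant result described above.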

Implementations§

impl EdgeNgramTokenFilter

pub fn new( min: NonZeroUsize, max: Option<NonZeroUsize>, keep_original_token: bool ) -> Result<Self, EdgeNgramError>

Create a new EdgeNgramTokenFilter with the min and max ngram provided.

§Parameters
  • min : minimum edge-ngram.
  • max : maximum edge-ngram. It must be greater than or equal to min. Provide None for unlimited.
  • keep_original_token: if true, the complete token is also output when its length is greater than max.
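
How these three parameters interact can be sketched with a stand-in type. `EdgeNgramModel` below is hypothetical and uses a `String` as a placeholder error instead of EdgeNgramError. It assumes that a max smaller than min is rejected at construction and that the original token is emitted after the n-grams; the real filter's output order is not stated here.

```rust
use std::num::NonZeroUsize;

/// Hypothetical stand-in for the filter's configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EdgeNgramModel {
    min: NonZeroUsize,
    max: Option<NonZeroUsize>,
    keep_original_token: bool,
}

impl EdgeNgramModel {
    /// Rejects a `max` smaller than `min`; the `String` error is a placeholder.
    fn new(
        min: NonZeroUsize,
        max: Option<NonZeroUsize>,
        keep_original_token: bool,
    ) -> Result<Self, String> {
        match max {
            Some(max) if max < min => Err(format!("max ({max}) must be >= min ({min})")),
            _ => Ok(Self { min, max, keep_original_token }),
        }
    }

    /// Emits the edge n-grams, then the whole token if it is longer than `max`
    /// and `keep_original_token` is set (the position is an assumption).
    fn apply(&self, token: &str) -> Vec<String> {
        let chars: Vec<char> = token.chars().collect();
        let upper = self.max.map_or(chars.len(), |m| m.get().min(chars.len()));
        let mut out: Vec<String> = (self.min.get()..=upper)
            .map(|len| chars[..len].iter().collect::<String>())
            .collect();
        let longer_than_max = self.max.map_or(false, |m| chars.len() > m.get());
        if self.keep_original_token && longer_than_max {
            out.push(token.to_string());
        }
        out
    }
}
```

With min=2, max=4 and keep_original_token set, this model turns Quick into Qu, Qui, Quic and Quick; with the flag unset, the last entry is dropped.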

Trait Implementations§

impl Clone for EdgeNgramTokenFilter

fn clone(&self) -> EdgeNgramTokenFilter

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for EdgeNgramTokenFilter

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl From<NonZero<usize>> for EdgeNgramTokenFilter

fn from(ngram: NonZeroUsize) -> Self

Converts to this type from the input type.

impl Hash for EdgeNgramTokenFilter

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher.

impl Ord for EdgeNgramTokenFilter

fn cmp(&self, other: &EdgeNgramTokenFilter) -> Ordering

This method returns an Ordering between self and other.

fn max(self, other: Self) -> Self
where Self: Sized,

Compares and returns the maximum of two values.

fn min(self, other: Self) -> Self
where Self: Sized,

Compares and returns the minimum of two values.

fn clamp(self, min: Self, max: Self) -> Self
where Self: Sized + PartialOrd,

Restrict a value to a certain interval.

impl PartialEq for EdgeNgramTokenFilter

fn eq(&self, other: &EdgeNgramTokenFilter) -> bool

This method tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl PartialOrd for EdgeNgramTokenFilter

fn partial_cmp(&self, other: &EdgeNgramTokenFilter) -> Option<Ordering>

This method returns an ordering between self and other values if one exists.

fn lt(&self, other: &Rhs) -> bool

This method tests less than (for self and other) and is used by the < operator.

fn le(&self, other: &Rhs) -> bool

This method tests less than or equal to (for self and other) and is used by the <= operator.

fn gt(&self, other: &Rhs) -> bool

This method tests greater than (for self and other) and is used by the > operator.

fn ge(&self, other: &Rhs) -> bool

This method tests greater than or equal to (for self and other) and is used by the >= operator.

impl TokenFilter for EdgeNgramTokenFilter

type Tokenizer<T: Tokenizer> = EdgeNgramFilterWrapper<T>

The Tokenizer type returned by this filter, typically parametrized by the underlying Tokenizer.

fn transform<T: Tokenizer>(self, tokenizer: T) -> Self::Tokenizer<T>

Wraps a Tokenizer and returns a new one.

impl Copy for EdgeNgramTokenFilter

impl Eq for EdgeNgramTokenFilter

impl StructuralPartialEq for EdgeNgramTokenFilter

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.