Struct nucleo_matcher::Matcher

pub struct Matcher {
    pub config: Config,
    /* private fields */
}

A matcher engine that can execute (fuzzy) matches.

A matcher contains heap-allocated scratch memory that is reused during matching. This scratch memory allows the matcher to guarantee that it will never allocate during matching (with the exception of pushing to the indices vector if there isn’t enough capacity). However, this scratch memory is fairly large (around 135KB), so creating a matcher is expensive.

The .._match functions do not compute the indices of the matched characters. They should be used to filter and rank all matches. The .._indices functions also compute the indices of the matched characters but are slower than the .._match variants. They should be used when rendering the best N matches. Note that the indices argument is never cleared. This allows running multiple different matches on the same haystack and merging the indices by sorting and deduplicating the vector.
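As a minimal sketch of the merging step, the hypothetical helper below (merge_indices is not part of nucleo_matcher) sorts and deduplicates an indices vector that several .._indices calls have appended to. The sample indices in the usage note are made up for illustration.

```rust
/// Hypothetical helper: merges the indices collected from several matches
/// on the same haystack into one sorted list without duplicates.
fn merge_indices(indices: &mut Vec<u32>) {
    // Sorting first places equal indices next to each other ...
    indices.sort_unstable();
    // ... so that `dedup`, which only removes consecutive duplicates,
    // removes every repeated index.
    indices.dedup();
}
```

For example, if one match pushed 0, 1, 2 and a second match on the same haystack pushed 9, 10, 1, 2, calling merge_indices leaves 0, 1, 2, 9, 10 in the vector.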

The needle argument for each function must always be normalized by the caller (unicode normalization and case folding). Otherwise, the matcher may fail to produce a match. The pattern module provides utilities to preprocess needles and should usually be preferred over invoking the matcher directly. Additionally, it’s recommended to perform separate matches for each word in the needle. Consider the following example:

If "foo bar" is used as the needle, it matches both "foo test baaar" and "foo hello-world bar". However, "foo test baaar" will receive a higher score than "foo hello-world bar". "baaar" contains a 2 character gap which will receive a penalty, so the user will likely expect it to rank lower. However, if "foo bar" is matched as a single query, "hello-world" and "test" are both considered gaps too. As "hello-world" is a much longer gap than "test", the extra penalty for "baaar" is canceled out. If both words are matched individually, the interspersed words do not receive a penalty and "foo hello-world bar" ranks higher.
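The ranking effect described above can be reproduced with a deliberately simplified scorer. The sketch below is not nucleo's scoring algorithm. toy_score and score_per_word are hypothetical helpers that greedily match the needle as a subsequence, award 10 points per matched character, and subtract 1 point per skipped haystack character between two matched characters.

```rust
/// Toy scorer (an assumption for illustration, NOT nucleo's algorithm).
/// Returns `None` if `needle` is not a subsequence of `haystack`.
fn toy_score(haystack: &str, needle: &str) -> Option<i32> {
    let mut needle_chars = needle.chars();
    let mut next = needle_chars.next();
    let mut last_match: Option<usize> = None;
    let mut matched = 0i32;
    let mut gaps = 0i32;
    for (i, c) in haystack.chars().enumerate() {
        match next {
            Some(n) if n == c => {
                // Count the characters skipped since the previous match.
                if let Some(prev) = last_match {
                    gaps += (i - prev - 1) as i32;
                }
                last_match = Some(i);
                matched += 1;
                next = needle_chars.next();
            }
            Some(_) => {}
            None => break, // whole needle matched
        }
    }
    if next.is_some() {
        return None; // some needle characters were never found
    }
    Some(10 * matched - gaps)
}

/// Scores every whitespace-separated word of `needle` on its own and sums
/// the results. Returns `None` if any word fails to match.
fn score_per_word(haystack: &str, needle: &str) -> Option<i32> {
    needle
        .split_whitespace()
        .map(|word| toy_score(haystack, word))
        .sum()
}
```

Under this toy model, matching "foo bar" as a single needle gives "foo test baaar" 63 points and "foo hello-world bar" 58 points. Scoring per word gives 58 and 60 points respectively, so "foo hello-world bar" ranks higher once the interspersed words no longer count as gaps.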

In general, nucleo is a substring matching tool (except for the prefix/postfix matching modes) with no penalty assigned to matches that start later within the same pattern (which enables matching words individually as shown above). If patterns show a large variety in length and the syntax described above is not used, it may be preferable to give preference to matches closer to the start of a haystack. To accommodate that use case, the prefer_prefix option can be set to true.

Matching is limited to 2^32-1 codepoints. If the haystack is longer than that, the matcher will panic. The caller must decide whether to filter out long haystacks or truncate them.
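If truncation is the chosen strategy, it has to count codepoints rather than bytes. The following is a minimal sketch using only the standard library. truncate_codepoints is a hypothetical helper, not part of nucleo_matcher.

```rust
/// Hypothetical helper: returns the longest prefix of `haystack` that
/// contains at most `max_chars` codepoints (not bytes).
fn truncate_codepoints(haystack: &str, max_chars: usize) -> &str {
    // `char_indices` yields the byte offset of every codepoint, so the
    // offset of codepoint number `max_chars` is a valid place to cut.
    match haystack.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &haystack[..byte_idx],
        // Fewer than `max_chars + 1` codepoints: nothing to cut.
        None => haystack,
    }
}
```

A caller could pass u32::MAX as usize as the limit before handing the haystack to the matcher. Filtering instead of truncating would compare haystack.chars().count() against the same limit.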

Fields

config: Config

Implementations

impl Matcher

pub fn new(config: Config) -> Self

Creates a new matcher instance. Note that this will eagerly allocate a fairly large chunk of heap memory (around 135KB currently, but subject to change), so matchers should be reused if called often (like in a loop).

pub fn fuzzy_match( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_> ) -> Option<u16>

Find the fuzzy match with the highest score in the haystack.

This function has O(mn) time complexity for short inputs. To avoid slowdowns, it automatically falls back to greedy matching for large needles and haystacks.

See the matcher documentation for more details.

pub fn fuzzy_indices( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_>, indices: &mut Vec<u32> ) -> Option<u16>

Find the fuzzy match with the highest score in the haystack and compute its indices.

This function has O(mn) time complexity for short inputs. To avoid slowdowns, it automatically falls back to greedy matching (fuzzy_match_greedy) for large needles and haystacks.

See the matcher documentation for more details.

pub fn fuzzy_match_greedy( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_> ) -> Option<u16>

Greedily find a fuzzy match in the haystack.

This function has O(n) time complexity but may provide unintuitive (non-optimal) indices and scores. Usually fuzzy_match should be preferred.

See the matcher documentation for more details.

pub fn fuzzy_indices_greedy( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_>, indices: &mut Vec<u32> ) -> Option<u16>

Greedily find a fuzzy match in the haystack and compute its indices.

This function has O(n) time complexity but may provide unintuitive (non-optimal) indices and scores. Usually fuzzy_indices should be preferred.

See the matcher documentation for more details.
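To illustrate why a greedy strategy can return non-optimal indices, here is a simplified single-pass sketch. It is an assumption for illustration and not the crate's implementation. greedy_indices is a hypothetical function that takes the first haystack occurrence of each needle character in order.

```rust
/// Toy greedy subsequence matcher (NOT nucleo's implementation).
/// Appends the matched haystack positions to `indices` and returns whether
/// the whole needle was found. Runs in O(n) for a haystack of n codepoints.
fn greedy_indices(haystack: &str, needle: &str, indices: &mut Vec<u32>) -> bool {
    let start_len = indices.len();
    let mut needle_chars = needle.chars();
    let mut next = needle_chars.next();
    for (i, c) in haystack.chars().enumerate() {
        match next {
            Some(n) if n == c => {
                // Take the first possible position and never reconsider it.
                indices.push(i as u32);
                next = needle_chars.next();
            }
            Some(_) => {}
            None => break, // whole needle matched
        }
    }
    if next.is_some() {
        // No match: drop the partial indices pushed by this call, but keep
        // whatever the caller had already collected.
        indices.truncate(start_len);
        return false;
    }
    true
}
```

For the haystack "a_b_ab" and the needle "ab", this sketch reports the indices 0 and 2, although the contiguous occurrence at 4 and 5 would be the more intuitive match. An optimal matcher would consider both alignments and pick the better one.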

pub fn substring_match( &mut self, haystack: Utf32Str<'_>, needle_: Utf32Str<'_> ) -> Option<u16>

Finds the substring match with the highest score in the haystack.

This function has O(nm) time complexity. However, many cases can be significantly accelerated using prefilters, so it’s usually very fast in practice.

See the matcher documentation for more details.
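The O(nm) bound corresponds to comparing the needle against every window of the haystack. A naive sketch of that search, without scoring or prefilters, might look as follows. naive_substring is a hypothetical helper that returns the first occurrence rather than the highest-scoring one.

```rust
use std::ops::Range;

/// Naive substring search over codepoints (illustration only).
/// Returns the haystack range of the first occurrence of `needle`.
fn naive_substring(haystack: &[char], needle: &[char]) -> Option<Range<usize>> {
    if needle.is_empty() {
        // `windows(0)` would panic, so treat the empty needle separately.
        return Some(0..0);
    }
    haystack
        .windows(needle.len())
        // Each window comparison costs up to m steps, and there are up to
        // n windows, which gives the O(nm) worst case.
        .position(|window| window == needle)
        .map(|start| start..start + needle.len())
}
```

The returned range corresponds to the consecutive indices a .._indices function would report for a substring match.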

pub fn substring_indices( &mut self, haystack: Utf32Str<'_>, needle_: Utf32Str<'_>, indices: &mut Vec<u32> ) -> Option<u16>

Finds the substring match with the highest score in the haystack and computes its indices.

This function has O(nm) time complexity. However, many cases can be significantly accelerated using prefilters, so it’s usually fast in practice.

See the matcher documentation for more details.

pub fn exact_match( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_> ) -> Option<u16>

Checks whether needle and haystack match exactly.

This function has O(n) time complexity.

See the matcher documentation for more details.

pub fn exact_indices( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_>, indices: &mut Vec<u32> ) -> Option<u16>

Checks whether needle and haystack match exactly and computes the match indices.

This function has O(n) time complexity.

See the matcher documentation for more details.

pub fn prefix_match( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_> ) -> Option<u16>

Checks whether needle is a prefix of the haystack.

This function has O(n) time complexity.

See the matcher documentation for more details.

pub fn prefix_indices( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_>, indices: &mut Vec<u32> ) -> Option<u16>

Checks whether needle is a prefix of the haystack and computes the match indices.

This function has O(n) time complexity.

See the matcher documentation for more details.

pub fn postfix_match( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_> ) -> Option<u16>

Checks whether needle is a postfix of the haystack.

This function has O(n) time complexity.

See the matcher documentation for more details.

pub fn postfix_indices( &mut self, haystack: Utf32Str<'_>, needle: Utf32Str<'_>, indices: &mut Vec<u32> ) -> Option<u16>

Checks whether needle is a postfix of the haystack and computes the match indices.

This function has O(n) time complexity.

See the matcher documentation for more details.
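For the exact, prefix, and postfix modes, the matched indices always form one contiguous range. The sketch below shows which range that is for each mode. The three helpers are hypothetical and compare codepoints literally, so any normalization or scoring the real matcher applies is left out.

```rust
use std::ops::Range;

/// Exact mode: the match covers the whole haystack.
fn exact_range(haystack: &[char], needle: &[char]) -> Option<Range<usize>> {
    (haystack == needle).then(|| 0..haystack.len())
}

/// Prefix mode: the match covers the first `needle.len()` codepoints.
fn prefix_range(haystack: &[char], needle: &[char]) -> Option<Range<usize>> {
    haystack.starts_with(needle).then(|| 0..needle.len())
}

/// Postfix mode: the match covers the last `needle.len()` codepoints.
fn postfix_range(haystack: &[char], needle: &[char]) -> Option<Range<usize>> {
    // The subtraction only runs when `ends_with` succeeded, so the needle
    // cannot be longer than the haystack here.
    haystack
        .ends_with(needle)
        .then(|| haystack.len() - needle.len()..haystack.len())
}
```

For the haystack "main.rs", the needle ".rs" yields the postfix range 4..7 and no prefix match, while the needle "main" yields the prefix range 0..4.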

Trait Implementations

impl Clone for Matcher

fn clone(&self) -> Self

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Matcher

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for Matcher

fn default() -> Self

Returns the “default value” for a type.

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.