Skip to main content

Crate frizbee

Crate frizbee 

Source
Expand description

Frizbee is a SIMD typo-resistant fuzzy string matcher written in Rust. The core of the algorithm uses Smith-Waterman with affine gaps, similar to FZF, but with many of the scoring bonuses from FZY. In the included benchmark, with typo resistance disabled, it outperforms nucleo by ~1.7x and fzf by ~7x and supports multithreading, see benchmarks. It matches against bytes directly, ignoring unicode.

Used by blink.cmp, skim, and fff.nvim. Special thank you to stefanboca and ii14!

The core of the algorithm is Smith-Waterman with affine gaps and row-wise parallelism via SIMD. Besides the parallelism, this is the basis of other popular fuzzy matching algorithms like FZF and Nucleo. The main properties of Smith-Waterman are:

  • Always finds the best alignment
  • Supports insertion, deletion and substitution

§Example: using match_list

use frizbee::{match_list, match_list_parallel, Config};

let needle = "fBr";
let haystacks = ["fooBar", "foo_bar", "prelude", "println!"];

let matches = match_list(needle, &haystacks, &Config::default());
// or in parallel (8 threads)
let matches = match_list_parallel(needle, &haystacks, &Config::default(), 8);

§Example: using Matcher

Useful for when you want to match one needle against more than one haystack.

use frizbee::{Matcher, Config};

let needle = "fBr";
let haystacks = ["fooBar", "foo_bar", "prelude", "println!"];

let mut matcher = Matcher::new(needle, &Config::default());
let matches = matcher.match_list(&haystacks);

§Example: custom scoring offsets

If you want to apply a custom score before sorting the results, you may use the match_iter API on the Matcher.

use frizbee::{Match, Matcher, Config};

let needle = "fBr";
let haystacks = ["fooBar", "foo_bar", "prelude", "println!"];

let mut matcher = Matcher::new(needle, &Config::default());
let mut matches = matcher.match_iter(&haystacks)
    .map(|m| Match { score: m.score + 10, ..m })
    .collect::<Vec<_>>();
matches.sort_unstable();

§Example: custom scoring based on alignment path

If you want to apply a custom score based on which characters were matched, you may drop down to the lower level API

use frizbee::{Config, Match, Matcher, smith_waterman::Alignment};

let needle = "fBr";
let haystacks = ["fooBar", "foo_bar", "prelude", "println!"];

// Guarding against empty needle is required by `prefilter_iter()`
if needle.is_empty() {
      return
      // return (0..haystacks.len())
      //   .map(|index| Match {
      //       index: index as u32,
      //       score: 0,
      //       exact: false,
      //   })
      //   .collect();
}


let config = Config::default();
let mut matcher = Matcher::new(needle, &config);
let mut matches = matcher.prefilter_iter(&haystacks)
    .filter_map(|(index, haystack, skipped_chunks)| {
        let mut score = matcher
            .smith_waterman
            .score_haystack(haystack);

        for alignment in matcher.iter_alignment_path(skipped_chunks, score) {
             // Return None if Alignment is None (max typos exceeded)
             match alignment? {
                Alignment::Match((needle_idx, haystack_idx)) => {
                    // adjust score as desired
                }
                // optionally handle Left, Up and Mismatch
                _ => {}
             }
        }

        let exact = skipped_chunks == 0 && needle.as_bytes() == haystack;
        if exact {
            score += config.scoring.exact_match_bonus;
        }

        Some(Match {
            index: index as u32,
            score,
            exact,
        })
    })
    .collect::<Vec<_>>();

matches.sort_unstable();

Modules§

prefilter
Fast prefiltering algorithms, which run before Smith Waterman since in the typical case, a small percentage of the haystack will match the needle. Automatically used by the Matcher and match_list APIs.
smith_waterman
The Smith Waterman algorithm performs local sequence alignment (explanation), originally designed to find similar sequences between two DNA strings. Guaranteed to find the optimal alignment and supports typos.

Structs§

Config
Match
MatchIndices
Matcher
Scoring

Functions§

match_list
match_list_indices
match_list_parallel