This crate implements fuzzy searching with trigrams.
Fuzzy searching compares strings by similarity rather than by equality: similar strings get a high score (close to 1.0f32), while dissimilar strings get a lower score (closer to 0.0f32).
Fuzzy searching tolerates changes in word order: for example, "John Dep" and "Dep John" will get a high score.
The crate exposes 5 main functions:
- fuzzy_compare takes 2 strings and returns a score representing how similar those strings are.
- fuzzy_search applies fuzzy_compare to a list of strings and returns a list of tuples: (word, score).
- fuzzy_search_sorted is similar to fuzzy_search but sorts the output by score in descending order.
- fuzzy_search_threshold takes an additional f32 as input and returns only the tuples with a score greater than the threshold.
- fuzzy_search_best_n takes an additional usize argument and returns only the first n tuples.
The algorithm used is taken from: https://dev.to/kaleman15/fuzzy-searching-with-postgresql-97o
Basic idea:
- From both strings, extract all groups of 3 adjacent letters, called trigrams ("House" becomes ['  H', ' Ho', 'Hou', 'ous', 'use', 'se ']). Note the 2 spaces added to the head of the string and the one on the tail, used to make the algorithm work on zero-length words.
- Then count the number of trigrams of the first word that are also present in the second word, and divide by the number of trigrams of the first word.
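The two steps above can be sketched in a few lines of Rust. This is only an illustration of the idea as described here, not the crate's actual implementation: the names `trigrams` and `trigram_similarity` are hypothetical, and details such as how repeated trigrams are counted are assumptions.

```rust
/// Hypothetical helper (not part of the crate): pads the input with
/// two leading spaces and one trailing space, then collects every
/// group of 3 adjacent characters.
fn trigrams(s: &str) -> Vec<[char; 3]> {
    let padded: Vec<char> = "  "
        .chars()
        .chain(s.chars())
        .chain(" ".chars())
        .collect();
    // The padding guarantees at least one trigram, even for "".
    padded.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Hypothetical scoring function (assumption: each trigram of `a`
/// counts once if it appears anywhere in `b`).
fn trigram_similarity(a: &str, b: &str) -> f32 {
    let ta = trigrams(a);
    let tb = trigrams(b);
    // Count the trigrams of the first string also present in the second.
    let shared = ta.iter().filter(|t| tb.contains(*t)).count();
    // Divide by the number of trigrams of the first string.
    shared as f32 / ta.len() as f32
}
```

Under this sketch, `trigram_similarity("House", "Mouse")` evaluates to 0.5, because three of the six trigrams of "House" ('ous', 'use', 'se ') also occur in "Mouse". It also shows why word order matters little: "John Dep" and "Dep John" share 7 of the 9 trigrams of the first string, since only the trigrams spanning the word boundaries differ.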
Example: Comparing 2 strings
fn test() {
    use rust_fuzzy_search::fuzzy_compare;
    // Compare a string with itself and print the similarity score.
    let score: f32 = fuzzy_compare("kolbasobulko", "kolbasobulko");
    println!("score = {:?}", score);
}
Example: Comparing a string with a list of strings and retrieving only the best matches
fn test() {
    use rust_fuzzy_search::fuzzy_search_best_n;
    // The string to search for.
    let s = "bulko";
    // The candidates to compare it against.
    let list: Vec<&str> = vec![
        "kolbasobulko",
        "sandviĉo",
        "ŝatas",
        "domo",
        "emuo",
        "fabo",
        "fazano",
    ];
    // Keep only the 3 best matches.
    let n: usize = 3;
    let res: Vec<(&str, f32)> = fuzzy_search_best_n(s, &list, n);
    for (_word, score) in res {
        println!("{:?}", score);
    }
}
Example: if you have a Vec of Strings, you need to convert it to a list of &str
fn works_with_strings() {
    use rust_fuzzy_search::fuzzy_search;
    let s = String::from("varma");
    let list: Vec<String> = vec![
        String::from("varma vetero"),
        String::from("varma ĉokolado"),
    ];
    // Borrow each String as &str before passing the list to fuzzy_search.
    fuzzy_search(&s, &list.iter().map(String::as_ref).collect::<Vec<&str>>());
}
Functions
- fuzzy_compare: Use this function to compare 2 strings.
- fuzzy_search: Use this function to compare a string (&str) with all elements of a list.
- fuzzy_search_best_n: This function is similar to fuzzy_search_sorted but keeps only the n best items, those with a better match.
- fuzzy_search_sorted: This function is similar to fuzzy_search but sorts the result in descending order (the best matches are placed at the beginning).
- fuzzy_search_threshold: This function is similar to fuzzy_search but filters out elements with a score lower than the specified one.
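As a rough illustration of how the sorted, thresholded and best-n variants relate to a plain list of (word, score) tuples, the sketch below applies each post-processing step to scores that have already been computed. The helper names `sorted`, `above_threshold` and `best_n` are hypothetical and are not the crate's API; in particular, whether a score exactly equal to the threshold is kept is an assumption (here it is dropped).

```rust
/// Hypothetical helper: orders tuples by score, best match first.
fn sorted<'a>(mut scored: Vec<(&'a str, f32)>) -> Vec<(&'a str, f32)> {
    // Compare b with a (not a with b) to get descending order.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}

/// Hypothetical helper: keeps only tuples scoring above `threshold`
/// (assumption: the comparison is strict). Input order is preserved.
fn above_threshold<'a>(scored: Vec<(&'a str, f32)>, threshold: f32) -> Vec<(&'a str, f32)> {
    scored.into_iter().filter(|(_, s)| *s > threshold).collect()
}

/// Hypothetical helper: sorts, then keeps the first `n` tuples.
fn best_n<'a>(scored: Vec<(&'a str, f32)>, n: usize) -> Vec<(&'a str, f32)> {
    sorted(scored).into_iter().take(n).collect()
}
```

With the real crate, the scored list would come from fuzzy_search, and the dedicated functions listed above would normally be called directly rather than re-implemented.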