Function spelling::spellcheck [−][src]
pub fn spellcheck<'a>(
dictionary_string: &'a str,
word: &str,
distance: usize
) -> Vec<&'a str>
Takes a dictionary_string
(newline separated), a word and a distance and
returns a vector of possible matches, with a limit of distance set up
distance
. Sorts by distance. This doesn’t use rayon.
Notes:
- This only works for single length code bytes.
- This uses the Levenshtein distance.
use spelling::spellcheck; let dictionary_string = include_str!("words.txt"); // newline separated spellcheck(dictionary_string, "restaraunt", 3);
How it works:
It loops through each word in the dictionary and then you have a word from the dictionary and a word to match against.
Lets say those words were rumor
and rotten
. You then create a vector
that is the length of the shorter word + 1, counting up (but the first value
is one).
let mut list = vec![1, 1, 2, 3, 4, 5];
You also keep track a left variable, which starts at the index of the longest string you’re at and you create a temporary row.
for x in 2..longer.len() { let mut left = x; let mut temp = Vec::new(); }
For every item in the list, except the first, check if letter you’re on
is the same in both words. If it is left will become list at the current
index you’re on minus one. If it isn’t left will become the minimum of
the current index on list
, the current index minus on list
, and left.
Then add left
to temp
.
After this set list
to temp
and repeat.
The distance will be in the last index of list at the end, so return that to filter.
for x in 2..longer.len() { let mut left = x; println!("{}", x); let mut temp = Vec::new(); temp.push(left); let mut iter = list.iter().enumerate(); iter.next(); // skip first item in list for (index, y) in iter { println!("{}", index); left = match longer.as_bytes()[x - 1] == shorter.as_bytes()[index - 1] { true => list[index - 1], false => [list[index - 1], *y, left].iter().min().unwrap() + 1, }; temp.push(left) } list = temp; }