[−][src]Module bio::pattern_matching::ukkonen
Bounded version of Ukkonens DP algorithm for approximate pattern matching. Complexity: O(n * k) on random texts.
The algorithm finds all matches of a pattern in a text with up to k errors.
Idea is to use dynamic programming to column-wise explore the edit matrix, but to omit
parts of the matrix for which the error exceeds k. To achieve this, a value lastk
is
maintained that provides the lower feasible boundary of the matrix.
Initially, lastk = min(k, m). In each iteration (over a column), lastk can increase by at most 1.
Example
use bio::pattern_matching::ukkonen::{Ukkonen, unit_cost}; let mut ukkonen = Ukkonen::with_capacity(10, unit_cost); let text = b"ACCGTGGATGAGCGCCATAG"; let pattern = b"TGAGCGA"; let occ: Vec<(usize, usize)> = ukkonen.find_all_end(pattern, text, 1).collect(); assert_eq!(occ, [(13, 1), (14, 1)]);
Structs
Matches | Iterator over pairs of end positions and distance of matches. |
Ukkonen | Ukkonens algorithm. |
Functions
unit_cost | Default cost function (unit costs). |