Module bio::pattern_matching::ukkonen
source · Expand description
Bounded version of Ukkonens DP algorithm for approximate pattern matching. Complexity: O(n * k) on random texts.
The algorithm finds all matches of a pattern in a text with up to k errors.
Idea is to use dynamic programming to column-wise explore the edit matrix, but to omit
parts of the matrix for which the error exceeds k. To achieve this, a value lastk
is
maintained that provides the lower feasible boundary of the matrix.
Initially, lastk = min(k, m). In each iteration (over a column), lastk can increase by at most 1.
Example
use bio::pattern_matching::ukkonen::{Ukkonen, unit_cost};
let mut ukkonen = Ukkonen::with_capacity(10, unit_cost);
let text = b"ACCGTGGATGAGCGCCATAG";
let pattern = b"TGAGCGA";
let occ: Vec<(usize, usize)> = ukkonen.find_all_end(pattern, text, 1).collect();
assert_eq!(occ, [(13, 1), (14, 1)]);
Structs
Functions
Default cost function (unit costs).