Module bio::pattern_matching::ukkonen [] [src]

Bounded version of Ukkonens DP algorithm for approximate pattern matching. Complexity: O(n * k) on random texts.

The algorithm finds all matches of a pattern in a text with up to k errors. Idea is to use dynamic programming to column-wise explore the edit matrix, but to omit parts of the matrix for which the error exceeds k. To achieve this, a value lastk is maintained that provides the lower feasible boundary of the matrix. Initially, lastk = min(k, m). In each iteration (over a column), lastk can increase by at most 1.

Example

use bio::pattern_matching::ukkonen::{Ukkonen, unit_cost};

let mut ukkonen = Ukkonen::with_capacity(10, unit_cost);
let text = b"ACCGTGGATGAGCGCCATAG";
let pattern = b"TGAGCGA";
let occ: Vec<(usize, usize)> = ukkonen.find_all_end(pattern, text, 1).collect();
assert_eq!(occ, [(13, 1), (14, 1)]);

Structs

Matches

Iterator over pairs of end positions and distance of matches.

Ukkonen

Ukkonens algorithm.

Functions

unit_cost

Default cost function (unit costs).