pub fn correct_read_errors(
    reads: &Tile,
    split_margin: usize,
    hd_margin: usize
) -> Result<Vec<(Sequence, Sequence)>>

Returns pairs of sequences describing the error corrections that can be applied to a set of reads.

Genome sequencers use a chemical procedure to obtain reads from the provided biomaterial. Because this procedure is error-prone, the same region is usually read multiple times. Errors are corrected by first splitting the obtained set into correct and faulty reads: a read is regarded as correct if it, or its complement, appears in the set at least split_margin times. Each faulty read is then corrected based on its Hamming distance to one of the correct reads.
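
The procedure described above can be sketched on plain strings as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `sketch_corrections`, `reverse_complement` and `hamming` are hypothetical helper names, "complement" is interpreted as the reverse complement, a faulty read is matched to the first correct read (or its reverse complement) within `hd_margin` mismatches, and each pair is ordered as (faulty, corrected).

```rust
use std::collections::HashMap;

/// Reverse complement of a DNA string (assumes the alphabet A, C, G, T).
fn reverse_complement(s: &str) -> String {
    s.chars()
        .rev()
        .map(|c| match c {
            'A' => 'T',
            'T' => 'A',
            'C' => 'G',
            'G' => 'C',
            other => other, // leave unknown symbols untouched
        })
        .collect()
}

/// Number of positions at which two equal-length strings differ.
fn hamming(a: &str, b: &str) -> usize {
    a.chars().zip(b.chars()).filter(|(x, y)| x != y).count()
}

/// Hypothetical stand-in for the documented function, operating on &str.
/// Returns (faulty, corrected) pairs; the pair order is an assumption.
fn sketch_corrections(
    reads: &[&str],
    split_margin: usize,
    hd_margin: usize,
) -> Vec<(String, String)> {
    // Count how often each read occurs.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for &r in reads {
        *counts.entry(r.to_string()).or_insert(0) += 1;
    }

    // Support = occurrences of the read plus occurrences of its reverse complement.
    let support = |r: &str| -> usize {
        let rc = reverse_complement(r);
        let own = counts.get(r).copied().unwrap_or(0);
        if rc == r {
            own
        } else {
            own + counts.get(rc.as_str()).copied().unwrap_or(0)
        }
    };

    // Split the distinct reads into correct and faulty ones, keeping input order.
    let mut correct: Vec<&str> = Vec::new();
    let mut faulty: Vec<&str> = Vec::new();
    for &r in reads {
        let bucket = if support(r) >= split_margin {
            &mut correct
        } else {
            &mut faulty
        };
        if !bucket.contains(&r) {
            bucket.push(r);
        }
    }

    // Match each faulty read to the first correct read (or its reverse
    // complement) that lies within hd_margin mismatches.
    let mut corrections = Vec::new();
    for f in faulty {
        for &c in &correct {
            let rc = reverse_complement(c);
            if hamming(f, c) <= hd_margin {
                corrections.push((f.to_string(), c.to_string()));
                break;
            } else if hamming(f, &rc) <= hd_margin {
                corrections.push((f.to_string(), rc));
                break;
            }
        }
    }
    corrections
}
```

Under these assumptions, calling `sketch_corrections` on the five reads from the Example section with split_margin = 2 and hd_margin = 1 treats ATCAA and TTGAT as correct (each is the reverse complement of the other) and yields the single pair (TTCAT, TTGAT). The other two reads have no correct read within distance 1 and are left uncorrected.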

Arguments

  • reads - tile containing reads
  • split_margin - number of reads/complements required to treat a read as correct
  • hd_margin - Hamming distance used for matching faulty reads to correct ones

Example

use biogarden::processing::transformers::correct_read_errors;
use biogarden::ds::sequence::Sequence;
use biogarden::ds::tile::Tile;

let mut reads = Tile::new();
reads.push(Sequence::from("TTCAT"));
reads.push(Sequence::from("TGAAA"));
reads.push(Sequence::from("GAGGA"));
reads.push(Sequence::from("ATCAA"));
reads.push(Sequence::from("TTGAT"));

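// The call returns a Result wrapping the correction pairs; bind it rather than discarding it.
let corrections = correct_read_errors(&reads, 2, 1);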