
The Full-text index in Minute space (FM-index) and the FMD-index, for finding the suffix array intervals that match a given pattern in time linear in the pattern length.

§Examples

§Generate

use bio::alphabets::dna;
use bio::data_structures::bwt::{bwt, less, Occ};
use bio::data_structures::fmindex::{FMIndex, FMIndexable};
use bio::data_structures::suffix_array::suffix_array;

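// The text must end with a sentinel ($) that sorts before every other symbol.
let text = b"GCCTTAACATTATTACGCCTA$";
let alphabet = dna::n_alphabet();
let sa = suffix_array(text);
let bwt = bwt(text, &sa);
// less[c]: number of symbols in the BWT that are smaller than c.
let less = less(&bwt, &alphabet);
// Occurrence counts, sampled at every 3rd position of the BWT.
let occ = Occ::new(&bwt, 3, &alphabet);
// Borrow all three structures; the index does not take ownership here.
let fm = FMIndex::new(&bwt, &less, &occ);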
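To make the roles of the suffix array, the BWT, less and occ concrete, here is a minimal, dependency-free sketch of backward search. It is not this crate's implementation: every function name below (naive_suffix_array, naive_bwt, naive_less, naive_occ, backward_search, find_positions) is hypothetical, the suffix array is built by plain sorting, and occurrence counts are computed by a linear scan instead of a sampled table.

```rust
/// Suffix array by naive sorting (fine for short texts only).
fn naive_suffix_array(text: &[u8]) -> Vec<usize> {
    let mut sa: Vec<usize> = (0..text.len()).collect();
    sa.sort_by(|&a, &b| text[a..].cmp(&text[b..]));
    sa
}

/// BWT: the symbol preceding each suffix in sorted order,
/// wrapping around to the last symbol for the suffix at position 0.
fn naive_bwt(text: &[u8], sa: &[usize]) -> Vec<u8> {
    sa.iter()
        .map(|&i| if i == 0 { text[text.len() - 1] } else { text[i - 1] })
        .collect()
}

/// less[c] = number of symbols in the BWT that are strictly smaller than c.
fn naive_less(bwt: &[u8]) -> Vec<usize> {
    let mut counts = vec![0usize; 257];
    for &c in bwt {
        counts[c as usize + 1] += 1;
    }
    for i in 1..257 {
        counts[i] += counts[i - 1];
    }
    counts
}

/// Number of occurrences of c in bwt[..end], by linear scan.
fn naive_occ(bwt: &[u8], end: usize, c: u8) -> usize {
    bwt[..end].iter().filter(|&&x| x == c).count()
}

/// Backward search: consume the pattern right to left, narrowing the
/// half-open suffix array interval [lower, upper) at each step.
fn backward_search(bwt: &[u8], less: &[usize], pattern: &[u8]) -> (usize, usize) {
    let (mut lower, mut upper) = (0, bwt.len());
    for &c in pattern.iter().rev() {
        lower = less[c as usize] + naive_occ(bwt, lower, c);
        upper = less[c as usize] + naive_occ(bwt, upper, c);
        if lower >= upper {
            return (lower, lower); // empty interval: no match
        }
    }
    (lower, upper)
}

/// Text positions of all occurrences of pattern, in ascending order.
/// The text is assumed to end with a sentinel such as b'$'.
fn find_positions(text: &[u8], pattern: &[u8]) -> Vec<usize> {
    let sa = naive_suffix_array(text);
    let bwt = naive_bwt(text, &sa);
    let less = naive_less(&bwt);
    let (lower, upper) = backward_search(&bwt, &less, pattern);
    let mut positions = sa[lower..upper].to_vec();
    positions.sort();
    positions
}
```

Each pattern symbol costs two occurrence lookups, so the number of steps is linear in the pattern length. A sampled occurrence table, such as the one built with a sampling rate of 3 above, trades some lookup time for less memory than a full table would need.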

§Enclose in struct

FMIndex is designed so that it does not have to own the BWT and the auxiliary data structures. It accepts references (&), owned structs, or any of the more complex pointer types.

use bio::alphabets::dna;
use bio::data_structures::bwt::{bwt, less, Less, Occ, BWT};
use bio::data_structures::fmindex::{FMIndex, FMIndexable};
use bio::data_structures::suffix_array::suffix_array;
use bio::utils::TextSlice;

pub struct Example {
    fmindex: FMIndex<BWT, Less, Occ>,
}

impl Example {
    pub fn new(text: TextSlice) -> Self {
        let alphabet = dna::n_alphabet();
        let sa = suffix_array(text);
        let bwt = bwt(text, &sa);
        let less = less(&bwt, &alphabet);
        let occ = Occ::new(&bwt, 3, &alphabet);
        // Pass by value so that the struct owns its data.
        let fm = FMIndex::new(bwt, less, occ);
        Example { fmindex: fm }
    }
}
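One common way to get this kind of flexibility in Rust is to make the struct generic over std::borrow::Borrow. The sketch below shows the pattern on a hypothetical Counter type; it is an illustration under that assumption, not a statement about how FMIndex itself is declared.

```rust
use std::borrow::Borrow;
use std::rc::Rc;

/// Hypothetical stand-in for an index that only needs read access to a table.
/// D may be Vec<u8>, &Vec<u8>, Rc<Vec<u8>>, Box<Vec<u8>>, and so on.
struct Counter<D: Borrow<Vec<u8>>> {
    data: D,
}

impl<D: Borrow<Vec<u8>>> Counter<D> {
    fn new(data: D) -> Self {
        Counter { data }
    }

    /// Count the occurrences of c, whatever the ownership of the table is.
    fn count(&self, c: u8) -> usize {
        let table: &Vec<u8> = self.data.borrow();
        table.iter().filter(|&&x| x == c).count()
    }
}

/// Build the same counter from a reference, a shared pointer and an owned value.
fn demo() -> (usize, usize, usize) {
    let table = b"ACCA".to_vec();
    let borrowed = Counter::new(&table).count(b'C');
    let shared = Counter::new(Rc::new(table.clone())).count(b'C');
    let owned = Counter::new(table).count(b'C');
    (borrowed, shared, owned)
}
```

With the owned form the struct can be returned from a constructor without lifetime parameters, as Example::new does above. The borrowed form avoids moving or copying large tables.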

Structs§

  • A bi-interval on suffix array of the forward and reverse strand of a DNA text.
  • The FMD-Index for linear time search of supermaximal exact matches on forward and reverse strand of DNA texts (Li, 2012).
  • The Full-text index in Minute space (FM-Index, Ferragina and Manzini, 2000) for finding suffix array intervals matching a given pattern.
  • A suffix array interval.

Enums§

  • The possible result states of a backward_search in the FM-index. The variants are: Complete(Interval): the query matched completely, and the interval is the range of suffix array indices matching the query string. Partial(Interval, usize): some suffix of the query matched, but not the whole query; the interval is the range of suffix array indices for the maximal matching suffix, and the usize is the length of that suffix. Absent: no suffix of the pattern matched in the text.
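The three outcomes can be illustrated with a small, dependency-free sketch. The SearchOutcome enum and the search and interval_of functions below are hypothetical stand-ins that mirror the variants described above. They use a sorted list of suffixes and std::ops::Range instead of this crate's index and Interval types. The behaviour for an empty pattern is a choice made by the sketch, not something taken from the crate.

```rust
use std::ops::Range;

/// Hypothetical mirror of the three result states described above.
#[derive(Debug, PartialEq)]
enum SearchOutcome {
    Complete(Range<usize>),
    Partial(Range<usize>, usize),
    Absent,
}

/// Range of sorted suffixes that start with query.
fn interval_of(sorted_suffixes: &[&[u8]], query: &[u8]) -> Range<usize> {
    // All suffixes smaller than the query come first in sorted order.
    let lower = sorted_suffixes.partition_point(|s| *s < query);
    // Suffixes that have the query as a prefix follow contiguously.
    let upper = lower
        + sorted_suffixes[lower..]
            .iter()
            .take_while(|s| s.starts_with(query))
            .count();
    lower..upper
}

/// Extend the matched suffix of the pattern one symbol to the left at a
/// time, as backward search does, and report how far the match got.
fn search(text: &[u8], pattern: &[u8]) -> SearchOutcome {
    let mut suffixes: Vec<&[u8]> = (0..text.len()).map(|i| &text[i..]).collect();
    suffixes.sort();

    let mut best = 0..suffixes.len();
    let mut matched = 0;
    for start in (0..pattern.len()).rev() {
        let interval = interval_of(&suffixes, &pattern[start..]);
        if interval.is_empty() {
            break; // the suffix pattern[start..] does not occur in the text
        }
        best = interval;
        matched = pattern.len() - start;
    }

    if matched == pattern.len() {
        SearchOutcome::Complete(best)
    } else if matched > 0 {
        SearchOutcome::Partial(best, matched)
    } else {
        SearchOutcome::Absent
    }
}
```

For the text GCCTTAACATTATTACGCCTA$, the pattern TTA matches completely; GGGTTA matches only its suffix TTA, giving a partial result of length 3; and TTN is absent because its last symbol never occurs in the text.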

Traits§