Crate fmlrc

FM-Index Long Read Corrector v2

This library provides access to the functionality used by FMLRC2 to perform read correction using a Burrows-Wheeler Transform (BWT). Currently, the BWT is assumed to have been generated externally (typically with a tool like ropebwt2) and stored in the same numpy format as FMLRC v1. FMLRC loads a binary representation of the BWT into memory, trading higher memory usage for very fast queries. This implementation is faster than FMLRC v1 because it uses a cache to pre-compute common queries to the BWT.

Example

use fmlrc::bv_bwt::BitVectorBWT;
use fmlrc::bwt_converter::convert_to_vec;
use fmlrc::ropebwt2_util::create_bwt_from_strings;
use fmlrc::string_util::convert_stoi;
use std::io::Cursor;

//example with in-memory BWT
let data: Vec<&str> = vec!["ACGT", "CCGG"];
let seq = create_bwt_from_strings(&data).unwrap();
let cursor_seq = Cursor::new(seq);
let vec_form = convert_to_vec(cursor_seq);
let mut bwt = BitVectorBWT::new();
bwt.load_vector(vec_form);
//bwt.load_numpy_file(filename); <- if in a numpy file

//do a count
let kmer: Vec<u8> = convert_stoi(&"ACGT");
let kmer_count = bwt.count_kmer(&kmer); //ACGT
assert_eq!(kmer_count, 1);
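
For readers unfamiliar with how a k-mer count can be answered from a BWT alone, the following is a minimal, self-contained sketch of backward search. It is not the fmlrc implementation: the function names `naive_bwt`, `rank`, and `count_kmer` are hypothetical, the BWT is built naively from a single `$`-terminated text, and rank queries are done by linear scan instead of an indexed bit vector or cache.

```rust
/// Build the BWT of `text` by sorting all rotations (naive, for illustration only).
/// Assumption: `text` ends with the sentinel b'$', which sorts before A, C, G, T in ASCII.
fn naive_bwt(text: &[u8]) -> Vec<u8> {
    let n = text.len();
    let mut starts: Vec<usize> = (0..n).collect();
    starts.sort_by(|&a, &b| {
        // Compare the rotation starting at `a` with the rotation starting at `b`.
        let ra = text[a..].iter().chain(text[..a].iter());
        let rb = text[b..].iter().chain(text[..b].iter());
        ra.cmp(rb)
    });
    // The BWT is the last column: the symbol preceding each sorted rotation.
    starts.iter().map(|&i| text[(i + n - 1) % n]).collect()
}

/// Number of occurrences of `sym` in `bwt[..pos]` (linear scan; real indexes precompute this).
fn rank(bwt: &[u8], sym: u8, pos: usize) -> usize {
    bwt[..pos].iter().filter(|&&c| c == sym).count()
}

/// Count occurrences of `kmer` in the original text using backward search.
fn count_kmer(bwt: &[u8], kmer: &[u8]) -> usize {
    // counts[s] = occurrences of symbol s in the BWT.
    let mut counts = [0usize; 256];
    for &c in bwt {
        counts[c as usize] += 1;
    }
    // smaller[s] = number of symbols in the BWT strictly smaller than s.
    let mut smaller = [0usize; 256];
    let mut total = 0;
    for (slot, &n) in smaller.iter_mut().zip(counts.iter()) {
        *slot = total;
        total += n;
    }

    // Start with the full range, then narrow it one symbol at a time from the k-mer's end.
    let (mut lo, mut hi) = (0usize, bwt.len());
    for &sym in kmer.iter().rev() {
        lo = smaller[sym as usize] + rank(bwt, sym, lo);
        hi = smaller[sym as usize] + rank(bwt, sym, hi);
        if lo >= hi {
            return 0; // the k-mer does not occur
        }
    }
    hi - lo
}
```

Under these assumptions, `count_kmer(&naive_bwt(b"ACGTACGT$"), b"ACG")` evaluates to 2. Each step of the loop needs only two rank queries, which is why fast rank support and caching of common queries matter for performance.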

Modules

bv_bwt

Contains the bit vector implementation of the BWT

bwt_converter

Contains the function for reformatting a BWT string into the expected run-length format or numpy file

indexed_bit_vec

Contains a bit vector with basic rank support; other crates provide this, but they tended to be slow

ordered_fasta_writer

Contains a wrapper around the rust-bio FASTA writer, but forces an ordering on the reads

read_correction

Contains the logic for performing the read correction

ropebwt2_util

Contains wrapper functions for ropebwt2; most will fail if ropebwt2 is not on the PATH

stats_util

Contains special statistics functions, mainly an ignored median score

string_util

Contains inline functions for converting between strings and integer formats
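
The indexed_bit_vec module above is described as a bit vector with basic rank support. As a rough illustration of what that means, here is a minimal sketch that stores bits in 64-bit words and keeps a running count of set bits per word, so a rank query costs one table lookup plus one popcount. The type `RankBitVec` and its methods are hypothetical and are not the fmlrc API; the actual layout in the crate may differ.

```rust
/// Hypothetical bit vector with rank support (illustration only).
struct RankBitVec {
    /// Packed bits, 64 per word, least significant bit first.
    words: Vec<u64>,
    /// cumulative[i] = number of set bits in words[..i]; has words.len() + 1 entries.
    cumulative: Vec<u64>,
}

impl RankBitVec {
    /// Pack `bits` into words and precompute the per-word cumulative counts.
    fn from_bits(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        for (i, &b) in bits.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        let mut cumulative = Vec::with_capacity(words.len() + 1);
        let mut total = 0u64;
        cumulative.push(0);
        for w in &words {
            total += u64::from(w.count_ones());
            cumulative.push(total);
        }
        RankBitVec { words, cumulative }
    }

    /// Number of set bits in positions [0, pos). `pos` may equal the bit length.
    fn rank(&self, pos: usize) -> u64 {
        let (word, offset) = (pos / 64, pos % 64);
        let mut r = self.cumulative[word];
        if offset > 0 {
            // Count only the bits below `offset` in the partially covered word.
            let mask = (1u64 << offset) - 1;
            r += u64::from((self.words[word] & mask).count_ones());
        }
        r
    }
}
```

In a BWT index, one such bit vector per symbol lets a query like "how many times does C occur before position i" be answered in constant time, which is the operation backward search repeats for every symbol of a k-mer.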