§FM-Index Long Read Corrector v2
This library provides access to the functionality used by FMLRC2 to perform read correction using a Burrows-Wheeler Transform (BWT). Currently, the BWT is assumed to have been generated externally (typically with a tool like ropebwt2) and stored in the same numpy format as FMLRC v1. FMLRC loads a binary representation of the BWT into memory, trading higher memory usage for very fast queries. This implementation is faster than FMLRC v1 because it uses a cache of pre-computed results for common BWT queries.
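To make "queries to the BWT" concrete, the sketch below counts k-mer occurrences with backward search over a naively built BWT. It uses only the standard library and is not FMLRC2's implementation: the helper names `naive_bwt` and `count_kmer` are hypothetical, the construction sorts every rotation (quadratic, fine only for toy input), and `rank` is a linear scan where a real index would use precomputed rank structures.

```rust
use std::collections::BTreeMap;

/// Naively build a BWT for a set of reads: append `$` to each read,
/// sort every rotation, and keep the last byte of each sorted rotation.
/// (Illustrative only; real tools such as ropebwt2 do not work this way.)
fn naive_bwt(reads: &[&str]) -> Vec<u8> {
    let mut rotations: Vec<Vec<u8>> = Vec::new();
    for read in reads {
        let mut text = read.as_bytes().to_vec();
        text.push(b'$');
        for i in 0..text.len() {
            let mut rot = text[i..].to_vec();
            rot.extend_from_slice(&text[..i]);
            rotations.push(rot);
        }
    }
    rotations.sort();
    rotations.iter().map(|r| *r.last().unwrap()).collect()
}

/// Count occurrences of `kmer` (which must not contain `$`) by backward search.
fn count_kmer(bwt: &[u8], kmer: &[u8]) -> usize {
    // first_row[c] = number of symbols in the BWT that sort before c.
    let mut totals: BTreeMap<u8, usize> = BTreeMap::new();
    for &c in bwt {
        *totals.entry(c).or_insert(0) += 1;
    }
    let mut first_row: BTreeMap<u8, usize> = BTreeMap::new();
    let mut running = 0;
    for (&c, &n) in &totals {
        first_row.insert(c, running);
        running += n;
    }

    // rank(c, i) = occurrences of c in bwt[..i]; a linear scan in this sketch.
    let rank = |c: u8, i: usize| bwt[..i].iter().filter(|&&x| x == c).count();

    // Narrow the row range [lo, hi) one symbol at a time, right to left.
    let (mut lo, mut hi) = (0, bwt.len());
    for &c in kmer.iter().rev() {
        let start = match first_row.get(&c) {
            Some(&s) => s,
            None => return 0, // symbol never appears in the reads
        };
        lo = start + rank(c, lo);
        hi = start + rank(c, hi);
        if lo >= hi {
            return 0;
        }
    }
    hi - lo
}

fn main() {
    let bwt = naive_bwt(&["ACGT", "CCGG"]);
    println!("ACGT: {}", count_kmer(&bwt, b"ACGT"));
    println!("CG:   {}", count_kmer(&bwt, b"CG"));
}
```

Each step of the loop costs two rank lookups, so a query takes time proportional to the k-mer length once rank is fast. That is why keeping the BWT and its rank structures in memory, and caching frequent queries, pays off.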
§Example
use fmlrc::bv_bwt::BitVectorBWT;
use fmlrc::bwt_converter::convert_to_vec;
use fmlrc::ropebwt2_util::create_bwt_from_strings;
use fmlrc::string_util::convert_stoi;
use std::io::Cursor;
// Example with an in-memory BWT
let data: Vec<&str> = vec!["ACGT", "CCGG"];
let seq = create_bwt_from_strings(&data).unwrap();
let cursor_seq = Cursor::new(seq);
let vec_form = convert_to_vec(cursor_seq);
let mut bwt = BitVectorBWT::new();
bwt.load_vector(vec_form);
// bwt.load_numpy_file(filename); // use this instead if the BWT is stored in a numpy file
// Count the occurrences of a k-mer
let kmer: Vec<u8> = convert_stoi(&"ACGT");
let kmer_count = bwt.count_kmer(&kmer); //ACGT
assert_eq!(kmer_count, 1);
Modules§
- align
- Contains the alignment methods for comparing corrections
- bv_bwt
- Contains the bit vector implementation of the BWT
- bwt_converter
- Contains the function for reformatting a BWT string into the expected run-length format or numpy file
- indexed_bit_vec
- Contains a bit vector with basic rank support; other crates provide this, but they tended to be slow for some reason
- ordered_fasta_writer
- Contains a wrapper around the rust-bio FASTA writer that forces an ordering on the reads
- read_correction
- Contains the logic for performing the read correction
- ropebwt2_util
- Contains wrapper functions for ropebwt2; most will fail if ropebwt2 is not on the PATH
- stats_util
- Contains special statistics functions, mainly an ignored median score
- string_util
- Contains inline functions for converting between strings and integer formats
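As background for the "bit vector with basic rank support" mentioned above, here is a minimal standard-library sketch of the idea. The type `RankBitVec` and its methods are hypothetical names for illustration and are not the crate's API; the real module may use a different layout or sampling scheme.

```rust
/// Bit vector that answers rank queries (set bits before a position)
/// using one precomputed running total per 64-bit word.
struct RankBitVec {
    words: Vec<u64>,
    /// prefix[i] = number of set bits in words[..i]
    prefix: Vec<u64>,
}

impl RankBitVec {
    fn from_bits(bits: &[bool]) -> Self {
        // One extra word so that rank(bits.len()) always indexes a valid word.
        let mut words = vec![0u64; (bits.len() + 63) / 64 + 1];
        for (i, &b) in bits.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        let mut prefix = Vec::with_capacity(words.len());
        let mut total = 0u64;
        for w in &words {
            prefix.push(total);
            total += w.count_ones() as u64;
        }
        RankBitVec { words, prefix }
    }

    /// Number of set bits in positions [0, pos); `pos` may equal the length.
    fn rank(&self, pos: usize) -> u64 {
        let word = pos / 64;
        let offset = pos % 64;
        // Mask keeps only the bits below `offset` within the current word.
        let mask = (1u64 << offset) - 1;
        self.prefix[word] + (self.words[word] & mask).count_ones() as u64
    }
}

fn main() {
    let bv = RankBitVec::from_bits(&[true, false, true, true, false]);
    println!("set bits before position 3: {}", bv.rank(3));
}
```

A rank query here costs one table lookup plus one popcount, independent of the vector's length. One plausible use, stated as an assumption rather than a description of this crate, is to keep one such vector per symbol so that the rank lookups needed when querying a BWT run in constant time.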