Crate fmlrc
FM-Index Long Read Corrector v2
This library provides access to the functionality used by FMLRC2 to perform read correction using a Burrows-Wheeler Transform (BWT). Currently, the BWT is assumed to have been generated externally (typically with a tool like ropebwt2) and stored in the same numpy format as FMLRC v1. FMLRC loads a binary representation of the BWT into memory to support very fast queries at the cost of higher memory usage. This implementation is faster than FMLRC v1 because it uses a cache to pre-compute common queries to the BWT.
Example
    use fmlrc::bv_bwt::BitVectorBWT;
    use fmlrc::bwt_converter::convert_to_vec;
    use fmlrc::ropebwt2_util::create_bwt_from_strings;
    use fmlrc::string_util::convert_stoi;
    use std::io::Cursor;

    // example with in-memory BWT
    let data: Vec<&str> = vec!["ACGT", "CCGG"];
    let seq = create_bwt_from_strings(&data).unwrap();
    let cursor_seq = Cursor::new(seq);
    let vec_form = convert_to_vec(cursor_seq);
    let mut bwt = BitVectorBWT::new();
    bwt.load_vector(vec_form);
    // bwt.load_numpy_file(filename); <- if in a numpy file

    // do a count
    let kmer: Vec<u8> = convert_stoi(&"ACGT");
    let kmer_count = bwt.count_kmer(&kmer); // ACGT
    assert_eq!(kmer_count, 1);
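To give a rough idea of what a k-mer count against a BWT involves, here is a minimal standard-library sketch of FM-index backward search. It is not the fmlrc implementation: `naive_bwt` and this `count_kmer` are hypothetical helpers written for illustration, the BWT is built by naively sorting rotations rather than with ropebwt2, symbols are plain ASCII bytes rather than the crate's integer encoding, occurrences are counted with a linear scan, and only the forward strand is searched.

```rust
/// Hypothetical helper: build a multi-string BWT by appending '$' to each
/// read, sorting every rotation, and taking the last column.
/// Illustrative only; far too slow and memory hungry for real data.
fn naive_bwt(reads: &[&str]) -> Vec<u8> {
    let mut rotations: Vec<Vec<u8>> = Vec::new();
    for read in reads {
        let mut s = read.as_bytes().to_vec();
        s.push(b'$');
        for i in 0..s.len() {
            let mut rot = s[i..].to_vec();
            rot.extend_from_slice(&s[..i]);
            rotations.push(rot);
        }
    }
    rotations.sort();
    rotations.iter().map(|r| *r.last().unwrap()).collect()
}

/// Hypothetical helper: count occurrences of `kmer` (which must not
/// contain '$') using backward search over the BWT.
fn count_kmer(bwt: &[u8], kmer: &[u8]) -> usize {
    // smaller[c] = number of BWT symbols strictly smaller than c
    let mut counts = [0usize; 256];
    for &b in bwt {
        counts[b as usize] += 1;
    }
    let mut smaller = [0usize; 256];
    let mut total = 0usize;
    for c in 0..256 {
        smaller[c] = total;
        total += counts[c];
    }

    // occ(c, i) = occurrences of c in bwt[..i]; a real index answers this
    // in constant time with rank structures instead of scanning.
    let occ = |c: u8, i: usize| bwt[..i].iter().filter(|&&b| b == c).count();

    // Narrow the half-open row range [lo, hi) one symbol at a time,
    // reading the k-mer from its last symbol to its first.
    let (mut lo, mut hi) = (0usize, bwt.len());
    for &c in kmer.iter().rev() {
        lo = smaller[c as usize] + occ(c, lo);
        hi = smaller[c as usize] + occ(c, hi);
        if lo >= hi {
            return 0;
        }
    }
    hi - lo
}
```

With the reads `"ACGT"` and `"CCGG"`, calling `count_kmer(&naive_bwt(&["ACGT", "CCGG"]), b"ACGT")` in this sketch returns 1. Each backward-search step costs two `occ` lookups per symbol, which is why a fast rank structure and a cache of common query prefixes matter for performance.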
Modules
bv_bwt | Contains the bit vector implementation of the BWT |
bwt_converter | Contains the function for reformatting a BWT string into the expected run-length format or numpy file |
indexed_bit_vec | Contains bit vector with basic rank support; other crates exist with this, but they tended to be slow for some reason |
ordered_fasta_writer | Contains a wrapper around the rust-bio FASTA writer, but forces an ordering on the reads |
read_correction | Contains the logic for performing the read correction |
ropebwt2_util | Contains wrapper functions for |
stats_util | Contains special statistics functions, mainly an ignored median score |
string_util | Contains inline functions for converting between strings and integer formats |
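As background for the bit vector modules listed above, the following is a minimal sketch of a bit vector with basic rank support. `RankBitVec` and its methods are hypothetical names chosen for this illustration and do not reflect the actual `indexed_bit_vec` API or its memory layout. One plausible use, stated here as an assumption, is to keep one such vector per symbol so that "how many times does symbol c occur before position i" becomes a single rank query.

```rust
/// Hypothetical bit vector with rank support (not the fmlrc type).
struct RankBitVec {
    /// Packed bits, 64 per word, least significant bit first.
    words: Vec<u64>,
    /// block_ranks[i] = number of set bits in words[..i];
    /// one extra trailing entry holds the total.
    block_ranks: Vec<usize>,
}

impl RankBitVec {
    /// Pack the bits and pre-compute a running popcount per word.
    fn from_bits(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        for (i, &bit) in bits.iter().enumerate() {
            if bit {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        let mut block_ranks = Vec::with_capacity(words.len() + 1);
        let mut running = 0usize;
        for w in &words {
            block_ranks.push(running);
            running += w.count_ones() as usize;
        }
        block_ranks.push(running);
        RankBitVec { words, block_ranks }
    }

    /// Number of set bits in positions [0, pos).
    /// Panics if `pos` is beyond the packed length.
    fn rank(&self, pos: usize) -> usize {
        let word = pos / 64;
        let offset = pos % 64;
        let mut r = self.block_ranks[word];
        if offset > 0 {
            // Count only the bits below `offset` in the partial word.
            let mask = (1u64 << offset) - 1;
            r += (self.words[word] & mask).count_ones() as usize;
        }
        r
    }
}
```

A rank query here is one table lookup plus one masked popcount, so its cost does not depend on the length of the vector. The trade-off is one extra `usize` per 64 bits of data.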