LtFmIndex
LtFmIndex is a Rust library for building and using an FM-index that contains a lookup table of the first k-mer of a pattern. This index can be used to (1) count the number of occurrences and (2) locate the positions of a pattern in an indexed text.
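To make the two operations concrete, here is a minimal sketch of a textbook FM-index: a suffix array, the Burrows-Wheeler transform, and backward search. It is not this library's implementation. `NaiveFmIndex` and its methods are hypothetical names, rank is computed by a linear scan, and there is no k-mer lookup table or suffix array sampling. It also assumes the text contains no zero byte, because 0 is used as the sentinel.

```rust
/// Naive FM-index sketch (hypothetical type, for illustration only).
struct NaiveFmIndex {
    sa: Vec<usize>,  // suffix array of the text plus a sentinel
    bwt: Vec<u8>,    // Burrows-Wheeler transform of the text plus a sentinel
    c: [usize; 256], // c[b] = number of symbols strictly smaller than b
}

impl NaiveFmIndex {
    fn new(text: &[u8]) -> Self {
        // Append a sentinel (0), assumed smaller than every byte of the text.
        let mut t = text.to_vec();
        t.push(0);
        // Build the suffix array by sorting suffixes directly (slow but simple).
        let mut sa: Vec<usize> = (0..t.len()).collect();
        sa.sort_by(|&a, &b| t[a..].cmp(&t[b..]));
        // bwt[i] is the symbol that precedes suffix sa[i], wrapping around.
        let bwt: Vec<u8> = sa
            .iter()
            .map(|&i| if i == 0 { t[t.len() - 1] } else { t[i - 1] })
            .collect();
        // Count each symbol, then take prefix sums to get the C array.
        let mut counts = [0usize; 256];
        for &b in &t {
            counts[b as usize] += 1;
        }
        let mut c = [0usize; 256];
        let mut sum = 0;
        for (slot, &n) in c.iter_mut().zip(counts.iter()) {
            *slot = sum;
            sum += n;
        }
        Self { sa, bwt, c }
    }

    /// Number of occurrences of `b` in `bwt[..i]` (linear scan).
    fn rank(&self, b: u8, i: usize) -> usize {
        self.bwt[..i].iter().filter(|&&x| x == b).count()
    }

    /// Backward search: half-open suffix array range of suffixes starting with `pattern`.
    fn range(&self, pattern: &[u8]) -> (usize, usize) {
        let (mut lo, mut hi) = (0, self.bwt.len());
        for &b in pattern.iter().rev() {
            lo = self.c[b as usize] + self.rank(b, lo);
            hi = self.c[b as usize] + self.rank(b, hi);
            if lo >= hi {
                return (0, 0);
            }
        }
        (lo, hi)
    }

    /// (1) Count the occurrences of `pattern`.
    fn count(&self, pattern: &[u8]) -> usize {
        let (lo, hi) = self.range(pattern);
        hi - lo
    }

    /// (2) Locate the start positions of `pattern`, sorted ascending.
    fn locate(&self, pattern: &[u8]) -> Vec<usize> {
        let (lo, hi) = self.range(pattern);
        let mut positions = self.sa[lo..hi].to_vec();
        positions.sort();
        positions
    }
}
```

Counting needs only the size of the final suffix array range, while locating reads the positions stored in that range. Real implementations replace the linear `rank` with sampled counts, and a k-mer lookup table can supply the range for the first k symbols of the pattern without stepping through them one by one.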
Usage
Add to dependency
To use this library, add lt_fm_index to your Cargo.toml:
[dependencies]
lt_fm_index = "0.7.0-alpha"
- About the fastbwt feature
  - This feature can accelerate indexing, but it needs cmake to build libdivsufsort and cannot be built as WASM.
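As a sketch of how such a feature is usually switched on, the dependency entry can list it explicitly. This assumes the Cargo feature is named fastbwt exactly as written above and that it is not enabled by default; check the crate's manifest before relying on it.

```toml
[dependencies]
# Assumption: the feature is called "fastbwt" and is opt-in.
lt_fm_index = { version = "0.7.0-alpha", features = ["fastbwt"] }
```

Because this feature builds libdivsufsort with cmake, leave it off for WASM targets.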
Example code
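// Import paths are assumed here; check the API doc for the exact modules.
use lt_fm_index::LtFmIndex;
use lt_fm_index::blocks::Block2; // `Block2` can index 3 types of characters.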
// (1) Define characters to use
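// The character set below (A, C, G, case-insensitive) is an assumed example.
let characters_by_index: &[&[u8]] = &[
    &[b'A', b'a'], // 'A' and 'a' are treated as the same
    &[b'C', b'c'], // 'C' and 'c' are treated as the same
    &[b'G', b'g'], // 'G' and 'g' are treated as the same
];
// Alternatively, you can use this simpler syntax:
let characters_by_index: &[&[u8]] = &[
    b"Aa", b"Cc", b"Gg"
];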
// (2) Build index
let text = b"CTCCGTACACCTGTTTCGTATCGGAXXYYZZ".to_vec();
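// The type parameters and the two numeric arguments are assumed; see the API doc.
let lt_fm_index = LtFmIndex::<u32, Block2<u128>>::build(
    text,
    characters_by_index,
    2, // suffix array sampling ratio
    4, // k-mer size of the lookup table
).unwrap();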
// (3) Match with pattern
let pattern = b"TA";
// - count
let count = lt_fm_index.count(pattern);
assert_eq!(count, 2);
// - locate
let mut locations = lt_fm_index.locate(pattern);
locations.sort(); // The locations may not be in order.
assert_eq!(locations, vec![5, 18]);
// All unindexed characters are treated as the same character.
// In the text, X, Y, and Z can match any other unindexed character.
let mut locations = lt_fm_index.locate(b"UNDEF");
locations.sort();
// Using b"XXXXX", b"YYYYY", or b"!@#$%" gives the same result.
assert_eq!(locations, vec![25, 26]);
// (4) Save and load
let mut buffer = Vec::new();
lt_fm_index.save_to(&mut buffer).unwrap();
let loaded = LtFmIndex::load_from(&buffer[..]).unwrap();
assert_eq!(lt_fm_index, loaded);
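The rule that all unindexed characters behave as one character can be pictured as a byte-to-symbol table. The sketch below is an illustration only, not the library's internals: `build_symbol_table` and `encode` are hypothetical helpers, and the choice to give the shared wildcard the index right after the last indexed group is an assumption.

```rust
/// Maps every byte to a symbol index. Bytes listed in group `i` map to `i`;
/// all other bytes share one wildcard index (assumed to be the group count).
fn build_symbol_table(characters_by_index: &[&[u8]]) -> [u8; 256] {
    let wildcard = characters_by_index.len() as u8;
    let mut table = [wildcard; 256];
    for (index, chars) in characters_by_index.iter().enumerate() {
        for &c in chars.iter() {
            table[c as usize] = index as u8;
        }
    }
    table
}

/// Encodes a text or pattern into symbol indices using the table.
fn encode(table: &[u8; 256], bytes: &[u8]) -> Vec<u8> {
    bytes.iter().map(|&b| table[b as usize]).collect()
}
```

With the groups Aa, Cc, and Gg, `encode` turns b"XXXXX", b"YYYYY", and b"!@#$%" into the same sequence of wildcard symbols, so a search sees them as identical patterns. Three indexed groups plus one wildcard give four symbols in total.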
Repository
https://github.com/baku4/lt-fm-index
API Doc
Reference
- Ferragina, P., et al. (2004). An Alphabet-Friendly FM-Index, Springer Berlin Heidelberg: 150-160.
- Anderson, T. and T. J. Wheeler (2021). An optimized FM-index library for nucleotide and amino acid search, Cold Spring Harbor Laboratory.
- Wang, Y., X. Li, D. Zang, G. Tan and N. Sun (2018). Accelerating FM-index Search for Genomic Data Processing, ACM.
- Yuta Mori. libdivsufsort.