QuadRank: a High Throughput Rank
This repo implements BiRank, QuadRank, and QuadFm, two fast rank data
structures for binary and DNA input, and a simple count-only FM-index that all use batching and prefetching of queries.
BiRank and QuadRank need only a single cache-miss per query, making them up to 2x faster than other methods in high-throughput settings. QuadFm is up to 4x faster than genedex (https://github.com/feldroop/genedex), which seems to be the fastest Rust-based FM-index currently.
NOTE: Currently only AVX2 is supported!
Usage
BiRank and QuadRank are constructed from (little-endian) packed data, given as
a &[u64] slice.
For binary input, this means the first bit is the lowest-order bit of the first
value. For QuadRank, each u64 contains 32 2-bit values.
The main data structures are BiRank and QuadRank, which implement
binary::RankerT and quad::RankerT:
Example:
let packed = ;
let rank = new;
unsafe
let dna = b"ACGCGCGACTTACGCAT";
let n = dna.len; // 17
let rank = new_ascii_dna;
unsafe
Results
BiRank
Space-time trade-off of binary rank data structures. On the x-axis is the space overhead compared to the input, and on the y-ax the ns per query. Top/middle/bottom are 1/6/12 threads, and left/middle/right are latency, throughput in a for loop, and throughput in a loop with additional prefetching.

QuadRank
Space-time trade-off of quad rank data structures.

FM-index
Here I'm mapping simulated 150bp short reads with 1% error rate (see examples/simulate_reads.rs) against a 3.1 Gbp human genome.
I first build each index on the forward data (where I don't care about time/space usage),
and then count the number of matches of each fwd/rc read.
For genedex and quad, I query batches of 32 reads at a time.
I'm using 12 threads, on my 6-core i7-10750H, fixed at 3.0 GHz.
AWRY: https://github.com/UM-Applied-Algorithms-Lab/AWRYGenedex<>: Genedex, with different choice of its rank structure.QuadFm<>: The FM-index in thefm-indexdirectory here.QuadRank*: The rank structure we introduce.qwt::RSQ{256,512}: The rank structures of https://github.com/rossanoventurini/qwt.

Replicating benchmarks
Run the synthetic BiRank and QuadRank benchmarks using:
Run the QuadFm benchmark using:
&&
where input reads can be simulated using:
&&
Each bench should take somewhere up to half an hour. Results can be converted
into plots in evals/plots/ by running evals/plot.py and evals/plot-fm.py.