quadrank 0.1.0

Fast rank over binary and size-4 DNA alphabets.

QuadRank

This repo implements BiRank and QuadRank, two fast rank data structures, and QuadFm, a simple count-only FM-index. All three use batching and prefetching of queries.
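To illustrate what a rank query with batching and prefetching can look like, here is a minimal sketch over a plain bit vector. It is not the layout or API used by BiRank: the names SimpleBitRank, rank1, prefetch, and rank1_batch are hypothetical, and the 512-bit block size is an assumption chosen only to keep the example short.

```rust
/// Hypothetical toy rank structure; NOT the crate's BiRank layout or API.
struct SimpleBitRank {
    words: Vec<u64>,
    /// block_ranks[b] = number of 1-bits before block b.
    block_ranks: Vec<u64>,
}

/// Assumed block size for this sketch: 8 words = 512 bits.
const WORDS_PER_BLOCK: usize = 8;

impl SimpleBitRank {
    fn new(words: Vec<u64>) -> Self {
        let mut block_ranks = Vec::with_capacity(words.len() / WORDS_PER_BLOCK + 1);
        let mut total = 0u64;
        for block in words.chunks(WORDS_PER_BLOCK) {
            block_ranks.push(total);
            total += block.iter().map(|w| w.count_ones() as u64).sum::<u64>();
        }
        block_ranks.push(total); // sentinel so that rank1(len) also works
        Self { words, block_ranks }
    }

    /// Number of 1-bits in positions [0, i), for i <= 64 * words.len().
    fn rank1(&self, i: usize) -> u64 {
        let word = i / 64;
        let block = word / WORDS_PER_BLOCK;
        let mut r = self.block_ranks[block];
        // Popcount the full words of this block that precede position i.
        for w in &self.words[block * WORDS_PER_BLOCK..word] {
            r += w.count_ones() as u64;
        }
        // Popcount the low `bit` bits of the word containing position i.
        let bit = i % 64;
        if bit > 0 {
            r += (self.words[word] & ((1u64 << bit) - 1)).count_ones() as u64;
        }
        r
    }

    /// Hint the CPU to load the cache line holding position i (x86_64 only).
    #[allow(unused_unsafe)]
    fn prefetch(&self, i: usize) {
        #[cfg(target_arch = "x86_64")]
        {
            use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
            // Clamp so the pointer stays inside the allocation.
            let word = (i / 64).min(self.words.len().saturating_sub(1));
            unsafe { _mm_prefetch::<_MM_HINT_T0>(self.words.as_ptr().add(word) as *const i8) };
        }
        #[cfg(not(target_arch = "x86_64"))]
        let _ = i;
    }

    /// Batched queries: issue all prefetches first, then answer each query,
    /// so that the memory latencies of the batch overlap.
    fn rank1_batch(&self, queries: &[usize], out: &mut Vec<u64>) {
        for &q in queries {
            self.prefetch(q);
        }
        out.extend(queries.iter().map(|&q| self.rank1(q)));
    }
}
```

Calling rank1_batch with a slice of positions appends one rank per position to out. The sketch prefetches only the bit data, not the block counters; a cache-friendly design would presumably keep both close together, but that is outside what this example shows.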

QuadFm is up to 4x faster than genedex (https://github.com/feldroop/genedex), which seems to be the fastest Rust-based FM-index currently.

NOTE: The code here is not really ready yet for consumption as a library:

  • It uses a lot of nightly features (such as const generics) that made development easier, but should now be stripped away. In fact, you'll have to use an old nightly, e.g. nightly-2025-11-01.
  • Only AVX2 is supported currently.
  • The API still needs cleaning up.
  • Docs still need to be written for docs.rs.

BiRank

Comparison plot, showing that BiRank variants are smaller and faster than others.

QuadRank

Comparison plot, showing that QuadRank variants are smaller and faster than others.
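For readers unfamiliar with rank over a size-4 alphabet, the following is a minimal sketch of counting one DNA symbol in 2-bit packed text. It is not QuadRank itself: the helpers pack_dna, count_in_word, and rank are hypothetical, the encoding A=0, C=1, G=2, T=3 is an assumption, and the linear scan over words stands in for the precomputed block counts a real structure would use.

```rust
/// Hypothetical helper: pack ACGT as 2 bits per base, 32 bases per u64.
/// Assumed encoding: A=0, C=1, G=2, T=3.
fn pack_dna(seq: &[u8]) -> Vec<u64> {
    let mut words = vec![0u64; seq.len().div_ceil(32)];
    for (i, &b) in seq.iter().enumerate() {
        let code = match b {
            b'A' => 0u64,
            b'C' => 1,
            b'G' => 2,
            b'T' => 3,
            _ => panic!("only A, C, G, T are supported in this sketch"),
        };
        words[i / 32] |= code << (2 * (i % 32));
    }
    words
}

/// Count occurrences of `code` among the first `n` bases (n <= 32) of one word.
fn count_in_word(word: u64, code: u64, n: usize) -> u32 {
    const LOW: u64 = 0x5555_5555_5555_5555;
    // XOR with the code repeated in every 2-bit lane: matching lanes become 00.
    let x = word ^ (code * LOW);
    // The low bit of a lane is set iff both bits of that lane are zero.
    let eq = !(x | (x >> 1)) & LOW;
    // Keep only the first n lanes.
    let mask = if n >= 32 { u64::MAX } else { (1u64 << (2 * n)) - 1 };
    (eq & mask).count_ones()
}

/// Number of occurrences of `code` in bases [0, i) of the packed text.
/// Linear scan for simplicity; a real structure would store block counts.
fn rank(words: &[u64], code: u64, i: usize) -> u64 {
    let full = i / 32;
    let mut r: u64 = words[..full]
        .iter()
        .map(|&w| count_in_word(w, code, 32) as u64)
        .sum();
    if i % 32 > 0 {
        r += count_in_word(words[full], code, i % 32) as u64;
    }
    r
}
```

For example, after packing the text ACGTACGTTT, rank(&words, 3, 10) counts the T symbols in the whole text and rank(&words, 0, 10) counts the A symbols. The mask in count_in_word also keeps the zero padding of the last word from being counted as A.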

FM-index

Here I'm mapping simulated 150 bp short reads with 1% error rate (see examples/short_reads.rs) against a 3.1 Gbp human genome. I first build each index on the forward data (where I don't care about construction time/space usage), and then count the number of matches of each read in both forward (fwd) and reverse-complement (rc) orientation. For genedex and quad, I query batches of 32 reads at a time. I'm using 12 threads on my 6-core i7-10750H, fixed at 3.0 GHz.

Comparison plot, showing that QuadFm is smaller and faster than others.

Benchmarks

This directory contains the quadrank crate, which implements BiRank, QuadRank, and their variants. Synthetic benchmarks are run using cargo run -r -F ext --example bench -- -j -b > evals/data.csv.

The fm-index directory contains QuadFm. It is evaluated by running cargo run -r -- <human-genome>.fa <reads>.fa > ../evals/fm.csv.

Plotting code can be found in evals/plot.py and evals/plot-fm.py.