Crate kmers_rs


K-mers and associated operations.

This library provides functionality for extracting k-mers from sequences and manipulating them in useful ways. Each k-mer is stored in a 64-bit integer (u64) at two bits per base, so k > 32 is not supported by this library.

K-mers (or q-grams in some computer science contexts) are k-length sequences of DNA/RNA “letters” represented as unsigned integers. Following usual practice, each base is encoded in two bits:

  • “A” -> b00
  • “C” -> b01
  • “G” -> b10
  • “T” or “U” -> b11

which has the nice property that the complementary bases are bitwise complements.
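To make the encoding concrete, here is a minimal sketch of how it could be implemented. The function names (`encode_base`, `encode_kmer`, `reverse_complement`, `kmers`) are hypothetical and are not taken from this crate's API. The sketch also assumes that the first base of a k-mer occupies the most significant bits and that any non-ACGTU character restarts the window.

```rust
/// Encode one base as two bits: A=00, C=01, G=10, T/U=11.
/// Hypothetical helper, not part of the kmers_rs API.
fn encode_base(b: u8) -> Option<u64> {
    match b {
        b'A' | b'a' => Some(0b00),
        b'C' | b'c' => Some(0b01),
        b'G' | b'g' => Some(0b10),
        b'T' | b't' | b'U' | b'u' => Some(0b11),
        _ => None,
    }
}

/// Pack up to 32 bases into a u64, first base in the highest used bits.
fn encode_kmer(seq: &str) -> Option<u64> {
    if seq.len() > 32 {
        return None; // more than 64 bits would be needed
    }
    let mut x = 0u64;
    for b in seq.bytes() {
        x = (x << 2) | encode_base(b)?;
    }
    Some(x)
}

/// Reverse complement of a k-mer of length k (1 <= k <= 32).
fn reverse_complement(x: u64, k: usize) -> u64 {
    // Because complementary bases are bitwise complements,
    // flipping every bit complements every base at once.
    let mut c = !x;
    let mut r = 0u64;
    // Pop bases off the low end of `c` and push them onto `r`,
    // which reverses their order.
    for _ in 0..k {
        r = (r << 2) | (c & 0b11);
        c >>= 2;
    }
    r
}

/// Extract every k-mer from `seq` with a rolling 2k-bit window.
/// Assumption: an unrecognised character resets the window.
fn kmers(seq: &str, k: usize) -> Vec<u64> {
    assert!(k >= 1 && k <= 32);
    let mask = if k == 32 { u64::MAX } else { (1u64 << (2 * k)) - 1 };
    let mut out = Vec::new();
    let mut x = 0u64;
    let mut valid = 0usize; // number of consecutive valid bases seen
    for b in seq.bytes() {
        match encode_base(b) {
            Some(v) => {
                x = ((x << 2) | v) & mask;
                valid += 1;
                if valid >= k {
                    out.push(x);
                }
            }
            None => {
                x = 0;
                valid = 0;
            }
        }
    }
    out
}
```

With this layout, "ACGT" packs to 0b00011011, and since ACGT is its own reverse complement, `reverse_complement` returns the same value for it.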

Structs

Functions

  • Take a sorted iterator and yield k-mer frequencies.
  • Compute the dot product between two k-mer frequency spectra.
  • Construct a k-mer frequency iterator from an iterator over frequencies.
  • Compute the Jaccard coefficient from two sets of k-mers.
  • Merge two iterators.
  • An adaptor that converts an iterator over k-mers into a k-mer frequency iterator.
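
The operations listed above can be illustrated with a short, self-contained sketch. This is not the crate's implementation: the names `frequencies`, `jaccard` and `dot` and their signatures are hypothetical, and the sketch works on vectors and slices rather than the crate's iterator types. It assumes that inputs are sorted by k-mer, that the sets passed to `jaccard` contain no duplicates, and that two empty sets have a coefficient of 0.

```rust
use std::cmp::Ordering;

/// Collapse a sorted stream of k-mers into (k-mer, count) pairs.
fn frequencies<I: IntoIterator<Item = u64>>(sorted: I) -> Vec<(u64, usize)> {
    let mut out: Vec<(u64, usize)> = Vec::new();
    for x in sorted {
        // Equal k-mers are adjacent in sorted input, so only the
        // most recent entry needs to be checked.
        if let Some(last) = out.last_mut() {
            if last.0 == x {
                last.1 += 1;
                continue;
            }
        }
        out.push((x, 1));
    }
    out
}

/// Jaccard coefficient |A ∩ B| / |A ∪ B| of two sorted, duplicate-free sets.
fn jaccard(a: &[u64], b: &[u64]) -> f64 {
    let (mut i, mut j, mut inter) = (0usize, 0usize, 0usize);
    // Walk both sets in step, as in a merge, counting shared k-mers.
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                inter += 1;
                i += 1;
                j += 1;
            }
        }
    }
    let union = a.len() + b.len() - inter;
    if union == 0 {
        0.0 // assumption: treat two empty sets as having coefficient 0
    } else {
        inter as f64 / union as f64
    }
}

/// Dot product of two frequency spectra, each sorted by k-mer.
fn dot(a: &[(u64, usize)], b: &[(u64, usize)]) -> f64 {
    let (mut i, mut j) = (0usize, 0usize);
    let mut sum = 0.0f64;
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                // Only k-mers present in both spectra contribute.
                sum += a[i].1 as f64 * b[j].1 as f64;
                i += 1;
                j += 1;
            }
        }
    }
    sum
}
```

All three functions rely on the same idea: because the inputs are sorted, a single linear pass is enough and no hash table is needed.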