K-mers and associated operations.
This library provides functionality for extracting k-mers from sequences
and manipulating them in useful ways. The underlying representation is
64-bit integers (u64), so k > 32 is not supported by this library.
K-mers (or q-grams in some computer science contexts) are k-length sequences of DNA/RNA “letters” represented as unsigned integers. Following usual practice,
- “A” -> b00
- “C” -> b01
- “G” -> b10
- “T” or “U” -> b11
which has the nice property that the complementary bases are bitwise complements.
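The encoding above can be sketched in a few lines of Rust. This is a minimal illustration of the 2-bit packing and of how the complement property gives a cheap reverse complement; the function names `encode_base`, `encode_kmer` and `reverse_complement` are hypothetical and are not taken from this library's API.

```rust
/// Encode one nucleotide as 2 bits: A=00, C=01, G=10, T/U=11.
/// (Hypothetical helper, not part of the library.)
fn encode_base(b: u8) -> Option<u64> {
    match b.to_ascii_uppercase() {
        b'A' => Some(0b00),
        b'C' => Some(0b01),
        b'G' => Some(0b10),
        b'T' | b'U' => Some(0b11),
        _ => None, // ambiguity codes such as N have no 2-bit encoding
    }
}

/// Pack up to 32 bases into a u64, with the first base in the
/// most significant of the 2k bits used.
fn encode_kmer(seq: &[u8]) -> Option<u64> {
    if seq.len() > 32 {
        return None; // 2 bits per base: a u64 holds at most 32 bases
    }
    let mut x = 0u64;
    for &b in seq {
        x = (x << 2) | encode_base(b)?;
    }
    Some(x)
}

/// Reverse complement of a k-mer of length k (1 <= k <= 32).
fn reverse_complement(x: u64, k: usize) -> u64 {
    assert!((1..=32).contains(&k));
    // Complementary bases are bitwise complements, so NOT complements every base.
    let mut c = !x;
    // Reverse the order of the k two-bit groups.
    let mut r = 0u64;
    for _ in 0..k {
        r = (r << 2) | (c & 0b11);
        c >>= 2;
    }
    r
}
```

For example, under this sketch `encode_kmer(b"AAC")` yields `0b000001`, and its reverse complement for k = 3 is `0b101111`, which is the encoding of "GTT".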
Structs
- BigSparseAccumulator - A sparse k-mer accumulator for large sets of k-mers.
- CanonicalHash - Canonicalisation based on hashing.
- CanonicalLex - A simple canonicalisation based on lexicographic ordering.
- CompressedKmerFrequencyList - A compressed representation for a sorted list of k-mer and count pairs.
- CompressedKmerFrequencyListIterator - Iterate over a compressed k-mer list.
- CompressedKmerList - A compressed representation for a sorted list of k-mers.
- CompressedKmerListIterator - Iterate over a compressed k-mer list.
- CountingIterator - Take a sorted iterator and yield k-mer frequencies.
- DenseAccumulator - Accumulate k-mer frequencies for small to medium k.
- IteratorReadAdaptor - Convert an iterator over bytes to a Read.
- Kmer - A k-length nucleotide sequence represented as a 64-bit integer.
- KmerFrequencyIterator - An iterator producing k-mer frequencies.
- KmerIterator - An iterator over the k-mers drawn from a sequence.
- MergeIterator - Merge two k-mer frequency iterators.
- SimplePosIndex - A simple position index for k-mers.
- SimpleSeqPosIndex - A simple index allowing multiple sequences to be distinguished in the index.
- SimpleSparseAccumulator - A simple sparse accumulator.
- Unique - An iterator producing k-mer frequencies.
Traits
- Canonical - A trait for capturing k-mer canonicalisation.
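As an illustration of what lexicographic canonicalisation commonly means, the sketch below picks the smaller of a k-mer and its reverse complement. Because the 2-bit encoding orders A < C < G < T, integer comparison of two k-mers of the same length agrees with lexicographic comparison of their sequences. Whether CanonicalLex does exactly this is an assumption; `canonical_lex` and `revcomp` are hypothetical names, not the library's API.

```rust
/// Reverse complement of a 2-bit-encoded k-mer (1 <= k <= 32),
/// using whole-word bit operations instead of a per-base loop.
/// (Hypothetical helper, not part of the library.)
fn revcomp(x: u64, k: u32) -> u64 {
    assert!((1..=32).contains(&k));
    // Complement every base at once.
    let mut y = !x;
    // Swap adjacent 2-bit groups within each nibble.
    y = ((y >> 2) & 0x3333_3333_3333_3333) | ((y & 0x3333_3333_3333_3333) << 2);
    // Swap the nibbles within each byte.
    y = ((y >> 4) & 0x0F0F_0F0F_0F0F_0F0F) | ((y & 0x0F0F_0F0F_0F0F_0F0F) << 4);
    // Reverse the bytes; all 32 two-bit groups are now in reverse order.
    y = y.swap_bytes();
    // The k groups of interest sit at the top of the word; move them down.
    y >> (64 - 2 * k)
}

/// Canonical form under lexicographic ordering: the smaller of the
/// k-mer and its reverse complement.
fn canonical_lex(x: u64, k: u32) -> u64 {
    x.min(revcomp(x, k))
}
```

With this definition a k-mer and its reverse complement map to the same value, so both strands of a sequence contribute to the same count.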
Functions
- counting_kmer_frequency_iterator - Take a sorted iterator and yield k-mer frequencies.
- dot - Compute the dot product between two k-mer frequency spectra.
- frequency_vector_iter - Construct a k-mer frequency iterator from an iterator over frequencies.
- jaccard - Compute the Jaccard coefficient from two sets of k-mers.
- merge - Merge two iterators.
- unique - An adaptor that converts an iterator over k-mers into a k-mer frequency iterator.
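To make the Jaccard coefficient and dot product concrete, here is a minimal sketch over sorted slices. It assumes the inputs are sorted by k-mer with no duplicates, and it does not reproduce the signatures of the library's `jaccard` or `dot`, which may take iterators or other types; `jaccard_sorted` and `dot_sorted` are hypothetical names.

```rust
use std::cmp::Ordering;

/// Jaccard coefficient |A ∩ B| / |A ∪ B| of two sorted, duplicate-free
/// k-mer lists. Returns 0.0 when both lists are empty (a choice made
/// for this sketch).
fn jaccard_sorted(a: &[u64], b: &[u64]) -> f64 {
    let (mut i, mut j) = (0, 0);
    let (mut inter, mut union) = (0usize, 0usize);
    // Walk both lists in step, as in the merge phase of merge sort.
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                inter += 1;
                i += 1;
                j += 1;
            }
        }
        union += 1; // each step consumes exactly one distinct k-mer
    }
    // Whatever remains in either list belongs only to the union.
    union += (a.len() - i) + (b.len() - j);
    if union == 0 {
        0.0
    } else {
        inter as f64 / union as f64
    }
}

/// Dot product of two k-mer frequency spectra given as sorted
/// (k-mer, count) pairs; k-mers absent from a spectrum count as zero.
fn dot_sorted(a: &[(u64, u64)], b: &[(u64, u64)]) -> u64 {
    let (mut i, mut j) = (0, 0);
    let mut sum = 0u64;
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                sum += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    sum
}
```

Both functions make a single pass over their inputs, which is why sorted k-mer lists are a convenient representation for these comparisons.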