ntHash is a hash function tuned for genomic data. It performs best when calculating hash values for adjacent k-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases.
Scientific article with more details
Original implementation in C++
This crate is based on ntHash 1.0.4.
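The speed-up on adjacent k-mers comes from the rolling property: once one k-mer has been hashed, the hash of the next one can be derived from it in constant time. The following is a minimal, self-contained sketch of that idea, not this crate's code. The function names (`seed`, `hash_kmer`, `rolling_hashes`) are hypothetical, and the seed constants are illustrative placeholders rather than the values ntHash actually uses.

```rust
/// Illustrative per-base seeds. These are placeholder values chosen for this
/// sketch and are NOT the constants used by ntHash or by this crate.
fn seed(base: u8) -> u64 {
    match base {
        b'A' => 0x9E37_79B9_7F4A_7C15,
        b'C' => 0xC2B2_AE3D_27D4_EB4F,
        b'G' => 0x1656_67B1_9E37_79F9,
        b'T' => 0x27D4_EB2F_1656_67C5,
        _ => 0,
    }
}

/// Hash a single k-mer from scratch: XOR each base's seed, rotated left by
/// the base's distance from the end of the k-mer.
fn hash_kmer(kmer: &[u8]) -> u64 {
    let k = kmer.len() as u32;
    kmer.iter()
        .enumerate()
        .fold(0u64, |h, (i, &b)| h ^ seed(b).rotate_left(k - 1 - i as u32))
}

/// Hash every k-mer of `seq`, deriving each hash from the previous one.
fn rolling_hashes(seq: &[u8], k: usize) -> Vec<u64> {
    if k == 0 || k > seq.len() {
        return Vec::new();
    }
    let mut h = hash_kmer(&seq[..k]);
    let mut out = vec![h];
    for i in k..seq.len() {
        // Rotate the old hash, cancel the base leaving the window,
        // and mix in the base entering it.
        h = h.rotate_left(1) ^ seed(seq[i - k]).rotate_left(k as u32) ^ seed(seq[i]);
        out.push(h);
    }
    out
}
```

Each update costs one rotation and two XORs regardless of k, whereas recomputing with `hash_kmer` costs work proportional to k for every position.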
Structs§
- NtHashForwardIterator
An efficient iterator for calculating hashes for genomic sequences. This returns the forward hashes, not the canonical hashes.
- NtHashIterator
An efficient iterator for calculating hashes for genomic sequences.
Enums§
Functions§
- ntc64
- Calculate the canonical hash (minimum hash value between the forward and reverse strands in a sequence).
- ntf64
- Calculate the hash for a k-mer in the forward strand of a sequence.
- nthash
- Takes a sequence and ksize and returns the canonical hashes for each k-mer in a Vec. This doesn’t benefit from the rolling hash properties of ntHash; it serves mainly as a correctness check for NtHashIterator.
- ntr64
- Calculate the hash for a k-mer in the reverse strand of a sequence.
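The relationship between forward, reverse, and canonical hashes described above can be sketched as follows. This is an illustration under stated assumptions, not this crate's implementation: the names (`forward_hash`, `reverse_hash`, `canonical_hash`, `canonical_hashes`) are hypothetical, the seed constants are placeholders rather than ntHash's real values, and every k-mer is hashed from scratch, so the sketch does not use the rolling update.

```rust
/// Illustrative per-base seeds. Placeholder values for this sketch only;
/// NOT the constants used by ntHash or by this crate.
fn seed(base: u8) -> u64 {
    match base {
        b'A' => 0x9E37_79B9_7F4A_7C15,
        b'C' => 0xC2B2_AE3D_27D4_EB4F,
        b'G' => 0x1656_67B1_9E37_79F9,
        b'T' => 0x27D4_EB2F_1656_67C5,
        _ => 0,
    }
}

/// Watson-Crick complement; any other byte is returned unchanged.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        other => other,
    }
}

/// Hash of the k-mer as read on the forward strand.
fn forward_hash(kmer: &[u8]) -> u64 {
    let k = kmer.len() as u32;
    kmer.iter()
        .enumerate()
        .fold(0u64, |h, (i, &b)| h ^ seed(b).rotate_left(k - 1 - i as u32))
}

/// Hash of the k-mer as read on the reverse strand, i.e. the forward hash
/// of its reverse complement.
fn reverse_hash(kmer: &[u8]) -> u64 {
    let rc: Vec<u8> = kmer.iter().rev().map(|&b| complement(b)).collect();
    forward_hash(&rc)
}

/// Canonical hash: the smaller of the forward and reverse hashes.
fn canonical_hash(kmer: &[u8]) -> u64 {
    forward_hash(kmer).min(reverse_hash(kmer))
}

/// Canonical hash of every k-mer in `seq`, each computed from scratch.
fn canonical_hashes(seq: &[u8], k: usize) -> Vec<u64> {
    if k == 0 || k > seq.len() {
        return Vec::new();
    }
    seq.windows(k).map(canonical_hash).collect()
}
```

Because the minimum is taken over both strands, a k-mer and its reverse complement receive the same canonical hash, which is why canonical hashes are the usual choice when the strand of origin is unknown.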