ntHash is a hash function tuned for genomic data. It performs best when hashing adjacent k-mers in an input sequence, where it runs an order of magnitude faster than the best-performing alternatives in typical use cases.

Scientific article with more details

Original implementation in C++

This crate is based on ntHash 1.0.4.
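
The advantage on adjacent k-mers comes from rolling: each hash is derived from the previous one in constant time instead of being recomputed from all k bases. The following is a minimal sketch of that idea, assuming the rotate-and-XOR recurrence described in the ntHash article. The seed values and the names `seed`, `hash_kmer`, `roll`, and `rolling_hashes` are hypothetical and are not this crate's API, so the resulting values will not match the crate's output.

```rust
/// Hypothetical per-base seeds, chosen for illustration only.
/// These are NOT the constants used by ntHash.
fn seed(base: u8) -> u64 {
    match base {
        b'A' => 0x0123_4567_89ab_cdef,
        b'C' => 0xfedc_ba98_7654_3210,
        b'G' => 0x0f1e_2d3c_4b5a_6978,
        b'T' => 0x8796_a5b4_c3d2_e1f0,
        _ => 0,
    }
}

/// Hash one k-mer from scratch: XOR the seed of every base,
/// rotated left by its distance from the end of the k-mer.
fn hash_kmer(kmer: &[u8]) -> u64 {
    let k = kmer.len() as u32;
    kmer.iter()
        .enumerate()
        .fold(0u64, |h, (i, &b)| h ^ seed(b).rotate_left(k - 1 - i as u32))
}

/// Derive the hash of the next k-mer from the previous one in O(1):
/// rotate the old hash, cancel the outgoing base, add the incoming base.
fn roll(prev: u64, outgoing: u8, incoming: u8, k: u32) -> u64 {
    prev.rotate_left(1) ^ seed(outgoing).rotate_left(k) ^ seed(incoming)
}

/// Hash every k-mer of `seq`, computing only the first one from scratch.
fn rolling_hashes(seq: &[u8], k: usize) -> Vec<u64> {
    if k == 0 || k > seq.len() {
        return Vec::new();
    }
    let mut hashes = Vec::with_capacity(seq.len() - k + 1);
    let mut h = hash_kmer(&seq[..k]);
    hashes.push(h);
    for i in k..seq.len() {
        h = roll(h, seq[i - k], seq[i], k as u32);
        hashes.push(h);
    }
    hashes
}
```

Calling `rolling_hashes(b"ACGTACGGTCA", 4)` yields one value per k-mer, each equal to what `hash_kmer` returns for the same window; only the first window costs O(k).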

Structs

An efficient iterator for calculating hashes for genomic sequences. This returns the forward hashes, not the canonical hashes.

An efficient iterator for calculating hashes for genomic sequences.

Enums

Functions

Calculate the canonical hash of a k-mer: the smaller of its forward-strand and reverse-strand hash values.

Calculate the hash for a k-mer in the forward strand of a sequence.

Takes a sequence and a k-mer size (ksize) and returns the canonical hash of each k-mer in a Vec. It does not use the rolling-hash property of ntHash, so it serves mainly as a correctness check for the NtHashIterator.

Calculate the hash for a k-mer in the reverse strand of a sequence.
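
The relationship between the forward, reverse, and canonical hashes described above can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: the seeds are hypothetical, the function names `forward_hash`, `reverse_hash`, and `canonical_hash` are made up for this example, and the reverse-strand hash is computed here simply as the forward hash of the reverse complement.

```rust
/// Hypothetical per-base seeds, chosen for illustration only.
/// These are NOT the constants used by ntHash.
fn seed(base: u8) -> u64 {
    match base {
        b'A' => 0x0123_4567_89ab_cdef,
        b'C' => 0xfedc_ba98_7654_3210,
        b'G' => 0x0f1e_2d3c_4b5a_6978,
        b'T' => 0x8796_a5b4_c3d2_e1f0,
        _ => 0,
    }
}

/// Watson-Crick complement; bytes other than A, C, G, T are left unchanged.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'T' => b'A',
        b'C' => b'G',
        b'G' => b'C',
        other => other,
    }
}

/// Hash of the k-mer as read on the forward strand.
fn forward_hash(kmer: &[u8]) -> u64 {
    let k = kmer.len() as u32;
    kmer.iter()
        .enumerate()
        .fold(0u64, |h, (i, &b)| h ^ seed(b).rotate_left(k - 1 - i as u32))
}

/// Hash of the same k-mer as read on the reverse strand,
/// i.e. the forward hash of its reverse complement.
fn reverse_hash(kmer: &[u8]) -> u64 {
    let revcomp: Vec<u8> = kmer.iter().rev().map(|&b| complement(b)).collect();
    forward_hash(&revcomp)
}

/// Canonical hash: the smaller of the two strand hashes, so a k-mer
/// and its reverse complement always map to the same value.
fn canonical_hash(kmer: &[u8]) -> u64 {
    forward_hash(kmer).min(reverse_hash(kmer))
}
```

Because the minimum is symmetric, `canonical_hash(b"AAAC")` and `canonical_hash(b"GTTT")` agree, which is why canonical hashes are useful when the strand of a read is unknown.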

Type Definitions