Module probminhash::superminhasher

An implementation of SuperMinHash, from:
A New Minwise Hashing Algorithm for Jaccard Similarity Estimation.
Otmar Ertl (2017-2018) https://arxiv.org/abs/1706.05698

Hash values can be computed before calling the SuperMinHash methods, in which case the structure only computes the permutation described in the paper. Alternatively, hashing can be delegated to the sketching method.
In the first case, the build_hasher should be parametrized by NoHashHasher (as in the finch module).
In the second case, Fnv (fast when hashing small values such as integers, according to its documentation) or fxhash can be used.
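To illustrate the first case, here is a minimal sketch of a pass-through hasher built only on the standard library. It is not the crate's NoHashHasher: the names PassThroughHasher, PassThroughBuild and hash_with are hypothetical, and the fallback behaviour of write for non-u64 input is an assumption made for this example.

```rust
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hypothetical pass-through hasher: it stores the u64 it is given
/// and returns it unchanged. Illustration only, not the crate's type.
#[derive(Default)]
pub struct PassThroughHasher(u64);

impl Hasher for PassThroughHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // `u64::hash` calls `write_u64`, so pre-hashed values land here.
    fn write_u64(&mut self, value: u64) {
        self.0 = value;
    }

    // Assumed fallback for other inputs: keep the first 8 bytes
    // (little-endian), zero-padded if the input is shorter.
    fn write(&mut self, bytes: &[u8]) {
        let mut buf = [0u8; 8];
        let n = bytes.len().min(8);
        buf[..n].copy_from_slice(&bytes[..n]);
        self.0 = u64::from_le_bytes(buf);
    }
}

/// Builder type that a `build_hasher` parameter could be given.
pub type PassThroughBuild = BuildHasherDefault<PassThroughHasher>;

/// Hashes `value` with a fresh hasher obtained from `build_hasher`.
pub fn hash_with<T: Hash, B: BuildHasher>(build_hasher: &B, value: &T) -> u64 {
    let mut hasher = build_hasher.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}
```

With such a hasher, hash_with(&PassThroughBuild::default(), &42u64) returns 42, so values hashed upstream reach the sketching code unchanged.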

Structs

NoHashHasher

This type is used to store already hashed data and implements a hasher that does nothing: it just stores the data (a u64) inside itself.

SuperMinHash

An implementation of SuperMinHash, from: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, Otmar Ertl (2017-2018), arXiv https://arxiv.org/abs/1706.05698

Functions

get_jaccard_index_estimate

Returns an estimate of the Jaccard index between two sketches coming from the same SuperMinHash struct (using reinit between them, for example) or from two SuperMinHash structs initialized with the same parameters.
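In the paper, the estimator is the fraction of sketch components that are equal in both signatures. The sketch below shows that computation on its own. It assumes sketches are slices of f64 of equal length, and the function name jaccard_estimate and its Option return type are hypothetical, so the crate's actual signature may differ.

```rust
/// Hypothetical helper: estimates the Jaccard index as the fraction of
/// positions at which two equally sized sketches hold the same value.
/// Returns `None` if the sketches are empty or differ in length.
pub fn jaccard_estimate(sketch_a: &[f64], sketch_b: &[f64]) -> Option<f64> {
    if sketch_a.is_empty() || sketch_a.len() != sketch_b.len() {
        return None;
    }
    // Exact equality is intended here: matching components come from the
    // same deterministic computation on a shared element.
    let matches = sketch_a
        .iter()
        .zip(sketch_b)
        .filter(|(a, b)| a == b)
        .count();
    Some(matches as f64 / sketch_a.len() as f64)
}
```

For example, two sketches of length 4 that agree in 2 positions give an estimate of 0.5. Sketches built with different sizes or hashers are not comparable, which is why the parameters must match.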