Module probminhash::superminhasher
An implementation of SuperMinHash from:
"A new minwise Hashing Algorithm for Jaccard Similarity Estimation",
Otmar Ertl (2017-2018), https://arxiv.org/abs/1706.05698
Hashing can be handled in one of two ways. The hash values can be computed
before calling the SuperMinHash methods, so that the structure only computes
the permutations described in the paper, or hashing can be delegated to the
sketching method.
In the first case, the build_hasher should be parametrized by NoHashHasher
(as in the finch module).
In the second case, Fnv (fast when hashing small values such as integers, according to its documentation)
or fxhash can be used.
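To make the first case concrete, the following is a minimal sketch of what a "do nothing" hasher can look like, using only the standard library. `PassThroughHasher` and `pass_through` are hypothetical names chosen for illustration; they are not this module's NoHashHasher, whose exact definition should be taken from the crate itself. The idea shown is only that a value which is already a u64 hash comes out of the hasher unchanged.

```rust
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hypothetical stand-in for a "no-op" hasher (not the crate's NoHashHasher):
/// it stores the u64 it receives and returns it unchanged from `finish`.
#[derive(Default, Clone, Copy)]
pub struct PassThroughHasher(u64);

impl Hasher for PassThroughHasher {
    /// Fallback for raw byte input: keep at most the first 8 bytes,
    /// read as a little-endian u64. No mixing is performed.
    fn write(&mut self, bytes: &[u8]) {
        let mut buf = [0u8; 8];
        let n = bytes.len().min(8);
        buf[..n].copy_from_slice(&bytes[..n]);
        self.0 = u64::from_le_bytes(buf);
    }

    /// `u64::hash` calls this method, so a pre-hashed u64 is stored as is.
    fn write_u64(&mut self, value: u64) {
        self.0 = value;
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Runs a pre-hashed value through the pass-through hasher via a
/// `BuildHasherDefault`, the usual way to turn a `Hasher` into a `BuildHasher`.
pub fn pass_through(prehashed: u64) -> u64 {
    let build_hasher = BuildHasherDefault::<PassThroughHasher>::default();
    let mut hasher = build_hasher.build_hasher();
    prehashed.hash(&mut hasher);
    hasher.finish()
}
```

With a hasher of this kind, the sketching structure receives exactly the hash values computed upstream. With Fnv or fxhash, the same call path would instead compute a real hash of the raw item.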
Structs
NoHashHasher | Stores already hashed data and implements a hasher that does nothing: it just keeps the data (u64) inside itself. |
SuperMinHash | An implementation of SuperMinHash, from "A new minwise Hashing Algorithm for Jaccard Similarity Estimation", Otmar Ertl (2017-2018), arXiv https://arxiv.org/abs/1706.05698 |
Functions
get_jaccard_index_estimate | Returns an estimate of the Jaccard index between two sketches coming from the same SuperMinHash struct (using reinit, for example) or from two SuperMinHash structs initialized with the same parameters. |
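
As an illustration of what such an estimator computes, here is a minimal sketch of the standard minwise estimate: the fraction of signature slots on which the two sketches agree. The function name `jaccard_estimate`, the `&[f64]` signature type, and the `Option` return are assumptions made for this example and may differ from the actual signature of get_jaccard_index_estimate. Both sketches must have the same length and be built with the same parameters for the comparison to be meaningful.

```rust
/// Hypothetical helper (not the crate's function): estimates the Jaccard
/// index as the fraction of slots where the two signatures are equal.
/// Returns `None` when the sketches are empty or have different lengths.
pub fn jaccard_estimate(sketch_a: &[f64], sketch_b: &[f64]) -> Option<f64> {
    if sketch_a.is_empty() || sketch_a.len() != sketch_b.len() {
        return None;
    }
    // Count slots holding exactly the same value in both signatures.
    let matches = sketch_a
        .iter()
        .zip(sketch_b)
        .filter(|(a, b)| a == b)
        .count();
    Some(matches as f64 / sketch_a.len() as f64)
}
```

Exact equality is the intended comparison here: slots agree only when both sketches derived the value from the same element with the same hash and permutation, which is why the two sketches need identical initialization.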