Modules
- densminhash
- Implementation of densification algorithms on top of One Permutation Hashing.
They provide locality-sensitive sketching of unweighted data in one pass.
- exp01
- Sampling from an exponential law restricted to the domain [0,1)
- fyshuffle
- Fisher-Yates permutation generator
- invhash
- This module provides invertible hashes in 32-bit and 64-bit versions.
- jaccard
- Jaccard distance
- nohasher
- This provides a struct implementing the Hasher trait for u64 hashed values that does nothing. It is useful, for example, in counting structures that manipulate already hashed values (taken from the finch crate).
- probminhasher
- Implementation of ProbMinHash2, ProbMinHash3 and ProbMinHash3a as described in O. Ertl,
https://arxiv.org/abs/1911.00675.
- setsketcher
- Implementation of the paper:
SetSketch: filling the gap between MinHash and HyperLogLog.
See https://arxiv.org/abs/2101.00314 or https://vldb.org/pvldb/vol14/p2244-ertl.pdf.
- superminhasher
- An implementation of SuperMinHash from:
A new minwise Hashing Algorithm for Jaccard Similarity Estimation.
Otmar Ertl (2017-2018) https://arxiv.org/abs/1706.05698.
- superminhasher2
- An implementation of SuperMinHash2 from:
A new minwise Hashing Algorithm for Jaccard Similarity Estimation.
Otmar Ertl (2017-2018) https://arxiv.org/abs/1706.05698.
This version corresponds to the second implementation of Ertl as given in probminhash. It is as fast as the first version in super::superminhasher::SuperMinHash and returns the sketch as u32 or u64, which is better suited to the Hamming distance.
- weightedset
- Trait for weighted sets
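
The nohasher entry above describes a Hasher that does nothing because its u64 keys are already hash values. A minimal sketch of that idea follows, using only the standard library. The names `PassThroughHasher`, `PassThroughBuild` and `count_hashed` are hypothetical and chosen for illustration; they are not the API of the nohasher module, whose actual types may differ.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical pass-through hasher (illustration only, not the crate's type).
/// It assumes every key is a u64 that is already a well-mixed hash value.
#[derive(Default, Clone, Copy)]
struct PassThroughHasher(u64);

impl Hasher for PassThroughHasher {
    /// Return the stored value unchanged: no extra hashing is done.
    fn finish(&self) -> u64 {
        self.0
    }

    /// Arbitrary byte input is not supported by this sketch.
    fn write(&mut self, _bytes: &[u8]) {
        panic!("PassThroughHasher only supports u64 keys");
    }

    /// u64's Hash implementation calls write_u64, so the key is stored as is.
    fn write_u64(&mut self, value: u64) {
        self.0 = value;
    }
}

/// BuildHasher that creates a fresh PassThroughHasher for each lookup.
type PassThroughBuild = BuildHasherDefault<PassThroughHasher>;

/// Count occurrences of already hashed values without rehashing them.
fn count_hashed(values: &[u64]) -> HashMap<u64, usize, PassThroughBuild> {
    let mut counts: HashMap<u64, usize, PassThroughBuild> = HashMap::default();
    for &v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    counts
}
```

Calling `count_hashed(&[7, 42, 7, 7])` yields a map in which key 7 has count 3 and key 42 has count 1. Skipping the second hashing pass only makes sense when the keys really are uniformly distributed hash values; with small or structured integers the table would suffer from clustering.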