Expand description
UMASH is an almost-universal family of hash functions. Each
Params struct defines a specific hash function; when the
parameters are generated pseudorandomly, the probability that two
different inputs of up to s bytes collide (for an independently
generated set of parameters) is at most ceil(s / 4096) 2^-55 for
the 64-bit hash. The 128-bit fingerprint reduces that probability
to less than 2^-70 for inputs of 1 GB or less.
See the reference repo for more details and proofs.
Structs§
- Fingerprint
- A 128-bit
Fingerprintis constructed by UMASH as if we had computed theUmashComponent::HashandUmashComponent::Secondaryfunctions, and stored them in theFingerprint::hasharray in order. - Fingerprinter
- A
Fingerprinterimplements the 128-bit fingerprinting function defined by a specificParamsstruct, further tweaked by a seed. ConstructFingerprinters withParams::fingerprinter. - Hasher
- A
Hasherimplements one of the two hash 64-bit functions defined by a specificParamsstruct, further tweaked by a seed. ConstructHashers withParams::hasher,Params::secondary_hasher, orParams::component_hasher. - Params
- A
Paramsstores a set of hashing parameters that define a specific UMASH function.
Enums§
- Umash
Component - A given
Paramsstruct defines a pair of 64-bit hash functions. TheUmashComponent::Hashis the primary hash value; we find a 128-bit fingerprint by combining that primary value with theUmashComponent::Secondaryhash (in practice, it’s more efficient to compute both hash values concurrently, when one knows they want a fingerprint).