umash

Struct Params

pub struct Params(/* private fields */);

A Params stores a set of hashing parameters that define a specific UMASH function.

By default, each Params is generated independently with unique pseudorandom parameters. Call Params::derive to generate repeatable parameters that will compute the same hash and fingerprint values across processes, programs, architectures, and UMASH versions.

While the derivation algorithm and the UMASH hash functions are fully defined independently of the platform, the std::hash::Hash functions may not be. When computing hash or fingerprint values that will be persisted or compared in different processes (or programs, or machines), it is safest to directly pass &[u8] bytes to Hasher::write or Fingerprinter::write or to their std::io::Write implementation.
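The platform dependence mentioned above can be observed with the standard library alone. The sketch below does not call UMASH; `ByteRecorder`, `bytes_fed_by_hash`, and `stable_encoding` are hypothetical helpers introduced only for illustration. It records the bytes that `Hash` feeds for a `usize`, whose width and byte order follow the target, and contrasts them with an explicit little-endian encoding of the kind that could be passed as `&[u8]` to `Hasher::write`.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical helper: a `std::hash::Hasher` that only records the bytes it is fed.
#[derive(Default)]
struct ByteRecorder(Vec<u8>);

impl Hasher for ByteRecorder {
    fn write(&mut self, bytes: &[u8]) {
        self.0.extend_from_slice(bytes);
    }

    // The recorder never produces a real hash value.
    fn finish(&self) -> u64 {
        0
    }
}

/// Returns the byte stream that `value`'s `Hash` implementation produces.
fn bytes_fed_by_hash<T: Hash>(value: &T) -> Vec<u8> {
    let mut recorder = ByteRecorder::default();
    value.hash(&mut recorder);
    recorder.0
}

/// Hypothetical platform-independent encoding: fixed-width little-endian
/// integers, with an explicit length prefix before the string bytes.
fn stable_encoding(id: u64, name: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(16 + name.len());
    out.extend_from_slice(&id.to_le_bytes());
    out.extend_from_slice(&(name.len() as u64).to_le_bytes());
    out.extend_from_slice(name.as_bytes());
    out
}

fn main() {
    // The number of bytes fed for a `usize` depends on the target's pointer width.
    println!("usize feeds {} bytes", bytes_fed_by_hash(&1usize).len());
    // The explicit encoding has the same bytes on every target.
    println!("stable bytes: {:?}", stable_encoding(7, "abc"));
}
```

Bytes produced by something like `stable_encoding` are the same on every target, so hashing them directly avoids relying on how `std::hash::Hash` serializes a value.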

This struct consists of 38 u64 parameters, so although Params do not own any resource, they should be passed by reference rather than copied, as much as possible.
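As a rough illustration of the cost of copying, the sketch below uses a hypothetical stand-in type, `ParamsLike`, holding 38 `u64` words. The real `Params` layout is private, so this only mirrors the word count stated above.

```rust
use std::mem::size_of;

/// Hypothetical stand-in with the same number of `u64` words as described above.
struct ParamsLike([u64; 38]);

/// Borrows the parameters: only a pointer crosses the call boundary.
fn first_word(params: &ParamsLike) -> u64 {
    params.0[0]
}

fn main() {
    // 38 words of 8 bytes each.
    println!("by value: {} bytes", size_of::<ParamsLike>());
    println!("by reference: {} bytes", size_of::<&ParamsLike>());

    let params = ParamsLike([1; 38]);
    println!("first word: {}", first_word(&params));
}
```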

References to Params implement std::hash::BuildHasher: pass a &'static Params to, e.g., std::collections::HashMap::with_hasher, and the hash values will be computed as the primary UMASH value for these Params, with seed = 0.
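The pattern of implementing `BuildHasher` for a reference, so that a `&'static` value can be handed to `HashMap::with_hasher`, can be sketched with the standard library alone. `ToyParams` and `ToyHasher` below are hypothetical stand-ins, and the mixing function is a simple multiplicative toy, not UMASH; only the trait plumbing is the point.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical stand-in for a parameter set (NOT the UMASH parameters).
struct ToyParams {
    key: u64,
}

/// Hypothetical hasher that borrows its parameters, so it cannot outlive them.
struct ToyHasher<'p> {
    params: &'p ToyParams,
    state: u64,
}

impl Hasher for ToyHasher<'_> {
    fn write(&mut self, bytes: &[u8]) {
        // Toy xor-and-multiply mixing, for illustration only.
        for &b in bytes {
            self.state = (self.state ^ u64::from(b)).wrapping_mul(self.params.key | 1);
        }
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// The builder is the reference itself, mirroring `impl BuildHasher for &Params`.
impl<'p> BuildHasher for &'p ToyParams {
    type Hasher = ToyHasher<'p>;

    fn build_hasher(&self) -> ToyHasher<'p> {
        ToyHasher { params: *self, state: self.key }
    }
}

/// A `static` gives the `&'static` reference the collection needs.
static PARAMS: ToyParams = ToyParams { key: 0x9E37_79B9_7F4A_7C15 };

fn build_map() -> HashMap<&'static str, u32, &'static ToyParams> {
    let mut map = HashMap::with_hasher(&PARAMS);
    map.insert("one", 1);
    map.insert("two", 2);
    map
}

fn main() {
    let map = build_map();
    println!("one -> {:?}", map.get("one"));
}
```

Because the builder is only a reference, every hasher the map creates borrows the same parameter set, which is why the parameters must live at least as long as the collection.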

Implementations

impl Params

pub fn new() -> Self

Returns a new pseudo-unique Params value.

pub fn derive(bits: u64, key: &[u8]) -> Self

Returns a fresh set of Params derived deterministically from bits and the first 32 bytes in key.

The UMASH function defined by the resulting Params will remain the same for all versions of UMASH and umash-rs.

Example from the repository, examples/umash.rs:
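use std::hash::Hasher as _; // brings `finish` into scope for umash::Hasher

fn main() {
    // Hash the first command-line argument, or a default string.
    let args: Vec<String> = std::env::args().collect();
    let input = args
        .get(1)
        .cloned()
        .unwrap_or_else(|| "default input".to_string());
    let seed = 42u64;
    // Derive repeatable parameters from a fixed key.
    let my_params = umash::Params::derive(0, "hello example.c".as_bytes());
    // The fingerprint holds a pair of 64-bit hash values.
    let fprint = my_params
        .fingerprinter(seed)
        .write(input.as_bytes())
        .digest();

    println!("Input: {}", input);
    println!("Fingerprint: {:x}, {:x}", fprint.hash[0], fprint.hash[1]);
    // Primary hash for the same seed.
    println!(
        "Hash 0: {:x}",
        my_params.hasher(seed).write(input.as_bytes()).digest()
    );
    // Secondary hash for the same seed.
    println!(
        "Hash 1: {:x}",
        my_params
            .secondary_hasher(seed)
            .write(input.as_bytes())
            .digest()
    );

    // A Hasher converted from &Params uses seed = 0.
    let mut h: umash::Hasher = (&my_params).into();
    h.write(input.as_bytes());
    println!("Hash: {:x}", h.finish());
}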

pub fn hasher(&self, seed: u64) -> Hasher<'_>

Returns a Hasher for the primary UMASH function.

The seed tweaks the hash value without any proven impact on collision rates for different seed values.

pub fn secondary_hasher(&self, seed: u64) -> Hasher<'_>

Returns a Hasher for the secondary UMASH function.

The seed tweaks the hash value without any proven impact on collision rates for different seed values.

pub fn component_hasher(&self, seed: u64, which: UmashComponent) -> Hasher<'_>

Returns a Hasher for the desired UMASH component (primary or secondary hash function).

The seed tweaks the hash value without any proven impact on collision rates for different seed values.

pub fn fingerprinter(&self, seed: u64) -> Fingerprinter<'_>

Returns a Fingerprinter for the UMASH function.

The seed tweaks the hash value without any proven impact on collision rates for different seed values.

pub fn hash(&self, object: impl Hash) -> u64

Computes the UmashComponent::Hash value defined by this set of UMASH params for object and seed = 0.

UMASH’s collision probability bounds only hold if different objects feed different byte streams to the hasher. The standard Rust std::hash::Hash::hash implementations (and automatically generated ones) satisfy this requirement for values of the same type.
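The byte-stream requirement can be checked without UMASH. In the sketch below, `ByteRecorder`, `hash_stream`, and `naive_stream` are hypothetical helpers: the first two capture what the standard `Hash` implementations feed to a hasher, while `naive_stream` simply concatenates string bytes and so loses the boundary between fields.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical helper: a `std::hash::Hasher` that only records the bytes it is fed.
#[derive(Default)]
struct ByteRecorder(Vec<u8>);

impl Hasher for ByteRecorder {
    fn write(&mut self, bytes: &[u8]) {
        self.0.extend_from_slice(bytes);
    }

    // The recorder never produces a real hash value.
    fn finish(&self) -> u64 {
        0
    }
}

/// Byte stream produced by the standard `Hash` implementation of `value`.
fn hash_stream<T: Hash>(value: &T) -> Vec<u8> {
    let mut recorder = ByteRecorder::default();
    value.hash(&mut recorder);
    recorder.0
}

/// Naive encoding: concatenates the parts with no separator or length.
fn naive_stream(parts: &[&str]) -> Vec<u8> {
    parts.iter().flat_map(|s| s.bytes()).collect()
}

fn main() {
    let a = ("ab", "c");
    let b = ("a", "bc");

    // Different values of the same type feed different byte streams.
    println!("Hash streams differ: {}", hash_stream(&a) != hash_stream(&b));

    // Naive concatenation feeds the same bytes for both, so they must collide.
    println!(
        "naive streams equal: {}",
        naive_stream(&["ab", "c"]) == naive_stream(&["a", "bc"])
    );
}
```

A hand-written `Hash` implementation that behaves like `naive_stream` would make distinct objects collide regardless of the hash function's quality.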

pub fn secondary(&self, object: impl Hash) -> u64

Computes the UmashComponent::Secondary hash value defined by this set of UMASH params for object and seed = 0.

UMASH’s collision probability bounds only hold if different objects feed different byte streams to the hasher. The standard Rust std::hash::Hash::hash implementations (and automatically generated ones) satisfy this requirement for values of the same type.

pub fn fingerprint(&self, object: impl Hash) -> Fingerprint

Computes the fingerprint value defined by this set of UMASH params for object and seed = 0.

UMASH’s collision probability bounds only hold if different objects feed different byte streams to the fingerprinter. The standard Rust std::hash::Hash::hash implementations (and automatically generated ones) satisfy this requirement for values of the same type.

Trait Implementations

impl<'params> BuildHasher for &'params Params

A reference to a Params struct may be passed to hashed collections. The collection will use hashers derived from that static set of parameters (with seed = 0).

Unfortunately, due to lifetime complications, it’s not clear how to make each hashed collection generate a new Default Params: we want a guarantee that hashers will never outlive their parent builder.

type Hasher = Hasher<'params>

Type of the hasher that will be created.

fn build_hasher(&self) -> Hasher<'params>

Creates a new hasher.

fn hash_one<T>(&self, x: T) -> u64
where T: Hash, Self: Sized, Self::Hasher: Hasher,

Calculates the hash of a single value. Available since Rust 1.71.0.

impl Clone for Params

fn clone(&self) -> Params

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Default for Params

The default constructor for Params returns a fresh unique set of parameters.

fn default() -> Self

Returns the “default value” for a type.

impl<'params> From<&'params Params> for Fingerprinter<'params>

Converts a &Params to Fingerprinter by constructing a fresh Fingerprinter for these Params and seed = 0.

fn from(params: &'params Params) -> Fingerprinter<'params>

Converts to this type from the input type.

impl<'params> From<&'params Params> for Hasher<'params>

Converts a &Params to Hasher by constructing a fresh Hasher for these Params and seed = 0.

fn from(params: &'params Params) -> Hasher<'params>

Converts to this type from the input type.

Auto Trait Implementations

impl Freeze for Params

impl RefUnwindSafe for Params

impl Send for Params

impl Sync for Params

impl Unpin for Params

impl UnwindSafe for Params

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut u8)

This is a nightly-only experimental API (clone_to_uninit).
Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.