Struct probminhash::probminhasher::ProbMinHash3

pub struct ProbMinHash3<D, H: Hasher + Default> where
    D: Copy + Eq + Hash + Debug
{ /* fields omitted */ }

Implementation of the algorithm ProbMinHash3a as described by Ertl.
It needs less memory than ProbMinHash3 but can be a little slower.
ProbMinHash3 needs at least 2 hash values to run.

The algorithm requires random generators to be initialized by the hashed objects. It must therefore be possible to map D (at least partially) injectively to a u64 for random generator initialization, hence the requirement D: Hash.
If all data are referred to by an unsigned integer and the weights are given in a tuple, for example when data comes in a Vec<(D,f64)>, then D can be replaced by the rank in the Vec. In that case no hash is needed and you can use NoHasher.
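
The rank-based approach described above can be sketched as follows. This is a minimal illustration using only the standard library. The names `IdentityHasher`, `ranked_weights` and `hash_rank` are hypothetical and introduced here for illustration; they are not the crate's NoHasher, whose implementation is not shown on this page.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical pass-through hasher: an integer id is used as its own hash value.
#[derive(Default)]
pub struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // Fallback for arbitrary byte input: fold the bytes into the state.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = (self.0 << 8) | u64::from(b);
        }
    }

    // Integer ids are passed through unchanged.
    fn write_u64(&mut self, i: u64) {
        self.0 = i;
    }

    fn write_usize(&mut self, i: usize) {
        self.0 = i as u64;
    }
}

/// Replaces each object by its rank in the vector, keeping its weight.
pub fn ranked_weights<D>(data: &[(D, f64)]) -> Vec<(usize, f64)> {
    data.iter()
        .enumerate()
        .map(|(rank, (_, weight))| (rank, *weight))
        .collect()
}

/// Hashes a rank with the pass-through hasher.
pub fn hash_rank(rank: usize) -> u64 {
    let mut hasher = IdentityHasher::default();
    rank.hash(&mut hasher);
    hasher.finish()
}
```

Because the rank is already a distinct integer per object, passing it through unchanged keeps the mapping to u64 injective, which is the property the paragraph above asks for.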

Implementations

impl<D, H> ProbMinHash3<D, H> where
    D: Copy + Eq + Debug + Hash,
    H: Hasher + Default

pub fn new(nbhash: usize, initobj: D) -> Self

Allocates a new ProbMinHash3 structure with nbhash hash functions and an initial object initobj used to fill the signature.
nbhash must be greater than or equal to 2. The precision of the final estimate depends on the number of hash functions.
The initial object can be any object, typically 0 for numerical objects.

pub fn hash_item(&mut self, id: D, weight: f64)

Incrementally adds an item to the hash signature. It can be used in streaming.
It is the building block of the computation, but this method does not check that each id is added only once to the hash computation.
It is the user's responsibility to enforce that. See function hash_weigthed_idxmap.

pub fn get_signature(&self) -> &Vec<D>

Returns the final signature.
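
Two signatures built with the same nbhash are typically compared slot by slot. The following is a minimal sketch, assuming the similarity estimate is the fraction of slots that hold the same object. The helper name `matching_fraction` is hypothetical and is not part of the API listed on this page.

```rust
/// Fraction of slots on which two signatures agree.
/// Returns None when the lengths differ or the signatures are empty.
pub fn matching_fraction<D: Eq>(sig_a: &[D], sig_b: &[D]) -> Option<f64> {
    if sig_a.len() != sig_b.len() || sig_a.is_empty() {
        return None;
    }
    // Count the positions where both signatures store the same object.
    let matches = sig_a.iter().zip(sig_b).filter(|(a, b)| a == b).count();
    Some(matches as f64 / sig_a.len() as f64)
}
```

With more slots the fraction can take finer values, which is consistent with the precision depending on the number of hash functions.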

pub fn hash_wset<T>(&mut self, data: &mut T) where
    T: WeightedSet<Object = D> + Iterator<Item = D>, 

Hashes data given as an iterable WeightedSet.

pub fn hash_weigthed_idxmap<Hidx>(&mut self, data: &mut IndexMap<D, f64, Hidx>) where
    Hidx: BuildHasher

Computes the set signature when the set is given as an IndexMap whose values are the weights.
This ensures that each object is assigned a weight only once, so that we really have a set of objects with associated weights.
The raw method hash_item can be used instead, with the constraint that each object is sent only ONCE to the hash method.
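
When the input may contain repeated ids and hash_item is used directly, the ids can be deduplicated first. Below is a minimal sketch using only the standard library. The function name `first_occurrences` is hypothetical, and keeping the first weight seen for an id is an assumed policy; summing or overwriting weights would be equally valid choices depending on the data.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Keeps the first (id, weight) pair seen for each id, preserving input order.
pub fn first_occurrences<D: Copy + Eq + Hash>(items: &[(D, f64)]) -> Vec<(D, f64)> {
    let mut seen = HashSet::new();
    items
        .iter()
        // HashSet::insert returns false when the id was already present.
        .filter(|(id, _)| seen.insert(*id))
        .copied()
        .collect()
}
```

Each pair in the returned vector could then be passed to hash_item exactly once.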

Auto Trait Implementations

impl<D, H> RefUnwindSafe for ProbMinHash3<D, H> where
    D: RefUnwindSafe,
    H: RefUnwindSafe

impl<D, H> Send for ProbMinHash3<D, H> where
    D: Send,
    H: Send

impl<D, H> Sync for ProbMinHash3<D, H> where
    D: Sync,
    H: Sync

impl<D, H> Unpin for ProbMinHash3<D, H> where
    D: Unpin,
    H: Unpin

impl<D, H> UnwindSafe for ProbMinHash3<D, H> where
    D: UnwindSafe,
    H: UnwindSafe

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>, 

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<V, T> VZip<V> for T where
    V: MultiLane<T>,