Struct probminhash::probminhasher::ProbMinHash3
Implementation of the algorithm ProbMinHash3a as described in Ertl.
It needs less memory than ProbMinHash3 but can be a little slower.
ProbMinHash3 needs at least 2 hash values to run.
The algorithm requires random generators to be initialized from the hashed objects.
So it must be possible to map D (at least partially) injectively to a u64 for random generator initialization, hence the requirement D: Hash.
If all data are referred to by an unsigned integer and the weights are given in tuples, for example when data comes in a Vec<(D, f64)>,
then D can in fact be replaced by the rank in the vector; no hash is then needed and you can use NoHasher.
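To make the rank idea concrete, here is a minimal sketch using only the standard library. `IdentityHasher`, `rank_to_seed` and `ranked_weights` are hypothetical names introduced for illustration; they are not the crate's NoHasher, whose implementation is not shown on this page. The sketch assumes that a pass-through hasher simply returns the integer it is given.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical pass-through hasher (an assumption, not the crate's NoHasher):
/// the "hash" of an unsigned integer is the integer itself.
#[derive(Default)]
struct IdentityHasher {
    state: u64,
}

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.state
    }

    // Fallback for other types: keep at most the first 8 bytes,
    // read in little-endian order.
    fn write(&mut self, bytes: &[u8]) {
        self.state = 0;
        for (i, b) in bytes.iter().take(8).enumerate() {
            self.state |= (*b as u64) << (8 * i);
        }
    }

    fn write_u64(&mut self, n: u64) {
        self.state = n;
    }

    fn write_usize(&mut self, n: usize) {
        self.state = n as u64;
    }
}

/// Maps a rank to the u64 that would seed a random generator.
fn rank_to_seed(rank: usize) -> u64 {
    let mut hasher = IdentityHasher::default();
    rank.hash(&mut hasher);
    hasher.finish()
}

/// Replaces each object of a weighted list by its rank in the list.
fn ranked_weights<D>(data: &[(D, f64)]) -> Vec<(usize, f64)> {
    data.iter()
        .enumerate()
        .map(|(rank, (_, weight))| (rank, *weight))
        .collect()
}
```

Because ranks are distinct by construction, the mapping from rank to u64 is injective, which is exactly the property required above.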
Implementations
impl<D, H> ProbMinHash3<D, H> where
D: Copy + Eq + Debug + Hash,
H: Hasher + Default,
pub fn new(nbhash: usize, initobj: D) -> Self
Allocates a new ProbMinHash3 structure with nbhash hash functions and an initial object initobj used to fill the signature.
nbhash must be greater than or equal to 2.
The precision of the final estimate depends on the number of hash functions.
The initial object can be any object, typically 0 for numerical objects.
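As a rough guide to how precision scales with nbhash, the sketch below assumes that the similarity is estimated as the fraction of matching signature slots and that slots match independently, which gives a binomial standard error. This is a simplifying assumption, not a bound stated by the crate, and `approx_std_error` is a hypothetical helper.

```rust
/// Approximate standard error of a similarity estimate `j` obtained from
/// `nbhash` signature slots, assuming slots match independently.
fn approx_std_error(j: f64, nbhash: usize) -> f64 {
    assert!(nbhash >= 2, "nbhash must be greater than or equal to 2");
    assert!((0.0..=1.0).contains(&j), "a similarity lies in [0, 1]");
    (j * (1.0 - j) / nbhash as f64).sqrt()
}
```

Under this model, quadrupling nbhash halves the expected error.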
pub fn hash_item(&mut self, id: D, weight: f64)
Incrementally adds an item to the hash signature. It can be used in streaming.
It is the building block of the computation, but this method
does not check the uniqueness of the ids added to the hash computation.
It is the user's responsibility to enforce that. See function hash_weigthed_idxmap.
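One way to enforce uniqueness in a stream is to remember the ids already seen and forward only first occurrences. The sketch below is standard-library only: `feed_unique` is a hypothetical helper, and the `sink` closure stands in for a call such as hash_item(id, weight).

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Forwards each (id, weight) pair to `sink` the first time `id` is seen.
/// Returns the number of repeated ids that were skipped.
fn feed_unique<D, I, F>(items: I, mut sink: F) -> usize
where
    D: Copy + Eq + Hash,
    I: IntoIterator<Item = (D, f64)>,
    F: FnMut(D, f64),
{
    let mut seen = HashSet::new();
    let mut skipped = 0;
    for (id, weight) in items {
        // `insert` returns false when the id was already present.
        if seen.insert(id) {
            sink(id, weight);
        } else {
            skipped += 1;
        }
    }
    skipped
}
```

The set of seen ids grows with the number of distinct objects, so this guard trades memory for safety; it may not suit very long streams.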
pub fn get_signature(&self) -> &Vec<D>
Returns the final signature.
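Two signatures built with the same nbhash can be compared slot by slot, the fraction of equal slots serving as the similarity estimate. The function below is a minimal sketch of that comparison; `signature_similarity` is a hypothetical name, and the crate may provide its own comparison function.

```rust
/// Fraction of slots on which two signatures of equal length agree.
fn signature_similarity<D: Eq>(siga: &[D], sigb: &[D]) -> f64 {
    assert_eq!(siga.len(), sigb.len(), "signatures must have the same length");
    assert!(!siga.is_empty(), "signatures must not be empty");
    let matches = siga
        .iter()
        .zip(sigb.iter())
        .filter(|(a, b)| a == b)
        .count();
    matches as f64 / siga.len() as f64
}
```

Since get_signature returns a &Vec<D>, its result can be passed directly where a &[D] is expected.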
pub fn hash_wset<T>(&mut self, data: &mut T) where
T: WeightedSet<Object = D> + Iterator<Item = D>,
Hashes data given as an iterable WeightedSet.
pub fn hash_weigthed_idxmap<Hidx>(&mut self, data: &mut IndexMap<D, f64, Hidx>) where
Hidx: BuildHasher,
Computes the set signature when the set is given as an IndexMap whose values are the weights.
This ensures that each object is assigned a weight only once, so that we really have a set of objects with associated weights.
The raw method hash_item can be used instead, with the constraint that each object is sent ONCE to the hash method.
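When the raw data contains repeated observations of the same object, the weights have to be merged before a map of this kind is built. The sketch below sums them, which is only one possible policy and an assumption here. It uses std's HashMap as a stand-in for IndexMap to stay dependency-free, and `accumulate_weights` is a hypothetical helper.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Merges repeated observations so that each object ends up with a single
/// weight, here the sum of the weights observed for it.
fn accumulate_weights<D, I>(observations: I) -> HashMap<D, f64>
where
    D: Eq + Hash,
    I: IntoIterator<Item = (D, f64)>,
{
    let mut weights = HashMap::new();
    for (id, weight) in observations {
        *weights.entry(id).or_insert(0.0) += weight;
    }
    weights
}
```

Unlike IndexMap, HashMap does not preserve insertion order, so iteration order over the result is unspecified.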
Auto Trait Implementations
impl<D, H> RefUnwindSafe for ProbMinHash3<D, H> where
D: RefUnwindSafe,
H: RefUnwindSafe,
impl<D, H> Send for ProbMinHash3<D, H> where
D: Send,
H: Send,
impl<D, H> Sync for ProbMinHash3<D, H> where
D: Sync,
H: Sync,
impl<D, H> Unpin for ProbMinHash3<D, H> where
D: Unpin,
H: Unpin,
impl<D, H> UnwindSafe for ProbMinHash3<D, H> where
D: UnwindSafe,
H: UnwindSafe,
Blanket Implementations
impl<T> Any for T where
T: 'static + ?Sized,
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> BorrowMut<T> for T where
T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
impl<T> From<T> for T
impl<T, U> Into<U> for T where
U: From<T>,
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
impl<V, T> VZip<V> for T where
V: MultiLane<T>,