Struct probminhash::superminhasher::SuperMinHash

pub struct SuperMinHash<'a, T: Hash, H: 'a + Hasher + Default> { /* fields omitted */ }

An implementation of SuperMinHash, as described in "SuperMinHash - A New Minwise Hashing Algorithm for Jaccard Similarity Estimation", Otmar Ertl, 2017-2018, arXiv: https://arxiv.org/abs/1706.05698

The hash strategy is chosen by specializing the H type parameter: Fnv (which, according to its documentation, is fast when hashing small values such as integers), fxhash, or any other hasher chosen by the user.
The hash values can also be computed before calling SuperMinHash methods, so that the structure performs only the minhash-specific part of the algorithm. In this second case, build_hasher should be parametrized by NoHashHasher (as in the finch module).
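The two hashing strategies described above can be illustrated with the standard library alone. The following is a minimal sketch, not code from this crate: IdentityHasher is a hypothetical stand-in for a pass-through hasher in the spirit of NoHashHasher, and hash_with is a helper invented for the illustration. It shows how a BuildHasherDefault<H> yields a fresh hasher per item, and how a pass-through H leaves precomputed u64 hash values unchanged.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hypothetical pass-through hasher (an assumption for illustration,
/// not the NoHashHasher type itself): a u64 input is returned unchanged.
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // Fallback for inputs that are not fed through write_u64:
    // fold the bytes into the state.
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 = (self.0 << 8) | u64::from(byte);
        }
    }

    // u64 values call this method, so the "hash" is the value itself.
    fn write_u64(&mut self, value: u64) {
        self.0 = value;
    }
}

/// Illustrative helper: hash one value with a fresh hasher built
/// from the given BuildHasherDefault.
fn hash_with<T: Hash, H: Hasher + Default>(
    build_hasher: &BuildHasherDefault<H>,
    value: &T,
) -> u64 {
    let mut hasher = build_hasher.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Strategy 1: let a general-purpose hasher do the work.
fn hash_with_std_hasher<T: Hash>(value: &T) -> u64 {
    hash_with(&BuildHasherDefault::<DefaultHasher>::default(), value)
}

/// Strategy 2: the caller already holds a hash value; pass it through.
fn pass_through(precomputed: u64) -> u64 {
    hash_with(&BuildHasherDefault::<IdentityHasher>::default(), &precomputed)
}
```

With this setup, pass_through(x) returns x for any u64, while hash_with_std_hasher returns the same value every time it is given the same input. Which hasher is the better choice depends on the item type and on whether hashing is already done upstream.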

It runs in a single pass over the data, so it can be used in streaming.

Implementations

impl<'a, T: Hash, H: 'a + Hasher + Default> SuperMinHash<'a, T, H>

pub fn new(
    size: usize,
    build_hasher: &'a BuildHasherDefault<H>
) -> SuperMinHash<'a, T, H>

Allocates a struct to compute a SuperMinHash sketch. size is the size of the sketch. build_hasher is the build hasher for the type of Hasher we want.

pub fn reinit(&mut self)

Reinitializes the minhasher, keeping the sketch size.
SuperMinHash can then be used again with sketch_slice. This method ends the streaming sketching of data and resets all counters.

pub fn get_hsketch(&self) -> &Vec<f64>

Returns a reference to the computed sketch.

pub fn get_jaccard_index_estimate(
    &self,
    other_sketch: &Vec<f64>
) -> Result<f64, ()>

Returns an estimate of the Jaccard index between the sketch in this structure and the sketch passed as argument.
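For intuition, the usual minwise estimator (the one given in the cited paper) is the fraction of sketch components on which the two sketches agree. The function below is a minimal sketch of that estimator under stated assumptions: the name jaccard_estimate is hypothetical, and treating empty or unequal-length sketches as the error case is a guess, since this page does not say when the method returns Err.

```rust
/// Estimate the Jaccard index as the fraction of sketch slots that agree.
/// Assumption: empty sketches or sketches of different sizes are an error.
fn jaccard_estimate(sketch_a: &[f64], sketch_b: &[f64]) -> Result<f64, ()> {
    if sketch_a.is_empty() || sketch_a.len() != sketch_b.len() {
        return Err(());
    }
    // Count the positions where both sketches hold exactly the same value.
    let matches = sketch_a
        .iter()
        .zip(sketch_b)
        .filter(|(a, b)| a == b)
        .count();
    Ok(matches as f64 / sketch_a.len() as f64)
}
```

Two sketches can only be compared meaningfully if they were built with the same sketch size and the same hash strategy.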

pub fn sketch(&mut self, to_sketch: &T) -> Result<(), ()>

Inserts an item into the set to sketch.
It can be used in streaming to update the current sketch.
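To show what a single streaming update involves, here is a self-contained sketch that follows the pseudocode of the cited paper. It is not this crate's source code. The names ToySuperMinHash and SplitMix64 are hypothetical, and the choices of the standard DefaultHasher for seeding and of SplitMix64 as pseudo-random generator are assumptions made for the illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Small deterministic PRNG (SplitMix64); an illustrative choice only.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in [0, 1) from the top 32 bits, so that r + j stays
    /// exactly representable for sketch sizes below 2^21.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 32) as f64 / 4_294_967_296.0
    }
}

/// Hypothetical one-pass sketcher following the paper's pseudocode.
struct ToySuperMinHash {
    h: Vec<f64>,   // current sketch values
    p: Vec<usize>, // permutation, reset lazily for each item
    q: Vec<u64>,   // item counter at which p[j] was last reset
    b: Vec<usize>, // b[j] counts sketch values in [j, j + 1)
    a: usize,      // largest j with b[j] > 0
    i: u64,        // number of items seen so far
}

impl ToySuperMinHash {
    fn new(m: usize) -> Self {
        assert!(m > 0, "sketch size must be positive");
        let mut b = vec![0; m];
        b[m - 1] = m; // all slots start at infinity, counted in the last bucket
        ToySuperMinHash {
            h: vec![f64::INFINITY; m],
            p: vec![0; m],
            q: vec![0; m],
            b,
            a: m - 1,
            i: 0,
        }
    }

    /// Update the sketch with one item; items can arrive one at a time.
    fn sketch<T: Hash>(&mut self, item: &T) {
        let m = self.h.len();
        // Seed the PRNG from the item's hash, so equal items draw equal values.
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let mut rng = SplitMix64(hasher.finish());
        self.i += 1;
        let mut j = 0;
        // Steps with j > a cannot lower any slot, so they are skipped.
        while j <= self.a {
            let r = rng.next_f64();
            let k = j + (rng.next_u64() % (m - j) as u64) as usize;
            // One Fisher-Yates shuffle step on the lazily reset permutation.
            if self.q[j] != self.i {
                self.q[j] = self.i;
                self.p[j] = j;
            }
            if self.q[k] != self.i {
                self.q[k] = self.i;
                self.p[k] = k;
            }
            self.p.swap(j, k);
            let slot = self.p[j];
            let candidate = r + j as f64;
            if candidate < self.h[slot] {
                // The float-to-integer cast saturates, so infinity maps to m - 1.
                let old_bucket = (self.h[slot].floor() as usize).min(m - 1);
                self.h[slot] = candidate;
                if j < old_bucket {
                    self.b[old_bucket] -= 1;
                    self.b[j] += 1;
                    while self.b[self.a] == 0 {
                        self.a -= 1;
                    }
                }
            }
            j += 1;
        }
    }
}
```

Because each slot keeps the minimum of values that depend only on the item, the resulting sketch does not depend on the order in which items arrive, which is what makes one-pass streaming use possible.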

pub fn sketch_slice(&mut self, to_sketch: &[T]) -> Result<(), ()>

The argument to_sketch is a slice of values to hash. It can be used in streaming to update the current sketch.

Auto Trait Implementations

impl<'a, T, H> RefUnwindSafe for SuperMinHash<'a, T, H> where
    H: RefUnwindSafe,
    T: RefUnwindSafe

impl<'a, T, H> Send for SuperMinHash<'a, T, H> where
    H: Sync,
    T: Send

impl<'a, T, H> Sync for SuperMinHash<'a, T, H> where
    H: Sync,
    T: Sync

impl<'a, T, H> Unpin for SuperMinHash<'a, T, H> where
    T: Unpin

impl<'a, T, H> UnwindSafe for SuperMinHash<'a, T, H> where
    H: RefUnwindSafe,
    T: UnwindSafe

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>, 

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<V, T> VZip<V> for T where
    V: MultiLane<T>,