pub struct Mphf<T> { /* private fields */ }
A minimal perfect hash function over a set of objects of type T.
Implementations
impl<'a, T: 'a + Hash + Debug> Mphf<T>
pub fn from_chunked_iterator<I, N>(
    gamma: f64,
    objects: &'a I,
    n: u64
) -> Mphf<T>
where
    &'a I: IntoIterator<Item = N>,
    N: IntoIterator<Item = T> + Send,
    <N as IntoIterator>::IntoIter: ExactSizeIterator,
    <&'a I as IntoIterator>::IntoIter: Send,
    I: Sync,
Constructs an MPHF from a (possibly lazy) iterator over iterators.
This allows construction of very large MPHFs without holding all the keys
in memory simultaneously.
objects is an IntoIterator yielding a stream of IntoIterators that must not contain any duplicate items.
objects must be able to be iterated over multiple times and must yield the same stream of items each time.
gamma controls the tradeoff between construction-time and run-time speed on one side and the size of the data structure representing the hash function on the other. See the paper for details.
n is the total number of items that will be produced by iterating over all the input iterators.
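The requirements on objects and n can be illustrated without the crate itself. The following is a minimal sketch using only the standard library. ChunkedRange, Chunks and check_source are hypothetical names introduced here; check_source merely mirrors the where clause of from_chunked_iterator to show that such a source satisfies the bounds and can be walked more than once. It does not build an MPHF.

```rust
use std::fmt::Debug;
use std::hash::Hash;
use std::ops::Range;

/// Hypothetical lazy key source: the keys 0..total, produced in chunks
/// of at most `chunk` keys. No key is stored in memory.
struct ChunkedRange {
    total: usize,
    chunk: usize,
}

/// Iterator over the chunks of a `ChunkedRange`.
struct Chunks {
    start: usize,
    total: usize,
    chunk: usize,
}

impl Iterator for Chunks {
    type Item = Range<usize>;

    fn next(&mut self) -> Option<Range<usize>> {
        if self.start >= self.total {
            return None;
        }
        // max(1) guards against a zero chunk size looping forever.
        let end = (self.start + self.chunk.max(1)).min(self.total);
        let range = self.start..end;
        self.start = end;
        Some(range)
    }
}

// Implementing IntoIterator for the *reference* lets the source be
// iterated many times, yielding the same stream each time.
impl<'a> IntoIterator for &'a ChunkedRange {
    type Item = Range<usize>;
    type IntoIter = Chunks;

    fn into_iter(self) -> Chunks {
        Chunks { start: 0, total: self.total, chunk: self.chunk }
    }
}

/// Stand-in with the same bounds as the documented signature. It only
/// walks the source twice and counts keys; it does not build an MPHF.
fn check_source<'a, T, I, N>(objects: &'a I, n: u64) -> bool
where
    T: 'a + Hash + Debug,
    &'a I: IntoIterator<Item = N>,
    N: IntoIterator<Item = T> + Send,
    <N as IntoIterator>::IntoIter: ExactSizeIterator,
    <&'a I as IntoIterator>::IntoIter: Send,
    I: Sync,
{
    let first: u64 = objects.into_iter().map(|c| c.into_iter().len() as u64).sum();
    let second: u64 = objects.into_iter().map(|c| c.into_iter().len() as u64).sum();
    first == n && second == n
}

fn main() {
    let source = ChunkedRange { total: 10, chunk: 4 };
    // Chunks are 0..4, 4..8 and 8..10: ten keys in total.
    assert!(check_source(&source, 10));
    assert!(!check_source(&source, 11));
}
```

The chunk item type here is Range<usize> rather than Range<u64> because the standard library implements ExactSizeIterator for the former but not the latter.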
NOTE: the inner iterator N::IntoIter should override nth if there is an efficient way to skip over items when iterating. This is important because later iterations of the MPHF construction algorithm skip most of the items.
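As an illustration of the note above, here is a minimal sketch of an inner iterator that overrides nth with a constant-time skip and also provides the exact size_hint that ExactSizeIterator relies on. StridedKeys is a hypothetical type introduced for this example and is not part of the crate.

```rust
/// Hypothetical chunk iterator: yields `i * stride` for each i in start..end.
#[derive(Clone, Debug)]
struct StridedKeys {
    cursor: u64,
    end: u64,
    stride: u64,
}

impl StridedKeys {
    fn new(start: u64, end: u64, stride: u64) -> Self {
        // Clamp so that cursor <= end always holds.
        StridedKeys { cursor: start.min(end), end, stride }
    }
}

impl Iterator for StridedKeys {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.cursor >= self.end {
            return None;
        }
        let key = self.cursor * self.stride;
        self.cursor += 1;
        Some(key)
    }

    // Exact bounds are what make the ExactSizeIterator impl below valid.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = (self.end - self.cursor) as usize;
        (remaining, Some(remaining))
    }

    // O(1) skip: move the cursor forward instead of calling next() n times.
    fn nth(&mut self, n: usize) -> Option<u64> {
        self.cursor = self.cursor.saturating_add(n as u64).min(self.end);
        self.next()
    }
}

impl ExactSizeIterator for StridedKeys {}

fn main() {
    let mut keys = StridedKeys::new(0, 10, 3);
    assert_eq!(keys.len(), 10);
    // Skips 0, 3, 6 and 9, then returns the fifth key.
    assert_eq!(keys.nth(4), Some(12));
    assert_eq!(keys.len(), 5);
    assert_eq!(keys.next(), Some(15));
}
```

Without the override, the default nth calls next repeatedly, so skipping k items costs k steps.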
impl<T: Hash + Debug> Mphf<T>
pub fn new(gamma: f64, objects: &[T]) -> Mphf<T>
Generate a minimal perfect hash function for the set of objects.
objects must not contain any duplicate items.
gamma controls the tradeoff between construction-time and run-time speed on one side and the size of the data structure representing the hash function on the other. See the paper for details.
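max_iters: None to never stop trying to find a perfect hash (safe if there are no duplicates). This parameter does not appear in the signature of new shown above; it is accepted by from_chunked_iterator_parallel.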
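To make the contract concrete: a minimal perfect hash function maps n distinct keys onto the integers 0..n with no collisions. The sketch below uses only the standard library and does not call the crate. all_distinct and is_minimal_perfect are hypothetical helpers, and the hash used in main is a toy that works for this particular key set only.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// True if `keys` contains no repeated item (the precondition on `objects`).
fn all_distinct<T: Hash + Eq>(keys: &[T]) -> bool {
    let mut seen = HashSet::with_capacity(keys.len());
    // insert returns false when the item was already present.
    keys.iter().all(|k| seen.insert(k))
}

/// True if `f` sends the n keys onto 0..n with no collisions,
/// i.e. `f` behaves as a minimal perfect hash on this key set.
fn is_minimal_perfect<T>(keys: &[T], f: impl Fn(&T) -> u64) -> bool {
    let mut used = vec![false; keys.len()];
    for key in keys {
        let h = f(key);
        if h >= keys.len() as u64 || used[h as usize] {
            return false; // out of range, or slot already taken
        }
        used[h as usize] = true;
    }
    true // n keys, n slots, no collision: every slot is used exactly once
}

fn main() {
    let keys = ["apple", "kiwi", "banana"];
    assert!(all_distinct(&keys));
    // Toy hash: the lengths 5, 4 and 6 map to slots 1, 0 and 2.
    assert!(is_minimal_perfect(&keys, |k| k.len() as u64 - 4));
    // A constant hash collides, so it is not perfect.
    assert!(!is_minimal_perfect(&keys, |_| 0));
    // Duplicates violate the precondition.
    assert!(!all_distinct(&["kiwi", "kiwi"]));
}
```

A check like all_distinct can be run on small inputs before construction, since the documentation requires that objects contain no duplicates.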
impl<'a, T: 'a + Hash + Debug + Send + Sync> Mphf<T>
pub fn from_chunked_iterator_parallel<I, N>(
    gamma: f64,
    objects: &'a I,
    max_iters: Option<u64>,
    n: u64,
    num_threads: usize
) -> Mphf<T>
where
    &'a I: IntoIterator<Item = N>,
    N: IntoIterator<Item = T> + Send + Clone,
    <N as IntoIterator>::IntoIter: ExactSizeIterator,
    <&'a I as IntoIterator>::IntoIter: Send,
    I: Sync,
Same as from_chunked_iterator, but parallelizes work over num_threads threads.
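As a rough illustration of dividing chunked input over a fixed number of threads, the sketch below counts keys with std::thread::scope. It is not the crate's implementation, and count_keys_parallel is a hypothetical function; it only shows one way that chunks of a shared, borrowed source can be handed to num_threads workers, which is the kind of sharing the Send and Sync bounds in the signature permit.

```rust
use std::thread;

/// Sum the lengths of all chunks using `num_threads` scoped threads.
fn count_keys_parallel(chunks: &[Vec<u64>], num_threads: usize) -> u64 {
    // step_by panics on 0, so use at least one thread.
    let num_threads = num_threads.max(1);
    thread::scope(|s| {
        let handles: Vec<_> = (0..num_threads)
            .map(|t| {
                s.spawn(move || {
                    // Thread t takes chunks t, t + num_threads, t + 2 * num_threads, ...
                    chunks
                        .iter()
                        .skip(t)
                        .step_by(num_threads)
                        .map(|c| c.len() as u64)
                        .sum::<u64>()
                })
            })
            .collect();
        // Combine the per-thread partial counts.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}

fn main() {
    let chunks = vec![vec![1u64, 2, 3], vec![4, 5], vec![], vec![6]];
    assert_eq!(count_keys_parallel(&chunks, 3), 6);
}
```

Scoped threads may borrow chunks directly because the scope guarantees that every worker has finished before the borrow ends.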