Struct boomphf_patched::Mphf
source · pub struct Mphf<T> { /* private fields */ }
Expand description
A minimal perfect hash function over a set of objects of type T
.
Implementations§
source§impl<'a, T: 'a + Hash + Debug> Mphf<T>
impl<'a, T: 'a + Hash + Debug> Mphf<T>
sourcepub fn from_chunked_iterator<I, N>(
gamma: f64,
objects: &'a I,
n: usize
) -> Mphf<T>where
&'a I: IntoIterator<Item = N>,
N: IntoIterator<Item = T> + Send,
<N as IntoIterator>::IntoIter: ExactSizeIterator,
<&'a I as IntoIterator>::IntoIter: Send,
I: Sync,
pub fn from_chunked_iterator<I, N>(
gamma: f64,
objects: &'a I,
n: usize
) -> Mphf<T>where
&'a I: IntoIterator<Item = N>,
N: IntoIterator<Item = T> + Send,
<N as IntoIterator>::IntoIter: ExactSizeIterator,
<&'a I as IntoIterator>::IntoIter: Send,
I: Sync,
Constructs an MPHF from a (possibly lazy) iterator over iterators.
This allows construction of very large MPHFs without holding all the keys
in memory simultaneously.
objects
is an IntoInterator
yielding a stream of IntoIterator
s that must not contain any duplicate items.
objects
must be able to be iterated over multiple times and yield the same stream of items each time.
gamma
controls the tradeoff between the construction-time and run-time speed,
and the size of the datastructure representing the hash function. See the paper for details.
max_iters
- None to never stop trying to find a perfect hash (safe if no duplicates).
NOTE: the inner iterator N::IntoIter
should override nth
if there’s an efficient way to skip
over items when iterating. This is important because later iterations of the MPHF construction algorithm
skip most of the items.
source§impl<T: Hash + Debug> Mphf<T>
impl<T: Hash + Debug> Mphf<T>
sourcepub fn new(gamma: f64, objects: &[T]) -> Mphf<T>
pub fn new(gamma: f64, objects: &[T]) -> Mphf<T>
Generate a minimal perfect hash function for the set of objects
.
objects
must not contain any duplicate items.
gamma
controls the tradeoff between the construction-time and run-time speed,
and the size of the datastructure representing the hash function. See the paper for details.
max_iters
- None to never stop trying to find a perfect hash (safe if no duplicates).
sourcepub fn hash(&self, item: &T) -> u64
pub fn hash(&self, item: &T) -> u64
Compute the hash value of item
. This method should only be used
with items known to be in construction set. Use try_hash
if you cannot
guarantee that item
was in the construction set. If item
was not present
in the construction set this function may panic.
sourcepub fn try_hash<Q>(&self, item: &Q) -> Option<u64>where
T: Borrow<Q>,
Q: ?Sized + Hash,
pub fn try_hash<Q>(&self, item: &Q) -> Option<u64>where
T: Borrow<Q>,
Q: ?Sized + Hash,
Compute the hash value of item
. If item
was not present
in the set of objects used to construct the hash function, the return
value will an arbitrary value Some(x), or None.
pub fn try_hash_bench<Q>(&self, item: &Q, level: &mut u64) -> Option<u64>where
T: Borrow<Q>,
Q: ?Sized + Hash,
source§impl<'a, T: 'a + Hash + Debug + Send + Sync> Mphf<T>
impl<'a, T: 'a + Hash + Debug + Send + Sync> Mphf<T>
sourcepub fn from_chunked_iterator_parallel<I, N>(
gamma: f64,
objects: &'a I,
max_iters: Option<u64>,
n: usize,
num_threads: usize
) -> Mphf<T>where
&'a I: IntoIterator<Item = N>,
N: IntoIterator<Item = T> + Send + Clone,
<N as IntoIterator>::IntoIter: ExactSizeIterator,
<&'a I as IntoIterator>::IntoIter: Send,
I: Sync,
pub fn from_chunked_iterator_parallel<I, N>(
gamma: f64,
objects: &'a I,
max_iters: Option<u64>,
n: usize,
num_threads: usize
) -> Mphf<T>where
&'a I: IntoIterator<Item = N>,
N: IntoIterator<Item = T> + Send + Clone,
<N as IntoIterator>::IntoIter: ExactSizeIterator,
<&'a I as IntoIterator>::IntoIter: Send,
I: Sync,
Same as to from_chunked_iterator
but parallelizes work over num_threads
threads.
Trait Implementations§
source§impl<T> GetSize for Mphf<T>
impl<T> GetSize for Mphf<T>
source§fn size_bytes_dyn(&self) -> usize
fn size_bytes_dyn(&self) -> usize
self
.
Same as self.size_bytes() - std::mem::size_of_val(self)
. Read moresource§const USES_DYN_MEM: bool = true
const USES_DYN_MEM: bool = true
true
if and only if the variables of this type can use dynamic (heap) memory.source§fn size_bytes_content_dyn(&self) -> usize
fn size_bytes_content_dyn(&self) -> usize
self
content.
It usually equals to size_bytes_dyn()
.
However, sometimes it is smaller by the amount of memory reserved but not yet used
(e.g., size_bytes_content_dyn()
only takes into account the length of the vector and not its capacity). Read moresource§fn size_bytes(&self) -> usize
fn size_bytes(&self) -> usize
self
.