ph
is the Rust library (by Piotr Beling) of data structures based on perfect hashing.
The library contains an implementation of two variants of the fingerprint-based minimal perfect hash function:
without (FMPH, [fmph::Function
]) and with (FMPHGO, [fmph::GOFunction
]) group optimization.
A minimal perfect hash function (MPHF) is a bijection from a key set K to the set {0, 1, ..., |K|−1}.
FMPH and FMPHGO can be constructed for any set K (given in advance) of hashable items and represented using about 2.8 and 2.1 bits per key (regardless of key types), respectively. FMPH and FMPHGO are fast (O(1) in expectation) to evaluate. Their construction requires very little auxiliary space, takes a short (O(|K|) in expectation) time (which is especially true for FMPH) and, in addition, can be parallelized or carried out without holding keys in memory.
Bibliography
When using ph
for research purposes, please cite the following paper which provides details on FMPH and FMPHGO:
- Piotr Beling, Fingerprinting-based minimal perfect hashing revisited, ACM Journal of Experimental Algorithmics, 2023, https://doi.org/10.1145/3596453
Examples
The following examples illustrate the use of [fmph::Function
], which, however, can be replaced with [fmph::GOFunction
] without any other changes.
A basic example:
use fmph;
let keys = ;
let f = from;
// f assigns each key a unique number from the set {0, 1, 2}
for k in keys
let mut values = ;
values.sort;
assert_eq!;
An example of using [fmph::Function
] and bitmap to represent subsets of a given set of hashable elements:
use fmph;
use ; // bitm is used to manipulate bitmaps
use Hash;
let mut subset = of;
assert_eq!;
assert!;
assert!;
subset.insert;
subset.insert;
assert_eq!;
assert!;
subset.remove;
assert_eq!;
assert!;
// subset.insert(&"zeta"); // may either panic or insert any item into subset
Above Subset
is an example of an updatable retrieval data structure with a 1-bit payload.
It can be generalized by replacing the bitmap with a vector of other payload.