pub struct CachedHash<T: Eq + Hash, BH: BuildHasher = BuildHasherDefault<DefaultHasher>> { /* private fields */ }Expand description
For a type T, CachedHash wraps T and implements Hash in a way that
caches T’s hash value. The first time the hash is computed, it is stored
and returned on subsequent calls. When the stored value is accessed mutably
the hash is invalidated and needs to be recomputed again. CachedHash implements
Deref and DerefMut so it can be used as a drop-in replacement for T.
In order for the hash to be invalidated correctly the stored type cannot use
interior mutability in a way that affects the hash. If this is the case, you
can use CachedHash::invalidate_hash to invalidate the hash manually.
By default the internal hash is computed using DefaultHasher. You can
change this by providing a custom Hasher to CachedHash::new_with_hasher or
even a custom BuildHasher to CachedHash::new_with_build_hasher. For most use
cases you should not need to do this.
Note that the hash of a value of type T and the same value wrapped in
CachedHash are generally different.
§Why is this useful?
Sometimes you have a type that is expensive to hash (for example because)
it is very large) but you need to store and move it between multiple
HashSets
In this case you can wrap the type in CachedHash to cache the hash value
only once.
However, when the type is modified often CachedHash loses its advantage
as the hash will get invalidated on every modification. CachedHash also
needs to store the u64 hash value which takes up some space.
You can run cargo bench to see some simple naive benchmarks comparing
a plain HashSet with a HashSet that stores values wrapped in CachedHash.
§Details
Whenever the hash is requested if it is not already computed it is computed
using the hasher provided by the BH BuildHasher and stored as an “internal
hash”. Note that this is not the same as the hash returned by the Hash implementation.
That implementation feeds the internal hash into the hasher provided to the hash
function. This means that the resulting hash is the hash of the hash of the stored
value.
However, there is one more issue. If we wanted to represent both the full range of hash values and the possibility of the hash not being computed yet, we would need 65 bits. In order to save space we need to reserve one value as a sentinel (this also lets us work with the stored “maybe-hash” atomically). This means that we need to artificially create a hash collision. Current implementation does this by changing the “internal hash” from 0 to 1 if it ends up being zero. This is generally not an issue. However, if you are using a custom hasher this might affect you.
This behaviour is not guaranteed and may change in the future. If this behaviour does not fit for your use case please open an issue.
Implementations§
Source§impl<T: Eq + Hash> CachedHash<T>
impl<T: Eq + Hash> CachedHash<T>
Sourcepub fn new(value: T) -> Self
pub fn new(value: T) -> Self
Creates a new CachedHash with the given value using DefaultHasher.
Note that the BuildHasher stored in the structure is a zero-sized type
that is both Send and Sync so it will not affect the Send and Sync
properties of CachedHash nor its size.
Source§impl<T: Eq + Hash, H: Hasher + Default> CachedHash<T, BuildHasherDefault<H>>
impl<T: Eq + Hash, H: Hasher + Default> CachedHash<T, BuildHasherDefault<H>>
Sourcepub fn new_with_hasher(value: T) -> Self
pub fn new_with_hasher(value: T) -> Self
Creates a new CachedHash with the given value using a provided hasher type implementing Default.
Note that the BuildHasher stored in the structure is a zero-sized type
that is both Send and Sync so it will not affect the Send and Sync
properties of CachedHash nor its size.
Source§impl<T: Eq + Hash, BH: BuildHasher> CachedHash<T, BH>
impl<T: Eq + Hash, BH: BuildHasher> CachedHash<T, BH>
Sourcepub const fn new_with_build_hasher(value: T, build_hasher: BH) -> Self
pub const fn new_with_build_hasher(value: T, build_hasher: BH) -> Self
Creates a new CachedHash with the given value and BuildHasher.
Note that build_hasher is stored in the structure and as such it can
cause the type to stop being Send and Sync if the hasher is not.
It can also increase the size of the structure.
Sourcepub fn invalidate_hash(this: &mut Self)
pub fn invalidate_hash(this: &mut Self)
Explicitly invalidates the cached hash. This should not be necessary in most cases as the hash will be automatically invalidated when the value is accessed mutably. However, if the value uses interior mutability in a way that affects the hash, you will need to call this function manually whenever the hash might have changed.
Sourcepub fn take_value(this: Self) -> T
pub fn take_value(this: Self) -> T
Destructs the CachedHash and returns the stored value.
Sourcepub const fn get(this: &Self) -> &T
pub const fn get(this: &Self) -> &T
Explicitly returns an immutable reference to the stored value.
Most of the time this will not be necessary as CachedHash
implements Deref so autoderef rules will automatically dereference
it to the stored value.
Sourcepub fn get_mut(this: &mut Self) -> &mut T
pub fn get_mut(this: &mut Self) -> &mut T
Explicitly returns a mutable reference to the stored value and invalidates the cached hash.
Most of the time this will not be necessary as CachedHash implements
DerefMut so autoderef rules will automatically dereference it to the
stored value. (Such dereference still invalidates the stored hash.)