cht/
lib.rs

1// Copyright (C) 2021 Gregory Meyer
2//
3// This program is free software: you can redistribute it and/or modify
4// it under the terms of the GNU Affero General Public License as published by
5// the Free Software Foundation, either version 3 of the License, or
6// (at your option) any later version.
7//
8// This program is distributed in the hope that it will be useful,
9// but WITHOUT ANY WARRANTY; without even the implied warranty of
10// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
11// GNU Affero General Public License for more details.
12//
13// You should have received a copy of the GNU Affero General Public License
14// along with this program.  If not, see <https://www.gnu.org/licenses/>.
15
16//! Lockfree hash tables.
17//!
18//! The hash tables in this crate are, at their core, open addressing hash
19//! tables implemented using open addressing and boxed buckets. The core of
20//! these hash tables are bucket arrays, which consist of a vector of atomic
21//! pointers to buckets, an atomic pointer to the next bucket array, and an
22//! epoch number. In the context of this crate, an atomic pointer is a nullable
23//! pointer that is accessed and manipulated using atomic memory operations.
24//! Each bucket consists of a key and a possibly-uninitialized value.
25//!
26//! The key insight into making the hash table resizeable is to incrementally
27//! copy buckets from the old bucket array to the new bucket array. As buckets
28//! are copied between bucket arrays, their pointers in the old bucket array are
29//! CAS'd with a null pointer that has a sentinel bit set. If the CAS fails,
30//! that thread must read the bucket pointer again and retry copying it into the
31//! new bucket array. If at any time a thread reads a bucket pointer with the
32//! sentinel bit set, that thread knows that a new (larger) bucket array has
33//! been allocated. That thread will then immediately attempt to copy all
34//! buckets to the new bucket array. It is possible to implement an algorithm in
35//! which a subset of buckets are relocated per-thread; such an algorithm has
36//! not been implemented for the sake of simplicity.
37//!
38//! Bucket pointers that have been copied from an old bucket array into a new
39//! bucket array are marked with a borrowed bit. If a thread copies a bucket
40//! from an old bucket array into a new bucket array, fails to CAS the bucket
41//! pointer in the old bucket array, it attempts to CAS the bucket pointer in
42//! the new bucket array that it previously inserted to. If the bucket pointer
43//! in the new bucket array does *not* have the borrowed tag bit set, that
44//! thread knows that the value in the new bucket array was modified more
45//! recently than the value in the old bucket array. To avoid discarding updates
46//! to the new bucket array, a thread will never replace a bucket pointer that
47//! has the borrowed tag bit set with one that does not. To see why this is
48//! necessary, consider the case where a bucket pointer is copied into the new
49//! array, removed from the new array by a second thread, then copied into the
50//! new array again by a third thread.
51//!
52//! Mutating operations are, at their core, an atomic compare-and-swap (CAS) on
53//! a bucket pointer. Insertions CAS null pointers and bucket pointers with
54//! matching keys, modifications CAS bucket pointers with matching keys, and
55//! removals CAS non-tombstone bucket pointers. Tombstone bucket pointers are
56//! bucket pointers with a tombstone bit set as part of a removal; this
57//! indicates that the bucket's value has been moved from and will be destroyed
58//! if it has not beel already.
59//!
60//! As previously mentioned, removing an entry from the hash table results in
61//! that bucket pointer having a tombstone bit set. Insertions cannot
62//! displace a tombstone bucket unless their key compares equal, so once an
63//! entry is inserted into the hash table, the specific index it is assigned to
64//! will only ever hold entries whose keys compare equal. Without this
65//! restriction, resizing operations could result in the old and new bucket
66//! arrays being temporarily inconsistent. Consider the case where one thread,
67//! as part of a resizing operation, copies a bucket into a new bucket array
68//! while another thread removes and replaces that bucket from the old bucket
69//! array. If the new bucket has a non-matching key, what happens to the bucket
70//! that was just copied into the new bucket array?
71//!
72//! Tombstone bucket pointers are typically not copied into new bucket arrays.
73//! The exception is the case where a bucket pointer was copied to the new
74//! bucket array, then CAS on the old bucket array fails because that bucket has
75//! been replaced with a tombstone. In this case, the tombstone bucket pointer
76//! will be copied over to reflect the update without displacing a key from its
77//! bucket.
78//!
79//! This hash table algorithm was inspired by [a blog post by Jeff Phreshing]
80//! that describes the implementation of the Linear hash table in [Junction], a
81//! C++ library of concurrent data structrures. Additional inspiration was drawn
82//! from the lockfree hash table described by Cliff Click in [a tech talk] given
83//! at Google in 2007.
84//!
85//! [a blog post by Jeff Phreshing]: https://preshing.com/20160222/a-resizable-concurrent-map/
86//! [Junction]: https://github.com/preshing/junction
87//! [a tech talk]: https://youtu.be/HJ-719EGIts
88
89pub mod map;
90
91#[cfg(test)]
92#[macro_use]
93pub(crate) mod test_util;
94
95pub use map::HashMap;