1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
//! Lock-free hash tables.
//!
//! moka-cht provides a lock-free hash table called [`HashMap`][hm-struct] that
//! supports fully concurrent lookups, insertions, modifications, and deletions.
//! The table may also be concurrently resized to allow more elements to be
//! inserted.
//!
//! moka-cht also provides the [`SegmentedHashMap`][shm-struct] using the same
//! lock-free algorithm for increased concurrent write performance.
//!
//! [hm-struct]: ./map/struct.HashMap.html
//! [shm-struct]: ./segment/map/struct.HashMap.html
//!
//!
//! ## Implementation Details
//!
//! The hash tables in this crate are, at their core, open addressing hash tables
//! implemented using open addressing and boxed buckets. The core of these hash
//! tables are bucket arrays, which consist of a vector of atomic pointers to
//! buckets, an atomic pointer to the next bucket array, and an epoch number. In the
//! context of this crate, an atomic pointer is a nullable pointer that is accessed
//! and manipulated using atomic memory operations. Each bucket consists of a key and
//! a possibly-uninitialized value.
//!
//! The key insight into making the hash table resizable is to incrementally copy
//! buckets from the old bucket array to the new bucket array. As buckets are copied
//! between bucket arrays, their pointers in the old bucket array are mutated by
//! using an atomic compare-and-swap (CAS) operation to have a null pointer with a
//! sentinel bit set. If the CAS fails, that thread must read the bucket pointer
//! again and retry copying it into the new bucket array.
//!
//! If at any time a thread reads a bucket pointer with the sentinel bit set, that
//! thread knows that a new (larger) bucket array has been allocated. That thread
//! will then immediately attempt to copy all buckets to the new bucket array. It is
//! possible to implement an algorithm in which a subset of buckets are relocated
//! per-thread; such an algorithm has not been implemented for the sake of
//! simplicity.
//!
//! Bucket pointers that have been copied from an old bucket array into a new bucket
//! array are marked with a borrowed bit. If a thread copies a bucket from an old
//! bucket array into a new bucket array, fails to CAS the bucket pointer in the old
//! bucket array, it attempts to CAS the bucket pointer in the new bucket array that
//! it previously inserted to. If the bucket pointer in the new bucket array does
//! *not* have the borrowed tag bit set, that thread knows that the value in the new
//! bucket array was modified more recently than the value in the old bucket
//! array. To avoid discarding updates to the new bucket array, a thread will never
//! replace a bucket pointer that has the borrowed tag bit set with one that does
//! not. To see why this is necessary, consider the case where a bucket pointer is
//! copied into the new array, removed from the new array by a second thread, then
//! copied into the new array again by a third thread.
//!
//! Mutating operations are, at their core, an atomic compare-and-swap (CAS) on a
//! bucket pointer. Insertions CAS null pointers and bucket pointers with matching
//! keys, modifications CAS bucket pointers with matching keys, and removals CAS
//! non-tombstone bucket pointers. Tombstone bucket pointers are bucket pointers with
//! a tombstone bit set as part of a removal; this indicates that the bucket's value
//! has been moved from and will be destroyed if it has not been already.
//!
//! As previously mentioned, removing an entry from the hash table results in that
//! bucket pointer having a tombstone bit set. Insertions cannot displace a tombstone
//! bucket unless their key compares equal, so once an entry is inserted into the
//! hash table, the specific index it is assigned to will only ever hold entries
//! whose keys compare equal. Without this restriction, resizing operations could
//! result in the old and new bucket arrays being temporarily inconsistent. Consider
//! the case where one thread, as part of a resizing operation, copies a bucket into
//! a new bucket array while another thread removes and replaces that bucket from the
//! old bucket array. If the new bucket has a non-matching key, what happens to the
//! bucket that was just copied into the new bucket array?
//!
//! Tombstone bucket pointers are typically not copied into new bucket arrays.
//! The exception is the case where a bucket pointer was copied to the new bucket
//! array, then CAS on the old bucket array fails because that bucket has been
//! replaced with a tombstone. In this case, the tombstone bucket pointer will be
//! copied over to reflect the update without displacing a key from its bucket.
//!
//! This hash table algorithm was inspired by [a blog post by Jeff Preshing]
//! that describes the implementation of the Linear hash table in [Junction], a C++
//! library of concurrent data structures. Additional inspiration was drawn from the
//! lock-free hash table described by Cliff Click in [a tech talk] given at Google
//! in 2007.
//!
//! [a blog post by Jeff Preshing]: https://preshing.com/20160222/a-resizable-concurrent-map/
//! [Junction]: https://github.com/preshing/junction
//! [a tech talk]: https://youtu.be/HJ-719EGIts

pub mod map;
pub mod segment;

#[cfg(test)]
#[macro_use]
pub(crate) mod test_util;

pub use map::HashMap;
pub use segment::HashMap as SegmentedHashMap;