//! A concurrent hash table which uses leapfrog probing. This map is also lock-free
//! **if** the key and value support atomic operations. If not, an efficient spin-lock
//! is used instead.
//!
//! All operations on a [`LeapMap`] can be used concurrently with each other, and
//! the map is both fast and scalable up to many threads. The interface for the map
//! has been kept as close as possible to `std::collections::HashMap`; however,
//! there are some differences which ensure that concurrent operations are always
//! correct without sacrificing performance.
//!
//! The biggest difference between the interface of the [`LeapMap`] and
//! `std::collections::HashMap` is the types returned by `get` and `get_mut`.
//! For the [`LeapMap`], these methods return [`Ref`] and [`RefMut`] types,
//! respectively, whose interfaces are designed to allow the cells returned by
//! those methods to be accessed concurrently and correctly. As a result, it is
//! not possible to get a reference to a cell's value, since the cell's value
//! type is atomic. The cell's key, value, and key-value pair can be accessed
//! with the `key`, `value`, and `key_value` methods, respectively, on both
//! [`Ref`] and [`RefMut`]. For the [`RefMut`] type, the value can additionally
//! be modified with `set_value` or `update`. These interfaces ensure that the
//! most up-to-date value is loaded/stored/updated, and that if the referenced
//! cell is invalidated/updated/erased by another thread, the [`Ref`]/[`RefMut`]
//! is updated appropriately. This makes working concurrently with the map
//! simple, without sacrificing performance.
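//!
//! For example, a brief sketch of this interface (see the [`Ref`] and
//! [`RefMut`] documentation for the exact signatures):
//!
//! ```rust
//! use leapfrog::LeapMap;
//!
//! let map = LeapMap::new();
//! map.insert(16u64, 25u64);
//!
//! // `get_mut` returns a `RefMut`: `update` applies a closure to the cell's
//! // current value, so the modification is always based on the latest value.
//! if let Some(mut r) = map.get_mut(&16u64) {
//!     r.update(|v| *v += 5);
//! }
//!
//! // `get` returns a `Ref`: `value` loads the most recent value, and returns
//! // `None` if the cell has since been erased by another thread.
//! if let Some(mut r) = map.get(&16u64) {
//!     assert_eq!(r.value(), Some(30u64));
//! }
//! ```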
//!
//! The map uses [leapfrog probing](https://preshing.com/20160314/leapfrog-probing/),
//! a variation of [hopscotch hashing](https://en.wikipedia.org/wiki/Hopscotch_hashing)
//! on which it is based. Leapfrog probing stores two offsets per
//! cell, which are used to traverse a local neighbourhood of values
//! around the cell to which a key hashes. This makes the map operations cache
//! friendly and scalable, even under heavy contention. Keys are also stored,
//! and to keep the [`LeapMap`] efficient the hash of each key is stored as
//! well, which adds 8 bytes of overhead per key-value pair. This can very likely
//! be improved, but this map is still in its early stages. The [`LeapMap`] is
//! designed for performance so if memory efficiency is more important, a different
//! map is likely a better choice.
//!
//! # HashMap
//!
//! This crate also provides [`HashMap`], which is a fast single-threaded
//! version of the concurrent lock-free hash map, whose performance is roughly
//! 1.2-1.5x that of the hash map implementation in the standard library, when
//! using the same hash function. The tradeoff currently made to enable this performance
//! is increased memory use, since the map stores the offsets (8 bytes) and the
//! hashed value for a key (8 bytes). This may make the map unsuitable for some
//! use cases.
//!
//! # Hash Functions
//!
//! By default, the default hash function from the standard library is used for
//! both the [`HashMap`] and the [`LeapMap`], since it is DOS resistant. It is,
//! however, more expensive to evaluate. The [`MurmurHasher`] can be used instead,
//! which is quite a bit faster but which is **not** DOS resistant. Additionally, the
//! default initialization of the [`HashMap`] and [`LeapMap`] data requires that
//! 0 not be used as a key for the map *if the [`MurmurHasher`] is used*. With the
//! default hasher this limitation does not apply. Any other hash function can
//! be used by creating the maps with `with_hasher` or `with_capacity_and_hasher`.
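//!
//! For example, a sketch of creating a [`LeapMap`] which uses the
//! [`MurmurHasher`] (remember that 0 may then not be used as a key):
//!
//! ```rust
//! use std::hash::BuildHasherDefault;
//!
//! use leapfrog::{LeapMap, MurmurHasher};
//!
//! // `BuildHasherDefault` turns any `Hasher + Default` into a `BuildHasher`.
//! let map: LeapMap<u64, u64, BuildHasherDefault<MurmurHasher>> =
//!     LeapMap::with_capacity_and_hasher(16, BuildHasherDefault::default());
//!
//! map.insert(1u64, 11u64);
//! ```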
//!
//! # Performance
//!
//! This map has been benchmarked against other hash maps across multiple threads
//! and multiple workloads. The benchmarks can be found [here](https://github.com/robclu/conc-map-bench).
//! While these benchmarks show good performance, **please take note of the
//! limitations described above**, and benchmark for your own use case. Please
//! also create an issue if there are either correctness or performance problems
//! for a specific use case. A snapshot of the results for a read-heavy workload
//! on 16 cores is the following (throughput in Mops/second, core counts in
//! brackets, and performance relative to `std::collections::HashMap` behind an
//! `RwLock`):
//!
//! | Map | Throughput (1) | Relative (1) | Throughput (16) | Relative (16) |
//! |------------------|----------------|--------------|-----------------|---------------|
//! | RwLock + HashMap | 19.4 | 1.0 | 11.7 | 0.60 |
//! | Flurry | 11.2 | 0.58 | 80.8 | 4.16 |
//! | DashMap | 14.1 | 0.72 | 87.5 | 4.51 |
//! | LeapMap | 17.8 | 0.92 | 148.0 | 7.62 |
//!
//! On a 12-core M2 Max, the results were the following:
//!
//! | Map | Throughput (1) | Relative (1) | Throughput (12) | Relative (12) |
//! |------------------|----------------|--------------|-----------------|---------------|
//! | RwLock + HashMap | 21.2 | 1.0 | 3.4 | 0.16 |
//! | Flurry | 8.0 | 0.37 | 64.5 | 3.04 |
//! | DashMap | 10.4 | 0.49 | 42.2 | 2.00 |
//! | Evmap | 13.5 | 0.64 | 18.7 | 0.88 |
//! | LeapMap | 13.9 | 0.66 | 106.1 | 5.00 |
//!
//! On an exchange-heavy benchmark, the [`LeapMap`] was faster still.
//!
//! The performance of the [`LeapMap`] is limited when rapidly growing the map,
//! since the bottleneck then becomes the resizing (allocation) and migration
//! operations. The map **is not** designed to be resized often (infrequent
//! resizing has very little impact on performance, however), so it's best to
//! use an initial capacity which is close to the expected maximum number of
//! elements for the map. The growing performance is slightly worse than
//! DashMap's (see the benchmarks). If resizing is frequent, this map is
//! currently not the best choice; however, this can be improved by using a more
//! efficient allocator.
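//!
//! For example, if on the order of a million entries are expected, it is better
//! to reserve the capacity up front than to grow repeatedly from a small table
//! (a sketch):
//!
//! ```rust
//! use leapfrog::LeapMap;
//!
//! // A power-of-two capacity, matching the map's internal requirement, so the
//! // map can hold on the order of 2^20 elements before it needs to grow.
//! let map: LeapMap<u64, u64> = LeapMap::with_capacity(1 << 20);
//! map.insert(42u64, 7u64);
//! ```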
//!
//! # Consistency
//!
//! All operations on a [`LeapMap`] are non-blocking, and accessing/updating
//! a value in the map will not lock **if the value type has built-in atomic
//! support**. The map can be iterated (mutably or immutably) from multiple
//! threads while concurrently calling the operations directly on the map from
//! other threads.
//!
//! # Limitations
//!
//! Getting the length of the map returns an estimate of the length rather than
//! an exact value, unless the calling thread is the only thread operating on
//! the map, and there are therefore no other readers or writers. Additionally,
//! the length is expensive to compute, since it is not tracked as insertions
//! and removals are performed: the overhead of doing so under contention was
//! found to be significant during benchmarking, and given that the result is
//! only an estimate, it's not worth the cost.
//!
//! The same applies for `is_empty`.
//!
//! The size of the map must always be a power of two. This is handled internally
//! and may change in the future to improve memory efficiency.
//!
//! The value type for the map needs to implement the [`Value`] trait, which is
//! simple enough to implement; however, two values need to be chosen: one as a
//! null value and another as a redirect value. For integer types, MAX and
//! MAX - 1 are used, respectively. Values which are cheap to compare are a
//! good choice.
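//!
//! For example, a sketch of an implementation for a custom 64-bit value type,
//! mirroring the integer implementations (MAX as null, MAX - 1 as redirect;
//! the type name here is purely illustrative):
//!
//! ```rust
//! use leapfrog::Value;
//!
//! #[derive(Debug, Clone, Copy, PartialEq)]
//! struct Micros(u64);
//!
//! impl Value for Micros {
//!     // `u64::MAX` is reserved as the null (empty) value ...
//!     fn null() -> Self {
//!         Micros(u64::MAX)
//!     }
//!     // ... and `u64::MAX - 1` as the redirect value, so neither may be
//!     // stored in the map as an actual value.
//!     fn redirect() -> Self {
//!         Micros(u64::MAX - 1)
//!     }
//!     fn is_null(&self) -> bool {
//!         *self == Self::null()
//!     }
//!     fn is_redirect(&self) -> bool {
//!         *self == Self::redirect()
//!     }
//! }
//! ```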
//!
//! # Resizing
//!
//! The buckets for the map are expanded when an insertion would result in an
//! offset which would overflow the maximum probe offset (i.e., when it's not
//! possible to find an empty cell within the probe neighbourhood), or shrunk
//! when probes are too far apart. The resizing, and the subsequent migration of
//! buckets from the old table to the new table, happens concurrently when
//! multiple threads are attempting to modify the map, which makes the resizing
//! efficient. Despite this, it is best to choose a larger size where possible,
//! since this ensures more stable performance of map operations. When the table
//! is resized, the old memory is cleaned up using what is essentially a
//! quiescent-state-based reclamation strategy.
//!
//! # Features
//!
//! There is optional support for serde, via the "serde" feature.
//!
//! # Usage with stable
//!
//! This crate requires the `allocator_api` feature, which is only available on nightly.
//! To enable use of the crate with the stable toolchain, the `"stable_alloc"` feature has
//! been added.
//!
//! If/when the `allocator_api` feature is no longer experimental, this feature flag will
//! be removed.
use core::hash::{BuildHasher, Hash, Hasher};

// NOTE: path assumed; `load_u64_le` is expected to live in a crate-internal
// `util` module.
use crate::util::load_u64_le;
// Re-exports of the public map types (module paths assumed).
pub use hashmap::HashMap;
pub use leapmap::{LeapMap, Ref, RefMut};
/// Trait which represents a value which can be stored in a map.
pub trait Value: core::fmt::Debug + Sized + Clone + Copy + PartialEq {
    /// Returns true if the value is the redirect value.
    fn is_redirect(&self) -> bool;
    /// Returns true if the value is the null value.
    fn is_null(&self) -> bool;
    /// Returns the redirect value for the type.
    fn redirect() -> Self;
    /// Returns the null value for the type.
    fn null() -> Self;
}

// Integer implementations: MAX is the null value and MAX - 1 the redirect
// value, as documented above. NOTE: the macro arguments are assumed.
value_impl!(u8, u8::MAX, u8::MAX - 1);
value_impl!(u16, u16::MAX, u16::MAX - 1);
value_impl!(u32, u32::MAX, u32::MAX - 1);
value_impl!(u64, u64::MAX, u64::MAX - 1);
value_impl!(usize, usize::MAX, usize::MAX - 1);
value_impl!(i8, i8::MAX, i8::MAX - 1);
value_impl!(i16, i16::MAX, i16::MAX - 1);
value_impl!(i32, i32::MAX, i32::MAX - 1);
value_impl!(i64, i64::MAX, i64::MAX - 1);
/// Creates a hash value from the `hash_builder` and `value`.
pub fn make_hash<S: core::hash::BuildHasher, V: core::hash::Hash + ?Sized>(
    hash_builder: &S,
    value: &V,
) -> u64 {
    use core::hash::Hasher;
    // Build a fresh hasher, feed the value through it, and return the result.
    let mut hasher = hash_builder.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}
/// Implementation of a hasher which hashes using murmur.
// NOTE: a single `u64` of internal state is assumed; the `Hasher` impl is
// elided here.
pub struct MurmurHasher(u64);
/// Implementation of a hasher which hashes using FNV (Fowler-Noll-Vo).
// NOTE: as above, a single `u64` of internal state is assumed.
pub struct FnvHasher(u64);
/// This is not really a hasher, but rather a mock hasher which simply returns
/// the value to be hashed. If keys are known to be unique and already well
/// distributed, this can be useful.
// NOTE: the struct name is assumed; the original declaration was garbled.
pub struct IdentityHasher(u64);