1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
//! Hashing infrastructure shared by every structure.
//!
//! Probabilistic data structures live and die by the quality of their hash
//! function: poor mixing inflates the false-positive rate and skews cardinality
//! estimates. This module supplies a fast, high-quality, **deterministic**
//! default hasher and the machinery to derive the multiple index hashes each
//! structure needs from a single pass over the input.
//!
//! # Choosing a hasher
//!
//! Every structure is generic over [`core::hash::BuildHasher`] and defaults to
//! [`DefaultHashBuilder`]. The default is seeded with a fixed constant, which
//! makes hashing reproducible: two filters built with the default hasher are
//! always mergeable, and serialized state round-trips byte-for-byte across
//! runs. The trade-off is that a fixed seed offers no defence against an
//! adversary who crafts inputs to collide on purpose. When the inputs are
//! untrusted, substitute a randomly-seeded [`BuildHasher`] such as
//! [`std::collections::hash_map::RandomState`]:
//!
//! ```
//! # #[cfg(feature = "std")] {
//! use std::collections::hash_map::RandomState;
//! use bloom_lib::BloomFilter;
//!
//! let filter: BloomFilter<&str, RandomState> =
//! BloomFilter::with_hasher(1_000, 0.01, RandomState::new()).unwrap();
//! # }
//! ```
use ;
/// Fixed seed for [`DefaultHasher`]. Chosen for a good bit distribution; the
/// value itself is arbitrary but must stay constant for reproducibility.
const SEED: u64 = 0x2545_F491_4F6C_DD1D;
/// Odd multiplier used by the block mixer. Large, odd, and high-entropy.
const MIX_A: u64 = 0xFF51_AFD7_ED55_8CCD;
/// Second multiplier, applied in the finalizer to decorrelate the result.
const MIX_B: u64 = 0xC4CE_B9FE_1A85_EC53;
/// 64x64 -> 64 multiply-fold, the core mixing step (the "wymum" primitive).
///
/// Multiplying two 64-bit words yields a 128-bit product whose high and low
/// halves are then folded together with XOR. This propagates every input bit
/// across the whole output and is the engine behind several modern hashes.
/// A fast, non-cryptographic 64-bit hasher with strong avalanche behaviour.
///
/// `DefaultHasher` folds the input eight bytes at a time through a 64x64-bit
/// multiply-fold and mixes the total length into the finalizer, so inputs that
/// differ only by trailing zero bytes do not collide. It is **not** suitable for
/// authentication or any setting where collision resistance against an
/// adversary is required; use a cryptographic hash for that.
///
/// You rarely construct this directly. It is produced by
/// [`DefaultHashBuilder`], which the structures use by default.
///
/// # Examples
///
/// ```
/// use core::hash::{Hash, Hasher};
/// use bloom_lib::hash::DefaultHasher;
///
/// let mut hasher = DefaultHasher::default();
/// "probabilistic".hash(&mut hasher);
/// let digest = hasher.finish();
///
/// // Hashing the same value again yields the same digest.
/// let mut again = DefaultHasher::default();
/// "probabilistic".hash(&mut again);
/// assert_eq!(digest, again.finish());
/// ```
/// A [`BuildHasher`] that yields [`DefaultHasher`] instances seeded with a fixed
/// constant.
///
/// This is the default hasher for every structure in the crate. Because the
/// seed is fixed, the resulting hashing is deterministic: ideal for
/// reproducible builds, mergeable filters, and stable serialized state. It is a
/// zero-sized type, so embedding it in a structure costs nothing.
///
/// # Examples
///
/// ```
/// use core::hash::BuildHasher;
/// use bloom_lib::hash::DefaultHashBuilder;
///
/// let builder = DefaultHashBuilder::default();
/// let h1 = builder.hash_one("key");
/// let h2 = builder.hash_one("key");
/// assert_eq!(h1, h2);
/// ```
;
/// A pair of independent 64-bit hashes used to synthesise an arbitrary number
/// of index hashes via the Kirsch-Mitzenmacher double-hashing scheme.
///
/// The structures need `k` distinct hash positions per item but hash the item
/// only once. From the base digest `h1` and a decorrelated companion `h2`, the
/// `i`-th position is `h1 + i * h2`. This yields `k` well-distributed hashes at
/// the cost of a single hashing pass.
pub
/// Maps a 64-bit hash uniformly into `[0, range)` without a division.
///
/// This is Lemire's multiply-shift reduction: it takes the high 64 bits of
/// `hash * range`, which is both faster than `hash % range` and free of the
/// modulo bias that `%` introduces when `range` is not a power of two.
pub
/// A strong 64-bit avalanche mix, used to derive a secondary hash from a small
/// value (such as a Cuckoo fingerprint) where the input has little entropy.
pub