left_right/lib.rs
//! A concurrency primitive for high concurrency reads over a single-writer data structure.
//!
//! The primitive keeps two copies of the backing data structure, one that is accessed by readers,
//! and one that is accessed by the (single) writer. This enables all reads to proceed in parallel
//! with minimal coordination, and shifts the coordination overhead to the writer. In the absence
//! of writes, reads scale linearly with the number of cores.
//!
//! When the writer wishes to expose new changes to the datastructure (see
//! [`WriteHandle::publish`]), it "flips" the two copies so that subsequent reads go to the old
//! "write side", and future writes go to the old "read side". This process does cause two cache
//! line invalidations for the readers, but does not stop them from making progress (i.e., reads
//! are wait-free).
//!
//! In order to keep both copies up to date, left-right keeps an operational log ("oplog") of all
//! the modifications to the data structure, which it uses to bring the old read data up to date
//! with the latest writes on a flip. Since there are two copies of the data, each oplog entry is
//! applied twice: once to the write copy and again to the (stale) read copy.
//!
//! # Trade-offs
//!
//! Few concurrency wins come for free, and this one is no exception. The drawbacks of this
//! primitive are:
//!
//! - **Increased memory use**: since we keep two copies of the backing data structure, we are
//!   effectively doubling the memory use of the underlying data. With some clever de-duplication,
//!   this cost can be ameliorated to some degree, but it's something to be aware of. Furthermore,
//!   if writers only call `publish` infrequently despite adding many writes to the operational
//!   log, the operational log itself may grow quite large, which adds additional overhead.
//! - **Deterministic operations**: as the entries in the operational log are applied twice, once
//!   to each copy of the data, it is essential that the operations are deterministic. If they are
//!   not, the two copies will no longer mirror one another, and will continue to diverge over
//!   time.
//! - **Single writer**: left-right only supports a single writer. To have multiple writers, you
//!   need to ensure exclusive access to the [`WriteHandle`] through something like a
//!   [`Mutex`](std::sync::Mutex) (see the sketch after this list).
//! - **Slow writes**: Writes through left-right are slower than they would be directly against
//!   the backing datastructure. This is both because they have to go through the operational log,
//!   and because they must each be applied twice.
//!
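//! For example, here is a minimal sketch (using an illustrative `AddOp` oplog type that is not
//! part of this crate) of how the single `WriteHandle` could be shared between threads by
//! wrapping it in a `Mutex`:
//!
//! ```rust
//! # use left_right::Absorb;
//! # use std::sync::{Arc, Mutex};
//! # struct AddOp(i32);
//! # impl Absorb<AddOp> for i32 {
//! #     fn absorb_first(&mut self, op: &mut AddOp, _: &Self) { *self += op.0; }
//! #     fn sync_with(&mut self, first: &Self) { *self = *first; }
//! # }
//! let (write, read) = left_right::new::<i32, AddOp>();
//!
//! // Exclusive access to the single writer is enforced by the `Mutex`; each clone of the
//! // `Arc` can live on its own thread.
//! let write = Arc::new(Mutex::new(write));
//!
//! let writer = Arc::clone(&write);
//! std::thread::spawn(move || {
//!     let mut w = writer.lock().unwrap();
//!     w.append(AddOp(1));
//!     w.publish();
//! })
//! .join()
//! .unwrap();
//!
//! assert_eq!(read.enter().map(|guard| *guard), Some(1));
//! ```
//!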
//! # How does it work?
//!
//! Take a look at [this YouTube video](https://www.youtube.com/watch?v=eLNAMEoKAAc) which goes
//! through the basic concurrency algorithm, as well as the initial development of this library.
//! Alternatively, there's a shorter (but also less complete) description in [this
//! talk](https://www.youtube.com/watch?v=s19G6n0UjsM&t=1994).
//!
//! At a glance, left-right is implemented using two regular `T`s, an operational log, epoch
//! counting, and some pointer magic. There is a single pointer through which all readers go. It
//! points to a `T` that the readers access in order to read data. Each reader increments a local
//! epoch counter when it goes through that pointer to start a read, and increments it again once
//! the read is finished. When a write occurs, the writer updates the other `T` (for which there
//! are no readers), and also stores a copy of the change in a log. When [`WriteHandle::publish`]
//! is called, the writer atomically swaps the reader pointer to point to the other `T`. It then
//! waits for the epochs of all current readers to change, and finally replays the operational log
//! to bring the stale copy up to date.
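//!
//! The following is a heavily simplified, single-threaded model of that protocol, included here
//! purely for illustration (it is not this crate's actual implementation):
//!
//! ```rust
//! use std::sync::atomic::{AtomicUsize, Ordering};
//!
//! let copies = [AtomicUsize::new(0), AtomicUsize::new(0)]; // the two `T`s (plain counters here)
//! let read_idx = AtomicUsize::new(0); // the "pointer" all readers go through
//! let epoch = AtomicUsize::new(0); // one counter per reader; odd while a read is in progress
//!
//! // A read: announce the read, follow the pointer, then announce that the read is done.
//! epoch.fetch_add(1, Ordering::AcqRel);
//! let seen = copies[read_idx.load(Ordering::Acquire)].load(Ordering::Acquire);
//! epoch.fetch_add(1, Ordering::AcqRel);
//! assert_eq!(seen, 0);
//!
//! // A write followed by a publish: update the copy that readers are not using, flip the
//! // pointer, then wait until every reader's epoch is even (or has changed) before touching
//! // the copy that the readers just left behind.
//! copies[1 - read_idx.load(Ordering::Acquire)].store(1, Ordering::Release);
//! read_idx.fetch_xor(1, Ordering::AcqRel);
//! while epoch.load(Ordering::Acquire) % 2 != 0 {
//!     std::hint::spin_loop();
//! }
//! ```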
//!
//! The design resembles this [left-right concurrency
//! scheme](https://hal.archives-ouvertes.fr/hal-01207881/document) from 2015, though I am not
//! aware of any follow-up to that work.
//!
//! # How do I use it?
//!
//! If you just want a data structure for fast reads, you likely want to use a crate that _uses_
//! this crate, like [`evmap`](https://docs.rs/evmap/). If you want to develop such a crate
//! yourself, here's what you do:
//!
//! ```rust
//! use left_right::{Absorb, ReadHandle, WriteHandle};
//!
//! // First, define an operational log type.
//! // For most real-world use-cases, this will be an `enum`, but we'll keep it simple:
//! struct CounterAddOp(i32);
//!
//! // Then, implement the `Absorb` trait for your data structure type,
//! // and provide the oplog type as the generic argument.
//! // You can read this as "`i32` can absorb changes of type `CounterAddOp`".
//! impl Absorb<CounterAddOp> for i32 {
//!     // See the documentation of `Absorb::absorb_first`.
//!     //
//!     // Essentially, this is where you define what applying
//!     // the oplog type to the datastructure does.
//!     fn absorb_first(&mut self, operation: &mut CounterAddOp, _: &Self) {
//!         *self += operation.0;
//!     }
//!
//!     // See the documentation of `Absorb::absorb_second`.
//!     //
//!     // This may or may not be the same as `absorb_first`,
//!     // depending on whether or not you de-duplicate values
//!     // across the two copies of your data structure.
//!     fn absorb_second(&mut self, operation: CounterAddOp, _: &Self) {
//!         *self += operation.0;
//!     }
//!
//!     // See the documentation of `Absorb::drop_first`.
//!     fn drop_first(self: Box<Self>) {}
//!
//!     fn sync_with(&mut self, first: &Self) {
//!         *self = *first
//!     }
//! }
//!
//! // Now, you can construct a new left-right over an instance of your data structure.
//! // This will give you a `WriteHandle` that accepts writes in the form of oplog entries,
//! // and a (cloneable) `ReadHandle` that gives you `&` access to the data structure.
//! let (write, read) = left_right::new::<i32, CounterAddOp>();
//!
//! // You will likely want to embed these handles in your own types so that you can
//! // provide more ergonomic methods for performing operations on your type.
//! struct Counter(WriteHandle<i32, CounterAddOp>);
//! impl Counter {
//!     // The methods on your write handle type will likely all just add to the operational log.
//!     pub fn add(&mut self, i: i32) {
//!         self.0.append(CounterAddOp(i));
//!     }
//!
//!     // You should also provide a method for exposing the results of any pending operations.
//!     //
//!     // Until this is called, any writes made since the last call to `publish` will not be
//!     // visible to readers. See `WriteHandle::publish` for more details. Make sure to call
//!     // this out in _your_ documentation as well, so that your users will be aware of this
//!     // "weird" behavior.
//!     pub fn publish(&mut self) {
//!         self.0.publish();
//!     }
//! }
//!
//! // Similarly, for reads:
//! #[derive(Clone)]
//! struct CountReader(ReadHandle<i32>);
//! impl CountReader {
//!     pub fn get(&self) -> i32 {
//!         // The `ReadHandle` itself does not allow you to access the underlying data.
//!         // Instead, you must first "enter" the data structure. This is similar to
//!         // taking a `Mutex`, except that no lock is actually taken. When you enter,
//!         // you are given back a guard, which gives you shared access (through the
//!         // `Deref` trait) to the "read copy" of the data structure.
//!         //
//!         // Note that `enter` may yield `None`, which implies that the `WriteHandle`
//!         // was dropped, and took the backing data down with it.
//!         //
//!         // Note also that for as long as the guard lives, a writer that tries to
//!         // call `WriteHandle::publish` will be blocked from making progress.
//!         self.0.enter().map(|guard| *guard).unwrap_or(0)
//!     }
//! }
//!
//! // These wrapper types are likely what you'll give out to your consumers.
//! let (mut w, r) = (Counter(write), CountReader(read));
//!
//! // They can then use the type fairly ergonomically:
//! assert_eq!(r.get(), 0);
//! w.add(1);
//! // no call to publish, so read side remains the same:
//! assert_eq!(r.get(), 0);
//! w.publish();
//! assert_eq!(r.get(), 1);
//! drop(w);
//! // writer dropped data, so reads yield fallback value:
//! assert_eq!(r.get(), 0);
//! ```
//!
//! One additional noteworthy detail: much like with `Mutex`, `RwLock`, and `RefCell` from the
//! standard library, the values you dereference out of a `ReadGuard` are tied to the lifetime of
//! that `ReadGuard`. This can make it awkward to write ergonomic methods on the read handle that
//! return references into the underlying data, and may tempt you to clone the data out or take a
//! closure. Instead, consider using [`ReadGuard::map`] and [`ReadGuard::try_map`], which (like
//! `RefCell`'s [`Ref::map`](std::cell::Ref::map)) allow you to provide a guarded reference deeper
//! into your data structure.
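//!
//! For example, here is a minimal sketch of that pattern; the `Data`, `PushOp`, and `items` names
//! are illustrative and not part of this crate:
//!
//! ```rust
//! # use left_right::{Absorb, ReadGuard};
//! # struct PushOp(i32);
//! # #[derive(Clone, Default)]
//! # struct Data { items: Vec<i32> }
//! # impl Absorb<PushOp> for Data {
//! #     fn absorb_first(&mut self, op: &mut PushOp, _: &Self) { self.items.push(op.0); }
//! #     fn sync_with(&mut self, first: &Self) { *self = first.clone(); }
//! # }
//! # let (mut w, r) = left_right::new::<Data, PushOp>();
//! # w.append(PushOp(7));
//! # w.publish();
//! // Hand out a guard into part of the data structure instead of cloning the data out.
//! let items: Option<ReadGuard<'_, Vec<i32>>> =
//!     r.enter().map(|guard| ReadGuard::map(guard, |data| &data.items));
//! assert_eq!(items.map(|items| items.len()), Some(1));
//! ```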
#![warn(
    missing_docs,
    rust_2018_idioms,
    missing_debug_implementations,
    broken_intra_doc_links
)]
#![allow(clippy::type_complexity)]

mod sync;

use crate::sync::{Arc, AtomicUsize, Mutex};

type Epochs = Arc<Mutex<slab::Slab<Arc<AtomicUsize>>>>;

mod write;
pub use crate::write::Taken;
pub use crate::write::WriteHandle;

mod read;
pub use crate::read::{ReadGuard, ReadHandle, ReadHandleFactory};

pub mod aliasing;

/// Types that can incorporate operations of type `O`.
///
/// This trait allows `left-right` to keep the two copies of the underlying data structure (see the
/// [crate-level documentation](crate)) the same over time. Each write operation to the data
/// structure is logged as an operation of type `O` in an _operational log_ (oplog), and is applied
/// once to each copy of the data.
///
/// Implementations should ensure that the absorption of each `O` is deterministic. That is, if
/// two instances of the implementing type are initially equal, and then absorb the same `O`,
/// they should remain equal afterwards. If this is not the case, the two copies will drift apart
/// over time, and hold different values.
///
/// The trait provides separate methods for the first and second absorption of each `O`. For many
/// implementations, these will be the same (which is why `absorb_second` defaults to calling
/// `absorb_first`), but not all. In particular, some implementations may need to modify the `O` to
/// ensure deterministic results when it is applied to the second copy. Or, they may need to
/// ensure that removed values in the data structure are only dropped when they are removed from
/// _both_ copies, in case they alias the backing data to save memory.
///
/// For the same reason, `Absorb` allows implementors to define `drop_first`, which is used to drop
/// the first of the two copies. In this case, de-duplicating implementations may need to forget
/// values rather than drop them so that they are not dropped twice when the second copy is
/// dropped.
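///
/// # Example
///
/// A minimal sketch (the `VecOp` oplog type here is illustrative, not part of this crate) of a
/// deterministic implementation where `absorb_second` can simply fall back to `absorb_first`:
///
/// ```rust
/// use left_right::Absorb;
///
/// enum VecOp {
///     Push(i32),
///     Clear,
/// }
///
/// impl Absorb<VecOp> for Vec<i32> {
///     fn absorb_first(&mut self, operation: &mut VecOp, _other: &Self) {
///         match operation {
///             VecOp::Push(i) => self.push(*i),
///             VecOp::Clear => self.clear(),
///         }
///     }
///
///     // `absorb_second` defaults to `absorb_first`, which is fine here: the operations are
///     // deterministic and nothing is aliased between the two copies.
///
///     fn sync_with(&mut self, first: &Self) {
///         self.clone_from(first)
///     }
/// }
///
/// let (mut w, r) = left_right::new::<Vec<i32>, VecOp>();
/// w.append(VecOp::Push(42));
/// w.publish();
/// assert_eq!(r.enter().map(|guard| guard.len()), Some(1));
/// ```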
pub trait Absorb<O> {
    /// Apply `O` to the first of the two copies.
    ///
    /// `other` is a reference to the other copy of the data, which has seen all operations up
    /// until the previous call to [`WriteHandle::publish`]. That is, `other` is one "publish
    /// cycle" behind.
    fn absorb_first(&mut self, operation: &mut O, other: &Self);

    /// Apply `O` to the second of the two copies.
    ///
    /// `other` is a reference to the other copy of the data, which has seen all operations up to
    /// the call to [`WriteHandle::publish`] that initially exposed this `O`. That is, `other` is
    /// one "publish cycle" ahead.
    ///
    /// Note that this method should modify the underlying data in _exactly_ the same way as
    /// `O` modified `other`, otherwise the two copies will drift apart. Be particularly mindful of
    /// non-deterministic implementations of traits that are often assumed to be deterministic
    /// (like `Eq` and `Hash`), and of "hidden states" that subtly affect results like the
    /// `RandomState` of a `HashMap` which can change iteration order.
    ///
    /// Defaults to calling `absorb_first`.
    fn absorb_second(&mut self, mut operation: O, other: &Self) {
        Self::absorb_first(self, &mut operation, other)
    }

    /// Drop the first of the two copies.
    ///
    /// Defaults to calling `Self::drop`.
    #[allow(clippy::boxed_local)]
    fn drop_first(self: Box<Self>) {}

    /// Drop the second of the two copies.
    ///
    /// Defaults to calling `Self::drop`.
    #[allow(clippy::boxed_local)]
    fn drop_second(self: Box<Self>) {}

    /// Sync the data from `first` into `self`.
    ///
    /// To improve initialization performance, before the first call to `publish` changes aren't
    /// added to the internal oplog, but applied to the first copy directly using `absorb_second`.
    /// The first `publish` then calls `sync_with` instead of `absorb_second`.
    ///
    /// `sync_with` should ensure that `self`'s state exactly matches that of `first` after it
    /// returns. Be particularly mindful of non-deterministic implementations of traits that are
    /// often assumed to be deterministic (like `Eq` and `Hash`), and of "hidden states" that
    /// subtly affect results like the `RandomState` of a `HashMap` which can change iteration
    /// order.
    fn sync_with(&mut self, first: &Self);
}

/// Construct a new write and read handle pair from an empty data structure.
///
/// The type must implement `Clone` so we can construct the second copy from the first.
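///
/// # Example
///
/// A minimal sketch (the `AddOp` oplog type is illustrative, not part of this crate) of starting
/// from an existing value rather than `T::default()`:
///
/// ```rust
/// # use left_right::Absorb;
/// # struct AddOp(i32);
/// # impl Absorb<AddOp> for i32 {
/// #     fn absorb_first(&mut self, op: &mut AddOp, _: &Self) { *self += op.0; }
/// #     fn sync_with(&mut self, first: &Self) { *self = *first; }
/// # }
/// // Start from an existing value; the second copy is created by cloning it.
/// let (mut w, r) = left_right::new_from_empty::<i32, AddOp>(42);
/// w.publish();
/// assert_eq!(r.enter().map(|guard| *guard), Some(42));
/// ```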
pub fn new_from_empty<T, O>(t: T) -> (WriteHandle<T, O>, ReadHandle<T>)
where
    T: Absorb<O> + Clone,
{
    let epochs = Default::default();

    let r = ReadHandle::new(t.clone(), Arc::clone(&epochs));
    let w = WriteHandle::new(t, epochs, r.clone());
    (w, r)
}

/// Construct a new write and read handle pair from the data structure default.
///
/// The type must implement `Default` so we can construct two empty instances. You must ensure
/// that the type's `Default` implementation is deterministic and idempotent - that is to say, two
/// instances created by it must behave _exactly_ the same. An example of where this is
/// problematic is `HashMap` - due to `RandomState`, two instances returned by `Default` may have
/// a different iteration order.
///
/// If your type's `Default` implementation does not guarantee this, you can use `new_from_empty`,
/// which relies on `Clone` instead of `Default`.
pub fn new<T, O>() -> (WriteHandle<T, O>, ReadHandle<T>)
where
    T: Absorb<O> + Default,
{
    let epochs = Default::default();

    let r = ReadHandle::new(T::default(), Arc::clone(&epochs));
    let w = WriteHandle::new(T::default(), epochs, r.clone());
    (w, r)
}