// left_right/lib.rs
//! A concurrency primitive for high concurrency reads over a single-writer data structure.
//!
//! The primitive keeps two copies of the backing data structure, one that is accessed by readers,
//! and one that is accessed by the (single) writer. This enables all reads to proceed in parallel
//! with minimal coordination, and shifts the coordination overhead to the writer. In the absence
//! of writes, reads scale linearly with the number of cores.
//!
//! When the writer wishes to expose new changes to the data structure (see
//! [`WriteHandle::publish`]), it "flips" the two copies so that subsequent reads go to the old
//! "write side", and future writes go to the old "read side". This process does cause two cache
//! line invalidations for the readers, but does not stop them from making progress (i.e., reads
//! are wait-free).
//!
//! In order to keep both copies up to date, left-right keeps an operational log ("oplog") of all
//! the modifications to the data structure, which it uses to bring the old read data up to date
//! with the latest writes on a flip. Since there are two copies of the data, each oplog entry is
//! applied twice: once to the write copy and again to the (stale) read copy.
//!
//! # Trade-offs
//!
//! Few concurrency wins come for free, and this one is no exception. The drawbacks of this
//! primitive are:
//!
//! - **Increased memory use**: since we keep two copies of the backing data structure, we are
//!   effectively doubling the memory use of the underlying data. With some clever de-duplication,
//!   this cost can be ameliorated to some degree, but it's something to be aware of. Furthermore,
//!   if writers only call `publish` infrequently despite adding many writes to the operational log,
//!   the operational log itself may grow quite large, which adds additional overhead.
//! - **Deterministic operations**: as the entries in the operational log are applied twice, once
//!   to each copy of the data, it is essential that the operations are deterministic. If they are
//!   not, the two copies will no longer mirror one another, and will continue to diverge over time.
//! - **Single writer**: left-right only supports a single writer. To have multiple writers, you
//!   need to ensure exclusive access to the [`WriteHandle`] through something like a
//!   [`Mutex`](std::sync::Mutex).
//! - **Slow writes**: Writes through left-right are slower than they would be directly against
//!   the backing data structure. This is both because they have to go through the operational log,
//!   and because they must each be applied twice.
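//!
//! The determinism requirement can be illustrated with a small, self-contained sketch. It does
//! not use this crate's API: `Op`, `apply`, and `replay_on_both` are hypothetical names that
//! only model an oplog being replayed onto two copies. The idea it tries to show is that any
//! non-deterministic input (here, a timestamp) is best captured in the oplog entry when the
//! entry is created, rather than computed while the entry is applied:
//!
//! ```rust
//! // Hypothetical oplog entry that carries its own timestamp.
//! #[derive(Clone, Copy)]
//! struct Op {
//!     value: u64,
//!     created_at: u64,
//! }
//!
//! // Deterministic: the result depends only on the copy and on the entry.
//! fn apply(copy: &mut Vec<(u64, u64)>, op: &Op) {
//!     copy.push((op.value, op.created_at));
//! }
//!
//! // Replays the same log onto two separate copies, one after the other.
//! fn replay_on_both(log: &[Op]) -> (Vec<(u64, u64)>, Vec<(u64, u64)>) {
//!     let (mut left, mut right) = (Vec::new(), Vec::new());
//!     for op in log {
//!         apply(&mut left, op);
//!     }
//!     for op in log {
//!         apply(&mut right, op);
//!     }
//!     (left, right)
//! }
//! ```
//!
//! Had `apply` read the clock itself, the two copies would most likely record different
//! timestamps for the same entry, and would no longer mirror one another.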
//!
//! # How does it work?
//!
//! Take a look at [this YouTube video](https://www.youtube.com/watch?v=eLNAMEoKAAc) which goes
//! through the basic concurrency algorithm, as well as the initial development of this library.
//! Alternatively, there's a shorter (but also less complete) description in [this
//! talk](https://www.youtube.com/watch?v=s19G6n0UjsM&t=1994).
//!
//! At a glance, left-right is implemented using two regular `T`s, an operational log, epoch
//! counting, and some pointer magic. There is a single pointer through which all readers go. It
//! points to a `T` that the readers access in order to read data. Every time a reader has
//! accessed the pointer, it increments a local epoch counter, and it updates that counter again
//! when it has finished the read. When a write occurs, the writer updates the other `T` (for
//! which there are no readers), and also stores a copy of the change in a log. When
//! [`WriteHandle::publish`] is called, the writer atomically swaps the reader pointer to point to
//! the other `T`. It then waits for the epochs of all current readers to change, and then replays
//! the operational log to bring the stale copy up to date.
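//!
//! The "wait for the epochs to change" step can be sketched in isolation. This is a simplified
//! model and not the crate's actual code: `ReaderEpoch`, `snapshot`, and `readers_departed` are
//! hypothetical names, and the convention that an odd epoch means "a read is in progress" is an
//! assumption made for this sketch:
//!
//! ```rust
//! use std::sync::atomic::{AtomicUsize, Ordering};
//!
//! // Hypothetical per-reader epoch counter: bumped once on entry and once on exit,
//! // so it is odd exactly while a read is in progress.
//! struct ReaderEpoch(AtomicUsize);
//!
//! impl ReaderEpoch {
//!     fn new() -> Self {
//!         ReaderEpoch(AtomicUsize::new(0))
//!     }
//!     fn enter(&self) {
//!         self.0.fetch_add(1, Ordering::SeqCst);
//!     }
//!     fn leave(&self) {
//!         self.0.fetch_add(1, Ordering::SeqCst);
//!     }
//! }
//!
//! // What the writer records right after swapping the reader pointer.
//! fn snapshot(epochs: &[ReaderEpoch]) -> Vec<usize> {
//!     epochs.iter().map(|e| e.0.load(Ordering::SeqCst)).collect()
//! }
//!
//! // True once no reader can still be using the old copy: each reader was either
//! // idle at snapshot time (even epoch) or its epoch has changed since then.
//! fn readers_departed(snapshot: &[usize], epochs: &[ReaderEpoch]) -> bool {
//!     snapshot
//!         .iter()
//!         .zip(epochs)
//!         .all(|(&seen, e)| seen % 2 == 0 || e.0.load(Ordering::SeqCst) != seen)
//! }
//! ```
//!
//! In this model the writer would poll `readers_departed` after the swap, and only start
//! replaying the oplog onto the old read copy once it returns `true`.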
//!
//! The design resembles this [left-right concurrency
//! scheme](https://hal.archives-ouvertes.fr/hal-01207881/document) from 2015, though I am not
//! aware of any follow-up to that work.
//!
//! # How do I use it?
//!
//! If you just want a data structure for fast reads, you likely want to use a crate that _uses_
//! this crate, like [`evmap`](https://docs.rs/evmap/). If you want to develop such a crate
//! yourself, here's what you do:
//!
//! ```rust
//! use left_right::{Absorb, ReadHandle, WriteHandle};
//!
//! // First, define an operational log type.
//! // For most real-world use-cases, this will be an `enum`, but we'll keep it simple:
//! struct CounterAddOp(i32);
//!
//! // Then, implement the unsafe `Absorb` trait for your data structure type,
//! // and provide the oplog type as the generic argument.
//! // You can read this as "`i32` can absorb changes of type `CounterAddOp`".
//! impl Absorb<CounterAddOp> for i32 {
//!     // See the documentation of `Absorb::absorb_first`.
//!     //
//!     // Essentially, this is where you define what applying
//!     // the oplog type to the data structure does.
//!     fn absorb_first(&mut self, operation: &mut CounterAddOp, _: &Self) {
//!         *self += operation.0;
//!     }
//!
//!     // See the documentation of `Absorb::absorb_second`.
//!     //
//!     // This may or may not be the same as `absorb_first`,
//!     // depending on whether or not you de-duplicate values
//!     // across the two copies of your data structure.
//!     fn absorb_second(&mut self, operation: CounterAddOp, _: &Self) {
//!         *self += operation.0;
//!     }
//!
//!     // See the documentation of `Absorb::drop_first`.
//!     fn drop_first(self: Box<Self>) {}
//!
//!     fn sync_with(&mut self, first: &Self) {
//!         *self = *first
//!     }
//! }
//!
//! // Now, you can construct a new left-right over an instance of your data structure.
//! // This will give you a `WriteHandle` that accepts writes in the form of oplog entries,
//! // and a (cloneable) `ReadHandle` that gives you `&` access to the data structure.
//! let (write, read) = left_right::new::<i32, CounterAddOp>();
//!
//! // You will likely want to embed these handles in your own types so that you can
//! // provide more ergonomic methods for performing operations on your type.
//! struct Counter(WriteHandle<i32, CounterAddOp>);
//! impl Counter {
//!     // The methods on your write handle type will likely all just add to the operational log.
//!     pub fn add(&mut self, i: i32) {
//!         self.0.append(CounterAddOp(i));
//!     }
//!
//!     // You should also provide a method for exposing the results of any pending operations.
//!     //
//!     // Until this is called, any writes made since the last call to `publish` will not be
//!     // visible to readers. See `WriteHandle::publish` for more details. Make sure to call
//!     // this out in _your_ documentation as well, so that your users will be aware of this
//!     // "weird" behavior.
//!     pub fn publish(&mut self) {
//!         self.0.publish();
//!     }
//! }
//!
//! // Similarly, for reads:
//! #[derive(Clone)]
//! struct CountReader(ReadHandle<i32>);
//! impl CountReader {
//!     pub fn get(&self) -> i32 {
//!         // The `ReadHandle` itself does not allow you to access the underlying data.
//!         // Instead, you must first "enter" the data structure. This is similar to
//!         // taking a `Mutex`, except that no lock is actually taken. When you enter,
//!         // you are given back a guard, which gives you shared access (through the
//!         // `Deref` trait) to the "read copy" of the data structure.
//!         //
//!         // Note that `enter` may yield `None`, which implies that the `WriteHandle`
//!         // was dropped, and took the backing data down with it.
//!         //
//!         // Note also that for as long as the guard lives, a writer that tries to
//!         // call `WriteHandle::publish` will be blocked from making progress.
//!         self.0.enter().map(|guard| *guard).unwrap_or(0)
//!     }
//! }
//!
//! // These wrapper types are likely what you'll give out to your consumers.
//! let (mut w, r) = (Counter(write), CountReader(read));
//!
//! // They can then use the type fairly ergonomically:
//! assert_eq!(r.get(), 0);
//! w.add(1);
//! // no call to publish, so read side remains the same:
//! assert_eq!(r.get(), 0);
//! w.publish();
//! assert_eq!(r.get(), 1);
//! drop(w);
//! // writer dropped data, so reads yield fallback value:
//! assert_eq!(r.get(), 0);
//! ```
//!
//! One additional noteworthy detail: much like with `Mutex`, `RwLock`, and `RefCell` from the
//! standard library, the values you dereference out of a `ReadGuard` are tied to the lifetime of
//! that `ReadGuard`. This can make it awkward to write ergonomic methods on the read handle that
//! return references into the underlying data, and may tempt you to clone the data out or take a
//! closure instead. Instead, consider using [`ReadGuard::map`] and [`ReadGuard::try_map`], which
//! (like `RefCell`'s [`Ref::map`](std::cell::Ref::map)) allow you to provide a guarded reference
//! deeper into your data structure.
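//!
//! As a rough analogy that uses only the standard library, here is what the same pattern looks
//! like with `RefCell` and `Ref::map`. The `Config` type and the `name` function are
//! hypothetical names for this sketch; a method on your own read handle wrapper would
//! presumably use [`ReadGuard::map`] in the same spirit:
//!
//! ```rust
//! use std::cell::{Ref, RefCell};
//!
//! // Hypothetical data structure with a field we want to hand out by reference.
//! struct Config {
//!     name: String,
//!     retries: u32,
//! }
//!
//! // Returns a guarded reference to one field instead of cloning the `String`.
//! // The borrow of the `RefCell` stays active for as long as the returned `Ref` lives.
//! fn name(cell: &RefCell<Config>) -> Ref<'_, str> {
//!     Ref::map(cell.borrow(), |config| config.name.as_str())
//! }
//! ```
//!
//! The caller gets something that dereferences to `str`, but never sees the outer guard or the
//! rest of `Config`.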
#![warn(
    missing_docs,
    rust_2018_idioms,
    missing_debug_implementations,
    broken_intra_doc_links
)]
#![allow(clippy::type_complexity)]

mod sync;

use crate::sync::{Arc, AtomicUsize, Mutex};

type Epochs = Arc<Mutex<slab::Slab<Arc<AtomicUsize>>>>;

mod write;
pub use crate::write::Taken;
pub use crate::write::WriteHandle;

mod read;
pub use crate::read::{ReadGuard, ReadHandle, ReadHandleFactory};

pub mod aliasing;

/// Types that can incorporate operations of type `O`.
///
/// This trait allows `left-right` to keep the two copies of the underlying data structure (see the
/// [crate-level documentation](crate)) the same over time. Each write operation to the data
/// structure is logged as an operation of type `O` in an _operational log_ (oplog), and is applied
/// once to each copy of the data.
///
/// Implementations should ensure that the absorption of each `O` is deterministic. That is, if
/// two instances of the implementing type are initially equal, and then absorb the same `O`,
/// they should remain equal afterwards. If this is not the case, the two copies will drift apart
/// over time, and hold different values.
///
/// The trait provides separate methods for the first and second absorption of each `O`. For many
/// implementations, these will be the same (which is why `absorb_second` defaults to calling
/// `absorb_first`), but not all. In particular, some implementations may need to modify the `O` to
/// ensure deterministic results when it is applied to the second copy. Or, they may need to
/// ensure that removed values in the data structure are only dropped when they are removed from
/// _both_ copies, in case they alias the backing data to save memory.
///
/// For the same reason, `Absorb` allows implementors to define `drop_first`, which is used to drop
/// the first of the two copies. In this case, de-duplicating implementations may need to forget
/// values rather than drop them so that they are not dropped twice when the second copy is
/// dropped.
pub trait Absorb<O> {
    /// Apply `O` to the first of the two copies.
    ///
    /// `other` is a reference to the other copy of the data, which has seen all operations up
    /// until the previous call to [`WriteHandle::publish`]. That is, `other` is one "publish
    /// cycle" behind.
    fn absorb_first(&mut self, operation: &mut O, other: &Self);

    /// Apply `O` to the second of the two copies.
    ///
    /// `other` is a reference to the other copy of the data, which has seen all operations up to
    /// the call to [`WriteHandle::publish`] that initially exposed this `O`. That is, `other` is
    /// one "publish cycle" ahead.
    ///
    /// Note that this method should modify the underlying data in _exactly_ the same way as
    /// `O` modified `other`, otherwise the two copies will drift apart. Be particularly mindful of
    /// non-deterministic implementations of traits that are often assumed to be deterministic
    /// (like `Eq` and `Hash`), and of "hidden states" that subtly affect results like the
    /// `RandomState` of a `HashMap` which can change iteration order.
    ///
    /// Defaults to calling `absorb_first`.
    fn absorb_second(&mut self, mut operation: O, other: &Self) {
        Self::absorb_first(self, &mut operation, other)
    }

    /// Drop the first of the two copies.
    ///
    /// Defaults to calling `Self::drop`.
    #[allow(clippy::boxed_local)]
    fn drop_first(self: Box<Self>) {}

    /// Drop the second of the two copies.
    ///
    /// Defaults to calling `Self::drop`.
    #[allow(clippy::boxed_local)]
    fn drop_second(self: Box<Self>) {}

    /// Sync the data from `first` into `self`.
    ///
    /// To improve initialization performance, before the first call to `publish`, changes aren't
    /// added to the internal oplog, but applied to the first copy directly using `absorb_second`.
    /// The first `publish` then calls `sync_with` instead of `absorb_second`.
    ///
    /// `sync_with` should ensure that `self`'s state exactly matches that of `first` after it
    /// returns. Be particularly mindful of non-deterministic implementations of traits that are
    /// often assumed to be deterministic (like `Eq` and `Hash`), and of "hidden states" that
    /// subtly affect results like the `RandomState` of a `HashMap` which can change iteration
    /// order.
    fn sync_with(&mut self, first: &Self);
}

/// Construct a new write and read handle pair from an empty data structure.
///
/// The type must implement `Clone` so we can construct the second copy from the first.
pub fn new_from_empty<T, O>(t: T) -> (WriteHandle<T, O>, ReadHandle<T>)
where
    T: Absorb<O> + Clone,
{
    let epochs = Default::default();

    let r = ReadHandle::new(t.clone(), Arc::clone(&epochs));
    let w = WriteHandle::new(t, epochs, r.clone());
    (w, r)
}

/// Construct a new write and read handle pair from the data structure default.
///
/// The type must implement `Default` so we can construct two empty instances. You must ensure that
/// the trait's `Default` implementation is deterministic and idempotent - that is to say, two
/// instances created by it must behave _exactly_ the same. An example of where this is problematic
/// is `HashMap` - due to `RandomState`, two instances returned by `Default` may have a different
/// iteration order.
///
/// If your type's `Default` implementation does not guarantee this, you can use `new_from_empty`,
/// which relies on `Clone` instead of `Default`.
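///
/// Another option for hash maps, sketched here with only the standard library, is to fix the
/// hasher so that every `Default` instance hashes keys identically. The `DeterministicMap`
/// alias and the `hash_of` helper are hypothetical names introduced for illustration. Note
/// that a fixed hasher gives up the HashDoS resistance that `RandomState` provides:
///
/// ```rust
/// use std::collections::hash_map::DefaultHasher;
/// use std::collections::HashMap;
/// use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};
///
/// // Every `DefaultHasher` created through `default()` starts from the same state,
/// // so two maps of this type hash any given key to the same value.
/// type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;
///
/// // Hashes `key` with the hasher that `map` was built with.
/// fn hash_of<K: Hash, V, S: BuildHasher>(map: &HashMap<K, V, S>, key: &K) -> u64 {
///     let mut hasher = map.hasher().build_hasher();
///     key.hash(&mut hasher);
///     hasher.finish()
/// }
/// ```
///
/// Identical hashing removes one source of divergence between the two copies. Iteration order
/// can still depend on each map's insertion history, so the operations applied to both copies
/// need to be deterministic as well.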
pub fn new<T, O>() -> (WriteHandle<T, O>, ReadHandle<T>)
where
    T: Absorb<O> + Default,
{
    let epochs = Default::default();

    let r = ReadHandle::new(T::default(), Arc::clone(&epochs));
    let w = WriteHandle::new(T::default(), epochs, r.clone());
    (w, r)
}