atomic_try_update/lib.rs

//! Primitives that make it easy to implement correct lock-free algorithms
//!
//! `atomic_try_update` is the main entry point to this library, but (with
//! the exception of `NonceStack`) the included example code is also designed
//! to be used in production.  Each module implements a different family of
//! example algorithms.  If you simply want to use general-purpose algorithms
//! without modification, start with the public APIs of the data structures
//! in those modules.
//!
//! If you want to start implementing your own specialized lock-free logic,
//! start with this page, then read the top-level descriptions of each
//! of the modules this crate exports.
use std::{marker::PhantomData, ptr::null_mut};

// AtomicCell uses a lock-based fallback for u128 because stable Rust does
// not include AtomicU128.
//
// I wonder if we could replace this with portable_atomic, which uses inline
// assembly for u128, or upstream a feature flag to crossbeam_utils to use
// portable_atomic where possible.
//
// https://docs.rs/portable-atomic/latest/portable_atomic/struct.AtomicU128.html
use crossbeam_utils::atomic::AtomicCell;

pub mod barrier;
pub mod bits;
pub mod claim;
pub mod once;
pub mod stack;

/// A wrapper that allows an instance of type `T` to be treated as though it is
/// an atomic integer type (in the style of a C/C++ union).  Use
/// `atomic_try_update` to access the data of type `T` stored in an `Atom`.
///
/// The generic parameter `U` is the smallest unsigned integer type that is
/// large enough to hold an instance of `T` (typically `u64` or `u128`).
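///
/// For example, a list head packed next to a length counter fits in a
/// `u128`.  (A sketch; `Head` here stands in for whatever struct the caller
/// defines.)
///
/// ```
/// use atomic_try_update::Atom;
///
/// #[repr(C)]
/// struct Head {
///     first: *mut u8, // pointer to the first element
///     len: u64,       // spare bits used as a length counter
/// }
///
/// let head: Atom<Head, u128> = Atom::default();
/// ```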
pub struct Atom<T, U> {
    union: PhantomData<T>,
    inner: AtomicCell<U>,
}

impl<T, U> Default for Atom<T, U>
where
    U: Default + Send,
{
    /// This creates a new instance of Atom, initializing the contents to
    /// all-zero bytes.
    ///
    /// TODO: Change this so that T implements Default, and then store the
    /// default instance of T in the atom?  If we do that, what happens to
    /// the uninitialized padding bytes?
    fn default() -> Self {
        // T must fit in the backing integer type.
        assert!(std::mem::size_of::<T>() <= std::mem::size_of::<U>());
        // If U is wider than 32 bits, insist that T actually needs more than
        // half of U; otherwise a narrower backing integer would do.
        assert!(
            std::mem::size_of::<U>() <= 4
                || std::mem::size_of::<T>() > std::mem::size_of::<U>() / 2
        );
        Self {
            union: Default::default(),
            inner: Default::default(),
        }
    }
}

// TODO: Restrict these so that ptr T is OK, but most other things are not.
// Also, it would be nice if the type of T was richer so that we could avoid
// these.
unsafe impl<T, U> Sync for Atom<T, U> {}
unsafe impl<T, U> Send for Atom<T, U> {}

/// This function is used to implement lock-free synchronization primitives.
///
/// It is a special compare-and-swap loop that performs a hardware-level atomic
/// integer load from a piece of memory that it manages, casts the byte
/// representation of the integer that it read to a caller-provided type,
/// and then passes the result into a caller-provided lambda.
///
/// The lambda optionally updates the fields of the struct that was passed into
/// it.  If the lambda returns false, the loop terminates.  If it returns true,
/// then `atomic_try_update` attempts to compare-and-swap the new version of the
/// struct with the old version (by first casting it back to an integer).
///
/// The lambda actually returns a tuple of type `(bool, R)`.  The first field
/// is used to decide whether or not to perform a compare-and-swap.  The second
/// is passed back to the caller.  This allows the lambda to pass information
/// to its calling environment without sharing mutable references with other
/// invocations of itself or doing other things that the borrow checker disallows.
///
/// `atomic_try_update` is powerful for two reasons:
///
/// First, 128 bits is quite a lot of state, and modern machines support 128-bit
/// CAS.  Pointers on 64-bit machines only need roughly 40-48 significant bits,
/// so that is enough space for two or three pointers with some bits left over
/// to encode additional state.  If even more bits are needed, you can play
/// tricks such as storing offsets into an array instead of memory addresses.
/// This allows you to pack state machines and simple data structures (including
/// stacks, registers, and even simple allocators) into a single integer value,
/// and then to atomically modify your data structures and state machines.
///
/// Second, a simple, cursory review of the lambda code is enough to verify that
/// the resulting algorithm is linearizable: i.e., that all concurrent executions
/// are equivalent to some single-threaded execution of the same code, and that
/// all observers agree on the schedule.
///
/// The rules are described in the safety section.  There is no way for
/// the Rust compiler to check that your lambda obeys the provided rules, and
/// violations of them can lead to memory corruption, borrow checker invariant
/// violations, and other unsound behavior.  Therefore, the function is
/// annotated `unsafe`.
///
/// In particular, if `T` is a `Box<...>`, and the lambda overwrites its
/// argument, then the old value in the `Box` could be double-freed.
///
/// # Safety
///
/// In order to use `atomic_try_update` safely, make sure your lambda obeys
/// the following two rules:
///
/// 1. It must implement *read set equivalence*.
/// 2. It must be a pure (side-effect-free) function.
///
/// ## Rule 1: Read set equivalence
///
/// Read set equivalence is the invariant that if the current value of the
/// `Atom` matches the pre-value read by `atomic_try_update`, then all data
/// read by the lambda still has the same value it had when the lambda read it.
///
/// This concept is closely related to, but distinct from, the approaches used
/// in database concurrency control algorithms.
///
/// Read set equivalence trivially holds if the lambda only reads the data
/// that was passed directly into it, and doesn't follow pointers, references,
/// or otherwise observe other non-local state in the system.
///
/// It is also fine for the lambda to read data from its caller's stack
/// frame, as long as that data is immutable while `atomic_try_update` is
/// running.  This is most easily achieved by only capturing shared references,
/// and not capturing references to things that make use of interior mutability
/// (such as `Mutex`, or any of the atomic integer types).
///
/// Another common approach involves using some extra mechanism to ensure read
/// set equivalence.  For instance, some data structures include a nonce in the
/// data stored by the `Atom`, and increment it on each operation.  As long as
/// the nonce does not wrap back around to exactly the same value just in time
/// for the compare-and-swap to run, we know that no other operation on this
/// `Atom` has raced with us and modified state that we read.
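///
/// A minimal sketch of the nonce approach (`TaggedHead` stands in for whatever
/// struct the caller defines; on 64-bit targets it fits in a `u128`):
///
/// ```
/// use atomic_try_update::{atomic_try_update, Atom, Node};
///
/// #[repr(C)]
/// struct TaggedHead {
///     head: *mut Node<u32>,
///     nonce: u64, // bumped on every successful update
/// }
///
/// let state: Atom<TaggedHead, u128> = Atom::default();
///
/// // Because every update changes the nonce, a head pointer that was freed
/// // and then recycled ("ABA") no longer compares equal at CAS time.
/// unsafe {
///     atomic_try_update(&state, |s: &mut TaggedHead| {
///         s.nonce = s.nonce.wrapping_add(1);
///         // ...read or update s.head here...
///         (true, ())
///     });
/// }
/// ```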
///
/// The examples in the `stack` module explain read set equivalence in more detail.
///
/// ### Comparison with database concurrency control algorithms
///
/// Read set equivalence allows more schedules than typical database concurrency
/// control algorithms.  In particular, it allows write-write and read-write
/// conflicts in the case where the read set of the lambda (the "transaction,"
/// in database terms) has been changed, but then changed back to the value the
/// transaction observed.  Two-phase locking prevents such schedules by blocking
/// execution of conflicting logic, and multi-version concurrency control prevents
/// them by only examining the version number of the read set, and not its
/// contents.  (The nonce trick we mentioned above is analogous to multi-version
/// concurrency control.)
///
/// ## Rule 2: Pure functions
///
/// There are a few common ways for lambdas to fail to be pure functions,
/// ordered from most to least likely:
///
/// The value that is stored in the `Atom` could implicitly interact
/// with global state.  For instance, it could be a `Box` and talk to the
/// allocator, or it could attempt to perform I/O.
///
/// The lambda could speculatively read a value after it has been freed.
/// Even if the lambda then discards the result without acting on it
/// (which would be safe in a garbage-collected language), the act of
/// loading the freed value could read from memory that has been returned
/// to the operating system, leading to a segmentation fault.  This is
/// generally avoidable using an epoch-based garbage collector, such as
/// `crossbeam_epoch`, or by maintaining a pool of reused, but never
/// freed, objects for use by the data structure.
///
/// The lambda could speculatively read a value, store it on the heap
/// or in a captured stack variable, and then return true.  If the
/// compare-and-swap fails, the lambda runs again; if the retry then fails
/// to overwrite the state left behind by the failed speculative read, the
/// result violates linearizability.
///
/// Thanks to the borrow checker, it is fairly difficult to implement
/// such a bug.  In particular, if your lambda attempts to capture a
/// mutable reference (`&mut`) and write through it, it will fail to
/// compile.  However, you could defeat this check via interior mutability.
///
/// Finally, the lambda could crash or exhibit undefined behavior.  Other
/// versions of `atomic_try_update` include an "unsafe" (not in the Rust
/// sense) variant that allows torn reads.  This means that the bit
/// representation of the object that is passed into your lambda could
/// be invalid, violating Rust's safety guarantees, or causing sanity
/// checks to fail, leading to panics.  Similarly, if the lambda follows
/// a pointer to something that has been marked for reuse (and therefore,
/// the compare-and-swap will fail), some other thread could modify that
/// object in a race with the current thread's failed speculative lambda
/// invocation.
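///
/// # Example
///
/// A minimal sketch of the intended usage (`Counter` stands in for whatever
/// struct the caller packs into the backing integer):
///
/// ```
/// use atomic_try_update::{atomic_try_update, Atom};
///
/// #[repr(C)]
/// struct Counter {
///     count: u32,
///     closed: u32, // 0 = open, 1 = closed
/// }
///
/// let state: Atom<Counter, u64> = Atom::default();
///
/// // Increment the counter unless it has been closed, and tell the caller
/// // whether we succeeded.
/// let incremented = unsafe {
///     atomic_try_update(&state, |c: &mut Counter| {
///         if c.closed != 0 {
///             (false, false) // skip the CAS and report failure
///         } else {
///             c.count += 1;
///             (true, true) // attempt the CAS and report success
///         }
///     })
/// };
/// assert!(incremented);
/// ```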
pub unsafe fn atomic_try_update<T, U, F, R>(state: &Atom<T, U>, func: F) -> R
where
    F: Fn(&mut T) -> (bool, R),
    U: Copy + Eq,
{
    let mut old = state.inner.load();
    let mut newval = old;
    loop {
        let newval_ptr: *mut U = &mut newval;
        let res;
        unsafe {
            // Reinterpret the scratch copy of the integer as a T and let the
            // caller's lambda speculatively update it in place.
            let newval_ptr: *mut T = newval_ptr as *mut T;
            res = func(&mut *newval_ptr);
            if !res.0 {
                return res.1;
            }
        }
        // Publish the update only if no other thread changed the Atom since
        // we loaded `old`; otherwise retry against the value we just observed.
        match state.inner.compare_exchange(old, newval) {
            Ok(_) => return res.1,
            Err(val) => {
                old = val;
                newval = old;
            }
        }
    }
}

/// A linked list node that contains an instance of type `T` and a raw pointer
/// to the next entry in the list.  Since `atomic_try_update` speculatively
/// executes code, it cannot handle values of `Box<T>` soundly.  Therefore,
/// this is the idiomatic way to store linked lists and stacks with
/// `atomic_try_update`.
///
/// TODO: Work out safety for this API.
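///
/// A sketch of a Treiber-style push built from `Atom`, `Node`, and
/// `atomic_try_update` (`Head` here stands in for a caller-defined struct
/// holding the top-of-stack pointer; the `stack` module contains complete
/// stack implementations):
///
/// ```
/// use atomic_try_update::{atomic_try_update, Atom, Node};
/// use std::ptr::null_mut;
///
/// #[repr(C)]
/// struct Head {
///     head: *mut Node<u32>, // a single pointer fits in a u64 on 64-bit targets
/// }
///
/// let stack: Atom<Head, u64> = Atom::default();
///
/// // Push by splicing a freshly allocated node onto the head.
/// let node = Box::into_raw(Box::new(Node { val: 7u32, next: null_mut() }));
/// let push = |s: &mut Head| {
///     // Writing through `node` is sound here: no other thread can see it yet,
///     // so re-running the lambda after a failed CAS is harmless.
///     unsafe { (*node).next = s.head };
///     s.head = node;
///     (true, ())
/// };
/// unsafe { atomic_try_update(&stack, push) };
/// ```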
#[derive(Debug)]
pub struct Node<T> {
    pub val: T,
    pub next: *mut Node<T>,
}

unsafe impl<T> Send for NodeIterator<T> {}

/// A consuming iterator over a linked list of `Node`s.  It yields each node's
/// `val` and frees the node as it goes.
///
/// TODO: Document safety here, and (ideally) figure out how to
/// allow people to write atomic_try_update lambdas from outside
/// this package, but not write garbage to `next` from `safe` code.
/// That way, this API won't need an `unsafe` annotation (which
/// it is currently missing).
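///
/// A small usage sketch (the list here is built by hand; in practice it comes
/// from an `Atom`-managed structure such as a stack):
///
/// ```
/// use atomic_try_update::{Node, NodeIterator};
/// use std::ptr::null_mut;
///
/// let second = Box::into_raw(Box::new(Node { val: 2u32, next: null_mut() }));
/// let first = Box::into_raw(Box::new(Node { val: 1u32, next: second }));
///
/// // The iterator takes ownership of the nodes and frees each one as it is
/// // yielded; anything left over is freed when the iterator is dropped.
/// let vals: Vec<u32> = NodeIterator::new(first).collect();
/// assert_eq!(vals, vec![1, 2]);
/// ```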
pub struct NodeIterator<T> {
    node: *mut Node<T>,
}

impl<T> Iterator for NodeIterator<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        if self.node.is_null() {
            return None;
        }
        // Take ownership of the head node; dropping the Box at the end of
        // this call frees it.
        let popped: Box<Node<T>> = unsafe { Box::from_raw(self.node) };
        self.node = popped.next;
        Some(popped.val)
    }
}

impl<T> NodeIterator<T> {
    /// Takes ownership of node, in the style of `Box::from_raw`.
    ///
    /// TODO: This could take a Box, and then we wouldn't need to add an unsafe annotation to it.
    pub fn new(node: *mut Node<T>) -> Self {
        Self { node }
    }

    /// Reverses the remaining nodes in place, returning an iterator that
    /// yields them in the opposite order.
    pub fn rev(mut self) -> Self {
        let mut ret = Self { node: null_mut() };
        while !self.node.is_null() {
            let popped = self.node;
            unsafe {
                self.node = (*popped).next;
                (*popped).next = ret.node;
                ret.node = popped;
            }
        }
        ret
    }
}

impl<T> Drop for NodeIterator<T> {
    fn drop(&mut self) {
        // Drain any remaining nodes so that their Boxes are freed.
        for _ in self {}
    }
}