1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
//! # KCAS
//!
//! A lock-free multi-word compare-and-swap library. It is `no_std`-compatible and optionally does
//! not allocate on the heap. Because it only requires single-width atomic compare-and-swap, it is
//! lock-free on most platforms, including RISC-V, AArch64, and x86-64.
//!
//! # Usage
//! ## Example
//! Here is an example using [RefStateWrapper]:
//! ```edition2021
//! use kcas::{KCasWord, State, RefStateWrapper};
//! use kcas::Error;
//! use std::sync::atomic::{AtomicUsize, Ordering};
//! use std::thread::{ScopedJoinHandle};
//!
//! // Allocate memory on the stack for 2 threads, each operating on 3 words.
//! let state: State<2, 3> = State::new();
//!
//! // each thread gets its own wrapper
//! let mut first_wrapper: RefStateWrapper<2, 3> = RefStateWrapper::construct(&state).unwrap();
//! let mut second_wrapper: RefStateWrapper<2, 3> = RefStateWrapper::construct(&state).unwrap();
//!
//! let first_target: AtomicUsize = AtomicUsize::new(1);
//! let second_target: AtomicUsize = AtomicUsize::new(2);
//! let third_target: AtomicUsize = AtomicUsize::new(3);
//!
//! // run k-CAS operations in two threads
//! std::thread::scope(|scope| {
//! let first_handle: ScopedJoinHandle<Result<(), Error>> = scope.spawn(|| {
//! first_wrapper.kcas([
//! KCasWord::new(&first_target, 1, 4),
//! KCasWord::new(&second_target, 2, 5),
//! KCasWord::new(&third_target, 3, 6),
//! ])
//! });
//! let second_handle: ScopedJoinHandle<Result<(), Error>> = scope.spawn(|| {
//! second_wrapper.kcas([
//! KCasWord::new(&first_target, 1, 7),
//! KCasWord::new(&second_target, 2, 8),
//! KCasWord::new(&third_target, 3, 9),
//! ])
//! });
//! });
//!
//! // one of the two operations should succeed
//! assert!(
//! (first_target.load(Ordering::Acquire) == 4
//! && second_target.load(Ordering::Acquire) == 5
//! && third_target.load(Ordering::Acquire) == 6
//! ) || (
//! first_target.load(Ordering::Acquire) == 7
//! && second_target.load(Ordering::Acquire) == 8
//! && third_target.load(Ordering::Acquire) == 9
//! )
//! );
//! ```
//!
//! ## Details
//! Begin by instantiating a [State], which is a struct shared between threads so threads can
//! assist each other. [State] reserves all the memory it needs to share during initialization.
//! [State] requires two generic const arguments:
//! - `NUM_THREADS`, the maximum number of threads which can operate on that State at any given
//! time, and
//! - `NUM_WORDS`, the number of words to operate upon during each k-CAS operation.
//!
//! See the [Limitations] section for limits on these parameters.
//!
//! Next, wrap [State] in a "wrapper" implementation. Each wrapper reserves a thread id from the
//! underlying [State] and thus is designed to execute k-CAS operations for exactly one thread at a
//! time. There are three wrapper implementations:
//! - [RefStateWrapper]'s constructor takes a `&State` where [State] typically lives on the stack.
//! [std::thread::scope] allows borrowing [RefStateWrapper] across threads.
//! - [ArcStateWrapper]'s constructor takes `Arc<State>` where [State] lives on the
//! heap and the `Arc` is cloned and moved across threads. This can only be used in a
//! `std` environment or a `no_std` environment which uses [alloc].
//! - [UnsafeStateWrapper]'s constructor takes a `NonNull<State>` for greater control
//! over where [State] lives and how it is used across threads.
//!
//! Next, call the wrapper's `kcas` method, which performs the k-CAS operation defined by its
//! argument, a [KCasWord] array of size `NUM_WORDS`. Each individual CAS operation occurs in order.
//!
//! The wrapper can be dropped once all k-CAS operations are finished for a particular thread. The
//! drop call will return the wrapper's thread id back into the pool of available thread ids.
//!
//! # Limitations
//! ## Target addresses must be used across threads in the same order
//! When two target addresses are passed into k-CAS operations in two threads in the opposite order,
//! a deadlock can occur. As a result, any addresses used in two or more threads must be passed in
//! the same order.
//!
//! To illustrate, let's say we would like a thread to perform a 3-CAS operation across three target
//! addresses `(1, 2, 3)`. If another thread is also performing operations, this second thread can
//! operate on any of the first thread's addresses individually in any position - for example, the
//! second thread could operate across `(4, 1, 5)`, `(4, 5, 2)`, or `(3, 4, 5)`. But, if the second
//! thread would like to operate on two of the first thread's addresses, those addresses must be
//! specified in the same order. In this case, `(1, 2, 4)`, `(4, 1, 3)`, `(1, 5, 3)` are examples of
//! valid orderings. However, `(2, 1, 4)` is not a valid ordering because `1` was specified before
//! `2` in the first thread, but `1` was specified after `2` in the second thread.
//!
//! ## The number of bits used by values bounds the size of `NUM_THREADS`
//! While there is no limit to the size of `NUM_WORDS` besides memory consumption, `NUM_THREADS` is
//! limited by the number of value bits you would like to use in your application for real content.
//! As an intermediate step, KCAS swaps tagged markers into the provided target addresses where the
//! most significant bits are a thread id and the least significant bits are a sequence number.
//! KCAS differentiates between tagged markers and real values by looking at the most significant
//! bits: if they are all 0s or all 1s, then they are real values; otherwise, they are treated as
//! tagged markers.
//!
//! This limitation makes the KCAS library unsuitable for performing k-CAS operations on
//! random values which must take up all bits of a usize. This also limits the number of threads
//! available for performing k-CAS operations on pointers. For example, on 64-bit systems,
//! - if you would like to perform k-CAS on 48-bit x86-64 pointers and the top 16 bits of the value
//! will always be all 0s or all 1s, `NUM_THREADS` can be as high as `2^12` or `65536`.
//! - for 52-bit ARMv8.2 LVA pointers, you may set `NUM_THREADS` as high as `2^12` or `4096`.
//! - for 57-bit 5-level paging pointers, you may set `NUM_THREADS` as high as `2^7` or `128`.
//!
//! To programmatically determine how many value bits are reserved by the KCAS library, you may call
//! [get_bit_length_of_num_threads].
//!
//! We plan to remove this limitation in the future by introducing an alternative implementation
//! where tagged markers do not scale with the number of threads. This design will rely on a
//! different mechanism for helping threads which should make the algorithm wait-free.
//!
pub use ;
pub use ;
pub use ;
pub use ;