davenport/lib.rs
1//! Ergonomic thread-local workspaces for intermediate data.
2//!
3//! `davenport` is a microcrate with a simple API for working with thread-local data, like
4//! buffers for intermediate data. Here's a brief example of the `davenport` API:
5//!
6//! ```rust
7//! use davenport::{define_thread_local_workspace, with_thread_local_workspace};
8//!
9//! #[derive(Default)]
10//! pub struct MyWorkspace {
11//! index_buffer: Vec<usize>
12//! }
13//!
14//! define_thread_local_workspace!(WORKSPACE);
15//!
16//! fn median_floor(indices: &[usize]) -> Option<usize> {
17//! with_thread_local_workspace(&WORKSPACE, |workspace: &mut MyWorkspace| {
18//! // Re-use buffer from previous call to this function
19//! let buffer = &mut workspace.index_buffer;
20//! buffer.clear();
21//! buffer.copy_from_slice(&indices);
22//! buffer.sort_unstable();
23//! buffer.get(indices.len() / 2).copied()
24//! })
25//! }
26//! ```
27//! Thread local storage should be used with care. In the above example, if `indices` is large,
28//! then a large buffer may be allocated and not freed for the duration of the program. Since
29//! stand-alone functions that use thread local storage rarely have enough information to know
30//! whether the buffer should be kept alive or not, this may easily lead to unnecessary
31//! and redundant memory use.
32//!
33//! Try to find other solutions before reaching for thread-local data!
34//!
35//! ## Motivating example
36//!
37//! Let's say we have to compute the sum of a series of elements that are produced in
38//! variable-sized "chunks" by a `Producer`. For a fixed element type like `u32`, our code
39//! might for example look like this:
40//!
41//! ```rust
42//! pub trait Producer {
43//! fn num_elements(&self) -> usize;
44//! fn populate_buffer(&self, buffer: &mut [u32]);
45//! }
46//!
47//! fn compute_sum(producer: &dyn Producer) -> u32 {
48//! let mut buffer = vec![u32::MAX; producer.num_elements()];
49//! producer.populate_buffer(&mut buffer);
50//! buffer.iter().sum()
51//! }
52//! ```
53//!
54//! If we call this method over and over again, it might be prudent to try to avoid the constant
55//! re-allocation of the vector. Ideally we'd be able to store some persistent buffer in
56//! one of the function arguments, or have `compute_sum` be a method on an object with an
57//! internal buffer. However, sometimes we do not have this luxury, perhaps if we're constrained
58//! to fit into an existing API that does not allow for buffers to be passed in. An alternative
59//! then might be to store the buffer in thread-local storage. Using thread-local storage,
60//! the above `compute_sum` function might look like this:
61//!
62//! ```rust
63//! # pub trait Producer {
64//! # fn num_elements(&self) -> usize;
65//! # fn populate_buffer(&self, buffer: &mut [u32]);
66//! # }
67//! fn compute_sum(producer: &dyn Producer) -> u32 {
68//! thread_local! { static BUFFER: std::cell::RefCell<Vec<u32>> = Default::default(); }
69//! BUFFER.with(|rc| {
70//! let mut buffer = rc.borrow_mut();
71//! producer.populate_buffer(&mut *buffer);
72//! buffer.iter().sum()
73//! })
74//! }
75//! ```
76//! Now, let's imagine that we wanted our function to work with a more generic set of types,
77//! rather than `u32` alone. We generalize the `Producer` trait, but quickly realize
78//! that we cannot create a `thread_local!` buffer in the same way.
79//!
80//! ```ignore
81//! use std::ops::{Add, AddAssign};
82//!
83//! pub trait Producer<T> {
84//! fn num_elements(&self) -> usize;
85//! fn populate_buffer(&self, buffer: &mut [T]);
86//! }
87//!
88//! fn compute_sum<T>(producer: &dyn Producer<T>) -> T
89//! where
90//! T: 'static + Default + Copy + std::iter::Sum
91//! {
92//! // Does not compile!
93//! // error[E0401]: can't use generic parameters from outer function
94//! thread_local! { static BUFFER: std::cell::RefCell<Vec<T>> = Default::default(); }
95//! BUFFER.with(|rc| {
96//! let mut buffer = rc.borrow_mut();
97//! buffer.resize(producer.num_elements(), T::default());
98//! producer.populate_buffer(&mut *buffer);
99//! buffer.iter()
100//! .copied()
101//! .sum()
102//! })
103//! }
104//! ```
105//!
106//! It turns out that it's generally difficult to construct a thread local workspace that's
107//! *generic* in its type. However, we can do this with `davenport`:
108//! ```rust
109//! use davenport::{define_thread_local_workspace, with_thread_local_workspace};
110//! use std::ops::{Add, AddAssign};
111//!
112//! # pub trait Producer<T> {
113//! # fn num_elements(&self) -> usize;
114//! # fn populate_buffer(&self, buffer: &mut [T]);
115//! # }
116//! #
117//! fn compute_sum<T>(producer: &dyn Producer<T>) -> T
118//! where
119//! T: 'static + Default + Copy + std::iter::Sum
120//! {
121//! define_thread_local_workspace!(WORKSPACE);
122//! with_thread_local_workspace(&WORKSPACE, |buffer: &mut Vec<T>| {
123//! buffer.resize(producer.num_elements(), T::default());
124//! producer.populate_buffer(buffer);
125//! buffer.iter()
126//! .copied()
127//! .sum()
128//! })
129//! }
130//! ```
131//!
132//! `davenport` gets around the aforementioned restrictions because the *actual* thread-local
133//! variable is an instance of [`Workspace`], which is a container for type-erased work spaces.
134//! Thus, what is really happening in the example above is that a thread-local [`Workspace`] type
135//! is constructed, which we ask for a mutable reference to `Vec<T>`. If the buffer does not
136//! yet exist, it is default-constructed. Otherwise we obtain a previously-used instance.
137//!
138//! # Limitations
139//!
140//! Currently, trying to access the same workspace variable (`WORKSPACE` in the above examples)
141//! recursively with [`with_thread_local_workspace`] will panic, as it relies on
142//! mutably borrowing through [`RefCell`](`std::cell::RefCell`).
143//! While this restriction could technically
144//! be lifted at the cost of increased complexity in `davenport`, it rarely arises in practice
145//! when using sufficiently local workspaces, as opposed to sharing a single workspace variable
146//! across entire crates.
147//!
148
149use std::any::Any;
150use std::cell::RefCell;
151use std::thread::LocalKey;
152
153/// A workspace that contains type-erased objects.
154///
155/// The workspace is intended to hold intermediate data used as workspace in computations.
156/// It is optimized particularly for the case where the same type is accessed many times in a row.
157///
158/// Usually you do not need to use this type directly. Instead, use
159/// [`define_thread_local_workspace`] in conjunction with
160/// [`with_thread_local_workspace`] as described in the
161/// [crate-level documentation](`crate`).
162#[derive(Debug, Default)]
163pub struct Workspace {
164 workspaces: Vec<Box<dyn Any>>,
165}
166
167impl Workspace {
168 /// Attempts to insert the given object into the workspace.
169 ///
170 /// If the insertion was successful, a reference to the object is returned. Otherwise,
171 /// `None` is returned.
172 #[must_use]
173 pub fn try_insert<W: 'static>(&mut self, w: W) -> Option<&mut W> {
174 if self.find_index_of::<W>().is_none() {
175 self.workspaces.push(Box::from(w));
176 self.workspaces.last_mut().unwrap().downcast_mut()
177 } else {
178 None
179 }
180 }
181
182 pub fn try_get<W: 'static>(&self) -> Option<&W> {
183 self.workspaces
184 .iter()
185 .rev()
186 .find_map(|ws| ws.downcast_ref())
187 }
188
189 pub fn try_get_mut<W: 'static>(&mut self) -> Option<&mut W> {
190 self.workspaces
191 .iter_mut()
192 .rev()
193 .find_map(|ws| ws.downcast_mut())
194 }
195
196 pub fn get_or_insert_with<W, F>(&mut self, create: F) -> &mut W
197 where
198 W: 'static,
199 F: FnOnce() -> W,
200 {
201 let existing_ws_idx = self.find_index_of::<W>();
202 let idx = match existing_ws_idx {
203 Some(idx) => idx,
204 None => {
205 let w = create();
206 let idx = self.workspaces.len();
207 self.workspaces.push(Box::new(w) as Box<dyn Any>);
208 idx
209 }
210 };
211
212 // We heuristically assume that the same object is likely to be accessed
213 // many times in sequence. Therefore we make sure that the object is the last entry,
214 // so that on the next lookup, we'll immediately find the correct object
215 let last = self.workspaces.len() - 1;
216 self.workspaces.swap(idx, last);
217
218 let entry = &mut self.workspaces[last];
219 entry
220 .downcast_mut()
221 .expect("Internal error: Downcasting can by definition not fail")
222 }
223
224 pub fn get_or_default<W>(&mut self) -> &mut W
225 where
226 W: 'static + Default,
227 {
228 self.get_or_insert_with(Default::default)
229 }
230
231 fn find_index_of<W: 'static>(&self) -> Option<usize> {
232 // Note: We treat the Vec as a stack, so we search from the end of the vector.
233 self.workspaces.iter().rposition(|ws| ws.is::<W>())
234 }
235}
236
237/// Runs the provided closure with the thread-local workspace as an argument.
238///
239/// This simplifies working with [`Workspace`] when it's stored as a thread-local variable.
240///
241/// Note that the typed workspace must have a [`Default`] implementation.
242///
243/// See the [crate-level documentation](`crate`) for typical usage examples.
244///
245/// ## Panics
246///
247/// Panics if used recursively with the same workspace variable, as it relies on
248/// mutably borrowing through [`RefCell`](`std::cell::RefCell`). See the crate-level documentation for
249/// a discussion of this limitation.
250pub fn with_thread_local_workspace<W: 'static + Default, T>(
251 workspace: &'static LocalKey<RefCell<Workspace>>,
252 f: impl FnOnce(&mut W) -> T,
253) -> T {
254 workspace.with(|refcell_ws| {
255 let mut type_erased_workspace = refcell_ws.try_borrow_mut().expect(
256 "Internal error: Can not recursively use the same workspace variable. \
257 See discussion on limitations in davenport's crate-level documentation.",
258 );
259 let workspace = type_erased_workspace.get_or_default();
260 f(workspace)
261 })
262}
263
264/// Helper macro for easily defining thread-local workspaces.
265///
266/// See the [crate-level documentation](`crate`) for usage instructions.
267#[macro_export]
268macro_rules! define_thread_local_workspace {
269 ($variable_name:ident) => {
270 thread_local! {
271 static $variable_name: std::cell::RefCell<$crate::Workspace>
272 = std::cell::RefCell::new($crate::Workspace::default());
273 }
274 };
275}