Expand description
Stack-like memory allocator with double-ended allocation support.
§Table of Contents
- Overview
- Basic Walkthrough
- Additional Operations
- Optional Crate Features
- Implementation Notes
- Known Issues
- Example - Temporary Thread-local Allocations
§Overview
Scratchpad
provides a method for quick and safe dynamic allocations of
arbitrary types without relying on the global heap (e.g. using Box
or
Vec
). Allocations are made from a fixed-size region of memory in a
stack-like fashion using two separate stacks (one for each end of the
allocation buffer) to allow different types of allocations with
independent lifecycles to be made from each end.
Such allocators are commonly used in game development, but are also useful in general for short-lived allocations or groups of allocations that share a common lifetime. While not quite as flexible as heap allocations, allocations from a stack allocator are usually much faster and are isolated from the rest of the heap, reducing memory fragmentation.
In addition to general allocation operations, this crate also allows for adding to existing allocations and concatenating adjacent allocations.
§Basic Walkthrough
The following is a step-by-step example of how to create a scratchpad and use it for basic allocation operations.
§Allocating Pool Storage
A scratchpad instance relies on two buffers of memory:
- The pool from which memory is allocated.
- A separate buffer for tracking allocation “markers”. A marker defines a context in which a group of allocations is made. Objects are allocated through markers, and memory is only ever truly “freed” back to the pool when the marker is dropped.
To keep things flexible, the user is required to provide either a static
array, boxed slice, or borrowed slice reference for each. Any basic
integer type (u8
, i64
, etc.) can be used as the array or slice element
type, as well as the CacheAligned
type provided by this crate if
cache-aligned memory is desired.
A handful of macros and
functions are provided to
help with defining buffers with specific storage requirements. Macros and
functions related specifically to allocation pool storage are suffixed
with “_for_bytes
”, while macros and functions related specifically to
marker tracking storage are suffixed with “_for_markers
”. Here, we use
uninitialized_boxed_slice_for_bytes()
and
uninitialized_boxed_slice_for_markers()
to create two boxed slices of
cache-aligned memory for our allocation pool and tracking buffer.
use scratchpad::uninitialized_boxed_slice_for_bytes;
use scratchpad::uninitialized_boxed_slice_for_markers;
use scratchpad::CacheAligned;
use std::mem::MaybeUninit;
const POOL_SIZE: usize = 1usize * 1024 * 1024; // 1 MB
const MARKERS_MAX: usize = 16;
let pool = unsafe {
uninitialized_boxed_slice_for_bytes::<MaybeUninit<CacheAligned>>(
POOL_SIZE
)
};
let tracking = unsafe {
uninitialized_boxed_slice_for_markers::<MaybeUninit<CacheAligned>>(
MARKERS_MAX
)
};
§Creating the Scratchpad Instance
Now that we’ve defined the storage that our scratchpad will use, we can
create a Scratchpad
instance using this storage. Simply call
Scratchpad::new()
, passing the allocation pool and marker tracking
buffer as arguments.
use scratchpad::Scratchpad;
let scratchpad = Scratchpad::new(pool, tracking);
If you decide to use static arrays for both allocation storage and marker
tracking, Scratchpad::static_new()
can be used to create the
scratchpad without having to construct the arrays separately.
Note using large static array buffers can cause the program to run
out of stack space when using new()
or
static_new()
.
Scratchpad::static_new_in_place()
provides the ability to create a
scratchpad within a specific block of (uninitialized) memory while
guaranteeing that the allocation and marker tracking buffers never touch
the stack. Keep in mind that this function is unsafe
, so you may need to
use either boxed slices or slices of externally owned arrays instead of
static arrays if you wish to avoid unsafe
code.
§Creating Allocation Markers
When we’re ready to allocate from the scratchpad, we must first create a
Marker
. A marker defines a context in which allocations will be made.
Multiple allocations can be made from a single marker, but the memory
isn’t freed back to the scratchpad until the marker is dropped, even if
the individual allocations are dropped. This helps keep things fast, as
the scratchpad only needs to perform a limited amount of tracking of
allocated memory.
Markers can be created from either the “front” or “back” of a scratchpad.
Front markers, created using Scratchpad::mark_front()
, allocate memory
from the start of the allocation pool upward, while back markers, created
using Scratchpad::mark_back()
, allocate memory from the end of the
allocation pool downward. The stacks of such allocations are handled
separately, so front markers don’t affect back markers and vice-versa
(with the exception of how much space is still available in the
scratchpad).
Here, we create a front marker for subsequent allocations.
let marker = scratchpad.mark_front().unwrap();
§Allocating Data
Each marker provides a variety of allocation functions, allowing
allocations of single elements or dynamically sized arrays of any type.
Allocation functions can be found in the Marker
docs, and are prefixed
with “allocate_
” (most Marker
trait functions are also wrapped by
methods of the MarkerFront
and MarkerBack
types as well, so you
don’t need to import the Marker
trait into scope to use them).
Allocations are wrapped in an Allocation
instance. Allocation
implements Deref
and DerefMut
, allowing access to the wrapped data
either explicitly using the unary *
operator (e.g. *allocation
) or
implicitly, such as when calling methods provided by the allocated data
type (e.g. allocation.len()
for retrieving the number of elements in an
Allocation<[i32]>
). Allocations also implement StableDeref
, allowing
them to be used with crates that support the trait such as owning_ref
and rental
.
// Allocate a single `f32`.
let f32_allocation = marker.allocate(3.14159f32).unwrap();
assert_eq!(*f32_allocation, 3.14159f32);
// Allocate an array of `i32` values.
let i32_array_allocation = marker
.allocate_array_with(5, |index| index as i32 * 2)
.unwrap();
assert_eq!(*i32_array_allocation, [0, 2, 4, 6, 8]);
One thing to note is that allocations can only be made from the most recently created marker belonging to a given stack (“front” or “back”). A marker can be used again for allocations once any more-recent markers from the same stack are dropped. Creating a new front marker doesn’t prevent allocations from being made from any back markers, and vice-versa.
// Allocations can be made from a marker immediately after it's created...
let first_marker = scratchpad.mark_front().unwrap();
let result = first_marker.allocate(3.14159f32);
assert!(result.is_ok());
{
// ...but creating a second marker will prevent any new allocations
// from being made using the first marker...
let second_marker = scratchpad.mark_front().unwrap();
let result = first_marker.allocate([1, 2, 3, 4, 5]);
assert!(result.is_err());
} // ...until the second marker is dropped (such as when going out of
// scope here).
let result = first_marker.allocate([1, 2, 3, 4, 5]);
assert!(result.is_ok());
// Creating a back marker does not prevent us from allocating from our
// front marker though.
let back_marker = scratchpad.mark_back().unwrap();
let result = first_marker.allocate([6, 7, 8, 9, 10]);
assert!(result.is_ok());
§Freeing Memory
Memory is freed only when the most recently created Marker
from a
given stack (“front” or “back”) is dropped. Despite this, markers are
still allowed to be dropped out-of-order; if a marker that is not the most
recently created marker from its stack is dropped, its memory will simply
be unusable until the more recently created markers are also freed. This
can fragment the memory pool, so it is recommended to drop markers in the
reverse order in which they are created when possible.
Dropping an Allocation
will not cause the scratchpad memory used to be
freed, but it will immediately call any Drop
implementation for the
allocated type. Allocations can also be dropped in any order.
Each Allocation
is bound to the lifetime of the Marker
from which
it is created, and each Marker
is bound to the lifetime of the
Scratchpad
from which it is created, ensuring allocations and markers
cannot outlive their parent objects.
§Additional Operations
Aside from general allocation and use of data, this crate provides support for a handful of additional operations:
- Allocations that are adjacent in memory in the same scratchpad can be
combined using
Allocation::concat()
and its unsafe counterpart,Allocation::concat_unchecked()
. - Data can be added to the most recent allocation from a given stack:
- Data can be appended to the most recent allocation from a front marker
using
MarkerFront::append()
,MarkerFront::append_clone()
, andMarkerFront::append_copy()
. - Data can be prepended to the most recent allocation from a back marker
using
MarkerBack::prepend()
,MarkerBack::prepend_clone()
, andMarkerBack::prepend_copy()
. - The
Marker
trait provides genericextend()
,extend_clone()
, andextend_copy()
methods that either append or prepend depending on whether the marker is a front or back marker.
- Data can be appended to the most recent allocation from a front marker
using
- Arbitrary numbers of slices can be concatenated into a single allocation
at once using
Marker::concat_slices()
,Marker::concat_slices_clone()
, andMarker::concat_slices_copy()
. - Allocations can be converted into slices using
Allocation::into_slice_like_allocation()
. - The data in an allocation can be moved out of the allocation using
Allocation::unwrap()
.
§Optional Crate Features
The following optional features can be set when building this crate:
std
: Implements various traits forBox
andVec
types, allowing boxed slices to be used as storage for allocations and marker tracking as well as both boxes and vectors to be used as slice sources forMarker
methods that take slices as parameters. Enabled by default; can be disabled to build the crate with#![no_std]
.alloc
: Implements the same support forBox
andVec
as thestd
feature, but uses thealloc
crate directly, allowing for use inno_std
code. Requires Rust 1.36.0 or later. Disabled by default.unstable
: Enables unstable Rust features (requires a nightly toolchain). Disabled by default. Enabling this feature includes:- Declaration of the function
Scratchpad::new()
asconst
. - Support for various features that were still unstable with legacy Rust
releases:
Box
andVec
support forno_std
code (enabled without theunstable
feature when using Rust 1.36.0 or later with thealloc
feature enabled).ByteData
trait implementations foru128
andi128
(enabled without theunstable
feature when using Rust 1.26.0 or later).ByteData
trait implementation for allstd::mem::MaybeUninit
types wrapping otherByteData
types (enabled without theunstable
feature when using Rust 1.36.0 or later).
- Declaration of the function
§Implementation Notes
§Memory Management
Scratchpad memory usage is explicitly controlled by the user. While this adds a bit to the complexity of setting up a scratchpad for use, it allows for scratchpads to be used under a variety of constraints, such as within low-memory embedded environments.
§Buffer Types
The backing data structures used for allocation storage and marker
tracking are specified by the generic parameters of the Scratchpad
type.
The allocation storage type is provided by the BufferT
generic
parameter, which is constrained by the Buffer
trait. Allocation
storage must be a contiguous region of memory, such as a static array,
boxed slice, or mutable slice reference. Array element types are
constrained to types that implement the ByteData
trait; by default,
this only includes basic integer types and the CacheAligned
struct
provided by this crate.
The marker tracking buffer type is provided by the TrackingT
generic
parameter, which is constrained by the Tracking
trait. This type must
allow storage and retrieval of a fixed number of usize
values. Unlike
Buffer
, there is no restriction on how the values are stored so long
as they can be set and subsequently retrieved by index. Any type
implementing Buffer
also implements Tracking
by default.
§Macros and Functions for Handling Buffers
Writing out the math needed to determine the number of elements needed for an array or boxed slice of a specific byte or marker capacity can be tedious and error-prone. This crate provides some macros and functions to help reduce the amount of work needed for declaring buffers based on their byte capacity or marker capacity:
array_type_for_bytes!()
andarray_type_for_markers!()
can be used to declare static array types based on their capacity.cache_aligned_zeroed_for_bytes!()
andcache_aligned_zeroed_for_markers!()
provide shorthand for creating static arrays ofCacheAligned
elements with their contents zeroed out.uninitialized_boxed_slice_for_bytes()
anduninitialized_boxed_slice_for_markers()
can be used to allocate memory for a boxed slice of a given capacity without initializing its contents.
Some lower level macros and functions are also available if needed:
array_len_for_bytes!()
andarray_len_for_markers!()
return the number of elements needed for a static array with a given capacity.cache_aligned_zeroed!()
provides shorthand for creating a singleCacheAligned
value with its contents zeroed out.uninitialized_boxed_slice()
allocates a boxed slice of a given number of elements without initializing its contents.
Additionally, all of the macros listed above evaluate to constant expressions when given constant expressions for input values.
§Data Alignment
The alignment requirements of allocated objects are handled by padding the offset of the allocation within the allocation buffer. This can result in some amount of wasted space if mixing allocations of types that have different alignment requirements. This waste can be minimized if necessary by grouping allocations based on their alignment requirements or by using separate scratchpads for different alignments.
§Buffer
and Tracking
Alignments
The alignment of the allocation buffer and marker tracking themselves are
determined by the element type of the corresponding slice or array used,
as specified by the generic parameters of the Scratchpad
type and the
parameters of Scratchpad::new()
. If the alignment of the buffer itself
doesn’t match the alignment needed for the first allocation made, the
allocation will be offset from the start of the buffer as needed,
resulting in wasted space.
To avoid this, use an element type for the buffer’s slice or array that
provides at least the same alignment as that which will be needed for
allocations. For most types, a slice or array of u64
elements should
provide sufficient alignment to avoid any initial wasted space.
Allocations that require non-standard alignments may require defining a
custom ByteData
type with an appropriate #[repr(align)]
attribute to
avoid any wasted space at the start of the allocation buffer. For example,
a Scratchpad
guaranteeing a minimum initial alignment of 16 bytes for
SIMD allocations can be created as follows:
use std::mem::MaybeUninit;
use scratchpad::{ByteData, Scratchpad};
#[repr(C, align(16))]
struct Aligned16([u32; 4]);
unsafe impl ByteData for Aligned16 {}
let scratchpad = Scratchpad::<
[MaybeUninit<Aligned16>; 1024],
[MaybeUninit<usize>; 4],
>::new(
unsafe { MaybeUninit::uninit().assume_init() },
unsafe { MaybeUninit::uninit().assume_init() },
);
§Alignment of Allocated Arrays
When allocating a dynamically sized array from a Marker
(i.e. using one of the allocate_array*()
or allocate_slice*()
methods), the array is only guaranteed to be aligned based on the
requirements of the element type. This means that, for example, it is
unsafe to use an array of u8
values as a buffer in which f32
values
will be written to or read from directly. It is strongly recommended that
you only use arrays allocated from a marker as the element type specified,
or that the array is allocated using an element type whose alignment is at
least as large as the data it will contain.
§Cache Alignment
Applications may prefer to keep data aligned to cache lines to avoid
performance issues (e.g. multiple cache line loads for data crossing cache
line boundaries, false sharing). The crate provides a custom data type,
CacheAligned
, that can be used as the backing type for both allocation
buffers and marker tracking. Simply providing an array or slice of
CacheAligned
objects instead of a built-in integer type will help
ensure cache alignment of the buffer.
This crate uses 64 bytes as the assumed cache line alignment, regardless
of the build target. While the actual cache line alignment can vary
between processors, 64 bytes is generally assumed to be a “safe” target.
This value is exported in the CACHE_ALIGNMENT
constant for
applications that wish to reference it.
The size of a single CacheAligned
element will always be the same as
the cache alignment, so buffers based on CacheAligned
will always be
a multiple of CACHE_ALIGNMENT
in size.
§Memory Overhead
Creating a marker requires a single usize
value (4 bytes on platforms
using 32-bit pointers, 8 bytes if using 64-bit pointers) within the
scratchpad’s tracking buffer. When using a slice or array for marker
tracking, this memory is allocated up-front, so the footprint of the
Scratchpad
does not change after the scratchpad is created.
Each Marker
instance itself contains a reference back to its
Scratchpad
and its index within the scratchpad. This comes out to 8
bytes on platforms with 32-bit pointers and 16 bytes on platforms with
64-bit pointers.
Individual allocations have effectively no overhead. Each Allocation
instance itself only contains a pointer to its data, whose size can vary
depending on whether a fat pointer is necessary (such as for dynamically
sized arrays allocated using one of the allocate_array*()
or
allocate_slice*()
marker methods).
§Limitations
- Due to a lack of support in Rust for generically implementing traits for
any size of a static array of a given type, traits that are implemented
for static array types such as
Buffer
andSliceSource
are only implemented for a limited number of array sizes. Mutable slice references and boxed slices do not have this restriction, so they can be used for unsupported array sizes if necessary. SliceSourceCollection
andSliceMoveSourceCollection
(used by the slice concatenation methods ofMarker
) only support tuples with up to 12 items (moststd
library traits implemented for tuples are also limited to 12 items as well).
§Mutability Notes
Scratchpad
uses internal mutability when allocating and freeing
memory, with allocation methods operating on immutable Scratchpad
and
Marker
references. This is necessary to cleanly allow for multiple
concurrent allocations, as Rust’s mutable borrowing restrictions would
otherwise prevent such behavior. Allocation and free operations do not
have any side effect on other existing allocations, so there are no
special considerations necessary by the user.
§Known Issues
-
This crate allows for working with uninitialized memory for allocation and tracking buffers, but predates the
MaybeUninit
type. To retain backwards-compatibility with earlier crate releases, array and slice types used for uninitialized memory are allowed to contain elements of anyByteData
type and not justMaybeUninit
-wrapped types.ByteData
types are restricted to the built-in integer types and some aggregates of those types, and while such values can safely store any possible pattern of bits without causing undefined behavior, the rules in Rust for using integers whose memory is specifically uninitialized have not been finalized. Until such rules have been finalized, it is recommended to useMaybeUninit
variants of these types if possible when working with uninitialized memory to avoid undefined behavior.Affected functions include:
-
IntoMutSliceLikePtr::into_mut_slice_like_ptr()
is not declared asunsafe
, as the implementations included by this crate don’t need to read the data pointed to by pointer given at this time (they can operate entirely using the pointer values themselves, so passing null pointers won’t cause a segmentation fault). This safety cannot be guaranteed, as future changes to Rust or additionalIntoMutSliceLikePtr
implementations may need to read the data; in particular,CStr
may be changed at some point to no longer compute and store the length as part of a fat pointer when created, so itsinto_mut_slice_like_ptr()
implementation would need to read the string itself in order to produce a valid slice).Adding
unsafe
is potentially a breaking change for anyone currently using theIntoMutSliceLikePtr
trait directly in their code, so this will not be addressed until the 2.0 release of this crate.This is only a safety issue that does not affect crate functionality or any code that does not use the
IntoMutSliceLikePtr
trait directly, so it can largely be ignored.
§Example - Temporary Thread-local Allocations
Applications can create per-thread scratchpads for temporary allocations using thread-local variables, reducing fragmentation of the global memory heap and improving performance. The following example shows how one might use temporary storage to interface with a third-party library within some abstraction layer:
#[macro_use]
extern crate scratchpad;
use scratchpad::{CacheAligned, Scratchpad};
use scratchpad::uninitialized_boxed_slice_for_bytes;
use std::mem::{MaybeUninit, size_of};
use std::os::raw::c_void;
/// Thread-local scratchpad size, in bytes (1 MB).
const THREAD_LOCAL_SCRATCHPAD_SIZE: usize = 1024 * 1024;
/// Maximum thread-local scratchpad allocation marker count.
const THREAD_LOCAL_SCRATCHPAD_MAX_MARKERS: usize = 8;
/// Buffer type for thread-local scratchpad allocations. By using
/// `CacheAligned`, we avoid false sharing of cache lines between threads.
type ThreadLocalScratchBuffer = Box<[MaybeUninit<CacheAligned>]>;
/// Buffer type for tracking of thread-local scratchpad markers (each
/// marker requires a `usize` value).
type ThreadLocalScratchTracking = array_type_for_markers!(
MaybeUninit<CacheAligned>,
THREAD_LOCAL_SCRATCHPAD_MAX_MARKERS,
);
thread_local! {
/// Thread-local scratchpad. The initial contents of the allocation
/// buffer and marker tracking buffer are ignored, so we can create
/// them as uninitialized.
pub static THREAD_LOCAL_SCRATCHPAD: Scratchpad<
ThreadLocalScratchBuffer,
ThreadLocalScratchTracking,
> = unsafe { Scratchpad::new(
uninitialized_boxed_slice_for_bytes(THREAD_LOCAL_SCRATCHPAD_SIZE),
MaybeUninit::uninit().assume_init(),
) };
}
/// Rust bindings for part of some fictional third-party API written in
/// C/C++.
pub mod libfoo {
use std::os::raw::c_void;
#[repr(C)]
pub enum SequenceType {
Integer,
Float,
Double,
}
#[derive(Debug, PartialEq)]
#[repr(C)]
pub enum SequenceResult {
Red,
Green,
Blue,
}
#[repr(C)]
pub struct SequenceParameters {
pub data_type: SequenceType,
pub data_count: usize,
pub data: *const c_void,
}
pub unsafe extern "C" fn process_sequences(
sequence_count: usize,
sequences: *const SequenceParameters,
) -> SequenceResult {
// ...
}
}
/// Our abstraction of `libfoo::process_sequences()`, in which we only
/// ever work with `f32` data.
pub fn process_float_sequences<I, E, S>(
sequences: I,
) -> Result<libfoo::SequenceResult, scratchpad::Error<()>>
where
I: IntoIterator<Item = S, IntoIter = E>,
E: ExactSizeIterator<Item = S>,
S: AsRef<[f32]>,
{
THREAD_LOCAL_SCRATCHPAD.with(|scratchpad| {
// We need an array of `libfoo::SequenceParameters` structs for
// the call to `libfoo::process_sequences()`.
let mut sequences = sequences.into_iter();
let sequences_len = sequences.len();
let marker = scratchpad.mark_front()?;
let foo_sequences = marker.allocate_array_with(
sequences_len,
|index| {
let data = sequences.next().unwrap();
let data_ref = data.as_ref();
libfoo::SequenceParameters {
data_type: libfoo::SequenceType::Float,
data_count: data_ref.len(),
data: data_ref.as_ptr() as *const c_void,
}
}
)?;
Ok(unsafe { libfoo::process_sequences(
sequences_len,
foo_sequences.as_ptr(),
) })
// The marker is dropped as it goes out of scope, freeing the
// allocated memory.
})
}
fn main() {
let sequence_a = [2.22f32, 9.99f32, -1234.56f32];
let sequence_b = [-1.0f32, 8.8f32, 27.0f32, 0.03f32];
let sequence_c = [88.0f32];
let sequences = [&sequence_a[..], &sequence_b[..], &sequence_c[..]];
assert_eq!(
process_float_sequences(&sequences).unwrap(),
libfoo::SequenceResult::Red,
);
}
Macros§
- Returns the minimum number of elements of a given type necessary for storage of a given byte count. The actual supported byte count may be larger due to padding.
- Returns the minimum number of elements of a given type necessary for tracking of at least the specified number of allocation markers. The actual supported marker count may be larger due to padding.
- Declares a static array of the specified element type that is large enough for storage of a given byte count. The actual supported byte count may be larger due to padding.
- Declares a static array of the specified element type that is large enough for storage of at least the specified number of allocation markers. The actual supported marker count may be larger due to padding.
- Creates a
CacheAligned
instance whose contents are zeroed-out. - Creates an array of zeroed-out
CacheAligned
values large enough for storage of the given byte count. The actual supported byte count may be larger due to padding. - Creates an array of zeroed-out
CacheAligned
values large enough for storage of at least the specified number of allocation markers. The actual supported marker count may be larger due to padding.
Structs§
- Scratchpad
Marker
allocation. - The error type for scratchpad operations.
Scratchpad
marker for allocations from the back of the allocation buffer.Scratchpad
marker for allocations from the front of the allocation buffer.- Stack-like dynamic memory pool with double-ended allocation support.
Enums§
- Categories of errors during scratchpad operations.
Constants§
- Assumed cache line byte alignment.
Traits§
- Trait for static arrays.
- Trait for
Scratchpad
buffer types. - Trait for types that can be safely used as the backing data type for storage of arbitrary data.
- Extension of the
SliceLike
trait used to mark DSTs for which concatenation can safely be performed by concatenating the underlying slice type. - Trait for reinterpreting a pointer to a given type as a compatible
SliceLike
pointer that uses the exact same backing memory. Scratchpad
allocation marker implementation trait.- Trait for dynamically sized types that wrap some slice type.
- Subtrait of
SliceSource
for taking ownership of the contents of a slice. - Subtrait of
SliceSourceCollection
for taking ownership of the contents of a collection of slice sources. - Trait for sources of slice data provided to
Marker
trait methods. - Trait for generic access to collections of
SliceSource
objects. Buffer
sub-trait for static arrays.- Trait for
Scratchpad
allocation tracking containers.
Functions§
- Returns the size of a type in bytes.
- Returns a boxed slice of a given length whose data is uninitialized.
- Returns a boxed slice of uninitialized data large enough to hold at least the specified number of bytes.
- Returns a boxed slice of uninitialized data large enough for tracking of at least the specified number of allocation markers.