Crate scapegoat[−][src]
Expand description
Ordered set and map data structures via an arena-based scapegoat tree (memory-efficient, self-balancing binary search tree).
- Safe:
#![forbid(unsafe_code)]
. - Embedded-friendly:
!#[no_std]
by default. - Validated via differential fuzzing, against the standard library’s
BTreeSet
andBTreeMap
.
About
Three APIs:
- Ordered Set API (
SGSet
) - subset ofBTreeSet
nightly. - Ordered Map API (
SGMap
) - subset ofBTreeMap
nightly. - Binary Tree API (
SGTree
) - backend for the previous two.
Strives for two properties:
-
Maximal safety: strong memory safety guarantees, hence
#![forbid(unsafe_code)]
.- Compile-time safety: no
unsafe
(no raw pointer dereference, etc.). - Debug-time safety:
debug_assert!
for logical invariants exercised in testing. - Runtime safety: no interior mutability (e.g. no need for
Rc<RefCell<T>>
’s runtime check).
- Compile-time safety: no
-
Minimal footprint: low resource use, hence
!#[no_std]
.- Memory-efficient: nodes have only child index metadata, node memory is re-used.
- Recursion-free: all operations are iterative, so stack use is fixed and runtime is minimized.
- Zero-copy: rebuild/removal re-point in-place, nodes are never copied or cloned.
Other features:
- Generic: map keys and set elements can be any type that implements the
Ord
trait. - Arbitrarily mutable: elements can be insert and removed, map values can be mutated.
Usage
SGMap
non-exhaustive, !#[no_std]
API example (would work identically for std::collections::BTreeMap
):
use scapegoat::SGMap;
use smallvec::{smallvec, SmallVec};
const REF_BUF_LEN: usize = 5;
let mut example = SGMap::new();
let mut stack_str = "your friend the";
// Insert "dynamically" (as if heap)
example.insert(3, "the");
example.insert(2, "don't blame");
example.insert(1, "Please");
example.insert(4, "borrow checker");
// Ordered reference iterator
assert!(example
.iter()
.map(|(_, v)| *v)
.collect::<SmallVec<[&str; REF_BUF_LEN]>>()
.iter()
.eq(["Please","don't blame","the","borrow checker"].iter()));
// Container indexing
assert_eq!(example[&3], "the");
// Fast (no search) head removal
let please_tuple = example.pop_first().unwrap();
assert_eq!(please_tuple, (1, "Please"));
// By-predicate removal (iterates all entries)
example.retain(|_, v| !v.contains("a"));
// Extension
let iterable: SmallVec<[(isize, &str); REF_BUF_LEN]> =
smallvec![(1337, "safety!"), (0, "Leverage"), (100, "for")];
example.extend(iterable.into_iter());
// Value mutation
if let Some(three_val) = example.get_mut(&3) {
*three_val = &mut stack_str;
}
// New message :)
assert!(example
.into_values()
.collect::<SmallVec<[&str; REF_BUF_LEN]>>()
.iter()
.eq(["Leverage","your friend the","borrow checker","for","safety!"].iter()));
Additional examples here.
Configuring a Stack Storage Limit
The maximum number of stack-stored elements (set) or key-value pairs (map/tree) is determined at compile-time, via the environment variable SG_MAX_STACK_ELEMS
.
Valid values include the range [0, 32]
and powers of 2 up to 1,048,576
.
For example, to store up to 2048
items on the stack:
export SG_MAX_STACK_ELEMS=2048
cargo build --release
Please note:
-
If the
SG_MAX_STACK_ELEMS
environment variable is not set, it will default to1024
. -
For any system with heap memory: the first
SG_MAX_STACK_ELEMS
elements are stack-allocated and the remainder will be automatically heap-allocated. -
For embedded systems with only stack memory:
SG_MAX_STACK_ELEMS
is a hard maximum - attempting to insert beyond this limit will cause a panic.- Use feature
high_assurance
to ensure error handling and avoid panic (see below).
- Use feature
Warning: Although stack usage is constant (no recursion), a stack overflow can happen at runtime if
SG_MAX_STACK_ELEMS
(configurable) and/or the stored item type (generic) is too large. Note stack overflow is distinct from buffer overflow (which safe Rust prevents). Regardless, you must test to ensure you don’t exceed the stack frame(s) size limit of your target platform. Rust only supports stack probes on x86/x64, although creative linking solutions have been suggested for other architectures.
For more advanced configuration options, see the documentation here.
The high_assurance
Feature
For embedded use cases prioritizing robustness (or kernelspace code), the high_assurance
feature can be enabled with the standard Cargo.toml
declaration:
[dependencies]
scapegoat = { version = "^1.7", features = ["high_assurance"] }
Enabling this feature makes two changes:
-
Front-end, API Tweak:
insert
andappend
APIs now returnResult
.Err
indicates stack storage is already at maximum capacity, so caller must handle. No heap use, no panic potential on insert. -
Back-end, Integer Packing: Because the fixed/max size of the stack arena is known, indexing integers (metadata stored at every node!) can be size-optimized. This memory micro-optimization honors the original design goals of the scapegoat data structure.
That second change is a subtle but interesting one. Example of packing saving 53% (31 KB) of RAM usage:
use core::mem::size_of;
use scapegoat::SGMap;
// If you're on a 64-bit system, you can compile-time check the below numbers yourself!
// Just do:
//
// $ cargo test --doc
// $ cargo test --doc --features="high_assurance"
//
// One command per set of `cfg` macros below.
// Internally, this compile-time struct packing is done with the `smallnum` crate:
// https://crates.io/crates/smallnum
// This code assumes `SG_MAX_STACK_ELEMS == 1024` (default)
let temp: SGMap<u64, u64> = SGMap::new();
let other_features_enabled = cfg!(any(feature = "fast_rebalance", feature = "low_mem_insert"));
if temp.capacity() == 1024 && (!other_features_enabled) {
// Without packing
#[cfg(target_pointer_width = "64")]
#[cfg(not(feature = "high_assurance"))] // Disabled
{
assert_eq!(size_of::<SGMap<u64, u64>>(), 57_440);
}
// With packing
#[cfg(target_pointer_width = "64")]
#[cfg(feature = "high_assurance")] // Enabled
{
assert_eq!(size_of::<SGMap<u64, u64>>(), 26_688);
}
}
Considerations
General Goals
This project is an exercise in safe, portable data structure design. The goal is to offer embedded developers familiar, ergonomic APIs on resource constrained systems that otherwise don’t get the luxury of dynamic collections. Without sacrificing safety.
scapegoat
is not as fast or mature as the standard library’s BTreeMap
/BTreeSet
(benchmarks via cargo bench
).
The standard library has been heavily optimized for cache performance.
This library is optimized for low, stack-only memory footprint.
It offers:
-
Best-effort Compatibility: APIs are a subset of
BTreeMap
’s/BTreeSet
’s, making it a somewhat “drop-in” replacement for!#[no_std]
systems. Please open an issue if an API you need isn’t yet supported. -
Dynamic Validation: Coverage-guided differential fuzzing is used to demonstrate that this implementation is logically equivalent and equally reliable.
-
Tunable Performance: A single floating point value optimizes relative performance of
insert
,get
, andremove
operation classes. And it can be changed at runtime.
Algorithmic Complexity
Space complexity is always O(n)
.
Operation | Average Case | Worst Case |
---|---|---|
get | O(log n) | O(log n) |
insert | O(log n) | Amortized O(log n) |
remove | O(log n) | Amortized O(log n) |
The low_mem_insert
and fast_rebalance
features can be used to fine-tune tradeoffs of memory usage and speed.
Memory Footprint Demos
-
Code size demo -
SGMap<usize, usize>
withinsert
,get
, andremove
called: 18.8KB for an x86-64 binary. Caveat: you’ll likely want to use more than 3 functions, resulting in more executable code getting included. -
Stack space demo -
SGMap<u8, u8>
with a 256 pair capacity: 2.6KB storage cost. Caveat: 2-3x more stack space is required for runtime book keeping (e.g. rebalancing).
Trusted Dependencies
This library has three dependencies, each of which have no dependencies of their own (e.g. exactly three total dependencies).
smallvec
-!#[no_std]
compatibleVec
alternative. Used in Mozilla’s Servo browser engine.micromath
-!#[no_std]
,#![forbid(unsafe_code)]
floating point approximations.smallnum
-!#[no_std]
,#![forbid(unsafe_code)]
integer packing.
License and Contributing
Licensed under the MIT license. Contributions are welcome!
Structs
An owning iterator over the keys of a SGMap
.
An owning iterator over the values of a SGMap
.
An iterator over the keys of a SGMap
.
Ordered map.
A wrapper interface for SGTree
.
API examples and descriptions are all adapted or directly copied from the standard library’s BTreeMap
.
Ordered set.
API examples and descriptions are all adapted or directly copied from the standard library’s BTreeSet
.
A memory-efficient, self-balancing binary search tree.
An iterator over the values of a SGMap
.
A mutable iterator over the values of a SGMap
.
Enums
Errors for fallible operations.