# masstree

A high-performance concurrent ordered map for Rust. It stores keys as `&[u8]` and supports variable-length keys by building a trie of B+trees, based on the Masstree paper.

**Disclaimer:** This is an independent implementation. It is not endorsed by, affiliated with, or connected to the original Masstree authors or their institutions.
## Features

- Ordered map for byte keys (lexicographic ordering)
- Lock-free reads with version validation
- Concurrent inserts and deletes with fine-grained leaf locking
- Zero-copy range scans with `scan_ref` and `scan_prefix`
- High-throughput value-only scans with `scan_values` (skips key materialization)
- Memory reclamation via the hyaline scheme (`seize` crate)
- Lazy leaf coalescing for deleted entries
- Two node widths: `MassTree` (WIDTH=24) and `MassTree15` (WIDTH=15)
- `MassTree15Inline`, an extremely high-performance inline variant; only usable with `Copy` types
## Status

v0.6.0 — Major performance enhancements. Core functionality is feature-complete. Beats C++ Masstree on 5/7 benchmarks and Rust alternatives on 11/12 workloads. Passes Miri with the strict-provenance flag. Concurrent data structures require extensive stress testing; the test suite is comprehensive (584 lib tests plus stress tests, 984 tests in total), but edge cases may remain.
| Feature | Status |
|---|---|
| `get`, `get_ref` | Lock-free with version validation |
| `insert` | Fine-grained leaf locking |
| `remove` | Concurrent deletion with memory reclamation |
| `scan`, `scan_ref`, `scan_prefix` | Zero-copy range iteration |
| `scan_values`, `scan_values_rev` | High-throughput value-only scans |
| `DoubleEndedIterator` | Reverse iteration support |
| Leaf coalescing | Lazy queue-based cleanup |
| Memory reclamation | Hyaline scheme via `seize` crate |
Not yet implemented: Entry API, `Extend`/`FromIterator`.

## vs C++ Masstree (12T, 10s)
Results vary between runs and hardware configurations. The same benchmark
(10 hot keys, 12 threads) consistently shows the largest improvement over C++,
achieving 1.7x higher throughput under extreme contention. Possible factors:
- Hyaline memory reclamation — unlike C++'s epoch-based reclamation (EBR), hyaline (via the `seize` crate) allows readers to avoid quiescent-state registration
- Lazy coalescing — empty leaves are queued for deferred cleanup rather than removed inline, avoiding lock-coupling issues during removes
- Sharded length counter — 16 cache-line-aligned shards for `len()` tracking (C++ doesn't track a global count)
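The sharded length counter can be sketched as follows (a minimal illustration assuming 16 shards, relaxed atomics, and a hypothetical `ShardedLen` type; the crate's actual implementation may differ):

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

const SHARDS: usize = 16;

// One cache line per shard so concurrent increments on different
// threads never contend on the same line (illustrative sketch).
#[repr(align(64))]
struct Shard(AtomicIsize);

struct ShardedLen {
    shards: [Shard; SHARDS],
}

impl ShardedLen {
    fn new() -> Self {
        ShardedLen {
            shards: std::array::from_fn(|_| Shard(AtomicIsize::new(0))),
        }
    }

    // Each thread picks a shard (e.g. derived from its thread id)
    // and bumps only that one.
    fn add(&self, shard: usize, delta: isize) {
        self.shards[shard % SHARDS].0.fetch_add(delta, Ordering::Relaxed);
    }

    // len() sums all shards; the result may be slightly stale under
    // concurrency, which is acceptable for a size report.
    fn len(&self) -> isize {
        self.shards.iter().map(|s| s.0.load(Ordering::Relaxed)).sum()
    }
}
```

The trade-off is classic: writes stay cheap and contention-free, while `len()` pays a small O(shards) cost, which is fine for an operation called far less often than insert/remove.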
Note: The optimistic read protocol (version-based OCC) is the original Masstree design,
not a divergence. One minor divergence: `has_changed()` compares against `> (LOCK_BIT | INSERTING_BIT)`
instead of C++'s `> lock_bit`, ignoring both bits 0-1. This is safe because the version
counters (VINSERT/VSPLIT) are the source of truth; INSERTING_BIT is only set while
modifications are in-flight and not yet visible to readers. See src/nodeversion.rs:643-673
for the full safety argument.
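To make the bit-level argument concrete, here is a minimal hypothetical sketch of a version check in this style (the constants, layout, and `NodeVersion` methods below are illustrative assumptions, not the crate's actual `nodeversion.rs`):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Assumed layout for illustration: two status bits at the bottom,
// version counters in the bits above them.
const LOCK_BIT: u64 = 1 << 0;
const INSERTING_BIT: u64 = 1 << 1;
const VERSION_STEP: u64 = 1 << 2; // smallest counter increment

struct NodeVersion(AtomicU64);

impl NodeVersion {
    // Spin until the node is neither locked nor mid-insert,
    // then snapshot the version word.
    fn stable(&self) -> u64 {
        loop {
            let v = self.0.load(Ordering::Acquire);
            if v & (LOCK_BIT | INSERTING_BIT) == 0 {
                return v;
            }
            std::hint::spin_loop();
        }
    }

    // A read is invalid only if a version counter advanced since the
    // snapshot. XOR-ing and comparing against the two status bits
    // ignores transient lock/insert flags, matching the `>` trick
    // described above.
    fn has_changed(&self, snapshot: u64) -> bool {
        let v = self.0.load(Ordering::Acquire);
        (v ^ snapshot) > (LOCK_BIT | INSERTING_BIT)
    }
}
```

The key point: any difference confined to bits 0-1 yields an XOR of at most 3, so only a genuine counter change (bit 2 or higher) makes `has_changed` return true.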
The forward-sequential gap (rw3) has narrowed (the Rust/C++ ratio improved from 57% to 81%) but remains under investigation.
| Benchmark | Rust | C++ | Ratio | Winner |
|---|---|---|---|---|
| rw4 (reverse-seq) | 59.00 | 48.14 | 123% | Rust |
| same (10 hot keys) | 3.56 | 2.09 | 170% | Rust |
| rw2g98 (98% reads) | 25.81 | 23.04 | 112% | Rust |
| uscale (random 140M) | 11.05 | 10.58 | 104% | Rust |
| wscale (wide random) | 9.56 | 9.03 | 106% | Rust |
| rw1 (random insert+read) | 11.01 | 11.23 | 98% | Tie |
| rw3 (forward-seq) | 40.54 | 50.34 | 81% | C++ |
## vs Rust Concurrent Maps (6T Physical, Rigorous)

Source: `runs/run136_read_write.txt`. Config: physical cores only, 200 samples, performance governor.
This can be considered the current baseline.
Note: MassTree's insert() has upsert semantics: it updates existing keys and returns the old
value. TreeIndex's insert() fails on existing keys, requiring a remove()+insert() fallback.
Pure insert benchmarks (13, 14) use fresh keys only, providing a fairer comparison for
insert-heavy workloads where TreeIndex performs better.
NOTE 2: Recent optimizations (batched timeout checks, hybrid spin-yield backoff, countdown-based guard
refresh) improved throughput by +18% on average. Notable gains: zipfian +30%, mixed 50/50 +34%,
pure read +28%. The insert_only_fair benchmark flipped from a loss (0.92x) to a win (1.14x).
| Benchmark | masstree15 | tree_index | skipmap | indexset | MT vs Best |
|---|---|---|---|---|---|
| 01_uniform | 28.13 | 15.44 | 9.49 | 12.66 | 1.82x |
| 02_zipfian | 33.98 | 11.73 | 10.58 | 4.90 | 2.90x |
| 03_shared_prefix | 18.62 | 8.74 | 8.83 | 12.92 | 1.44x |
| 04_high_contention | 65.24 | 15.82 | 13.07 | 3.48 | 4.12x |
| 05_large_dataset | 13.02 | 9.33 | 6.87 | 7.88 | 1.40x |
| 06_single_hot_key | 20.19 | 4.44 | 6.01 | 3.82 | 3.36x |
| 07_mixed_50_50 | 24.74 | 5.69 | 5.18 | 12.09 | 2.05x |
| 08_8byte_keys | 44.71 | 23.41 | 11.83 | 15.97 | 1.91x |
| 09_pure_read | 42.52 | 22.69 | 14.19 | 13.77 | 1.87x |
| 10_remove_heavy | 21.70 | 11.57 | 5.50 | 3.68 | 1.88x |
| 13_insert_only_fair | 22.53 | 19.68 | 11.32 | 5.64 | 1.14x |
| 14_pure_insert | 9.30 | 12.43 | 6.35 | 2.30 | 0.75x |
Single-thread latency: masstree15 achieves 864 µs median read latency vs tree_index 1.31 ms (1.52x faster).
Build time: masstree15 builds at 8.56 Mitem/s vs skipmap 6.33, tree_index 4.46, indexset 1.85 (1.35–4.6x faster).
## Range Scans (6T Physical, Rigorous)

Source: `runs/run139_range_scan_optimized.txt` (inline-optimized). Config: physical cores only, 100 samples, performance governor.
| Benchmark | masstree15_inline | tree_index | MT vs TI | vs run137 |
|---|---|---|---|---|
| 01_sequential_full_scan | 32.29 | 15.37 | 2.10x | +79% |
| 02_reverse_scan | 23.50 | 15.31 | 1.53x | +34% |
| 03_clustered_scan | 31.61 | 15.25 | 2.07x | +76% |
| 04_sparse_scan | 32.19 | 11.60 | 2.77x | +79% |
| 05_shared_prefix_scan | 26.38 | 12.97 | 2.03x | +59% |
| 06_suffix_differ_scan | 15.86 | 12.16 | 1.30x | -30% |
| 07_hierarchical_scan | 17.43 | 11.95 | 1.46x | +56% |
| 08_adversarial_splits | 18.75 | 6.71 | 2.79x | +4% |
| 09_interleaved_scan | 16.27 | 9.51 | 1.71x | -2% |
| 10_blink_stress_scan | 20.75 | 9.81 | 2.11x | +16% |
| 11_random_keys_scan | 20.85 | 10.80 | 1.93x | +17% |
| 12_long_keys_64b_scan | 19.21 | 12.46 | 1.54x | +94% |
| 15_full_scan_aggregate | 1.70 G | 1.04 G | 1.64x | — |
| 16_insert_heavy | 22.60 | 15.93 | 1.42x | new |
| 17_hot_spot | 6.03 | 14.87 | 0.41x | new |
## Install

```toml
[dependencies]
masstree = { version = "0.6.0", features = ["mimalloc"] }
```

MSRV is Rust 1.92+ (Edition 2024).

The `mimalloc` feature sets the global allocator. If your project already uses a custom allocator, omit this feature.
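For reference, a feature-gated global allocator is typically wired up inside the crate like this (an illustrative sketch of the pattern, not necessarily this crate's exact code):

```rust
// Illustrative: how a `mimalloc` cargo feature usually installs the
// allocator. This lives inside the library crate, not in user code.
#[cfg(feature = "mimalloc")]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
```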
## Quick Start

```rust
use masstree::MassTree;

let tree: MassTree<u64> = MassTree::new();
let guard = tree.guard();

// Insert
tree.insert_with_guard(b"apple", 1, &guard).unwrap();
tree.insert_with_guard(b"banana", 2, &guard).unwrap();

// Point lookup
assert_eq!(tree.get(b"apple", &guard), Some(1));

// Remove
tree.remove_with_guard(b"apple", &guard).unwrap();
assert_eq!(tree.get(b"apple", &guard), None);

// Range scan (zero-copy)
for entry in tree.scan_ref(b"a".., &guard) {
    // entries arrive in lexicographic key order
}

// Prefix scan
for entry in tree.scan_prefix(b"ba", &guard) {
    // only keys starting with the given prefix
}
```
## Ergonomic APIs

For simpler use cases, auto-guard versions create guards internally:

```rust
use masstree::MassTree;

let tree: MassTree<u64> = MassTree::new();

// Auto-guard versions (simpler but slightly more overhead per call)
tree.insert(b"apple", 1).unwrap();
tree.insert(b"banana", 2).unwrap();
assert_eq!(tree.get(b"apple"), Some(1));
assert_eq!(tree.len(), 2);
assert!(!tree.is_empty());
tree.remove(b"apple").unwrap();
```
## Range Iteration

```rust
use masstree::MassTree;

let tree: MassTree<u64> = MassTree::new();
let guard = tree.guard();

// Populate
for i in 0..100u64 {
    tree.insert_with_guard(&i.to_be_bytes(), i, &guard).unwrap();
}

// Iterator-based range scan
let lo = 10u64.to_be_bytes();
let hi = 20u64.to_be_bytes();
for entry in tree.range(lo.as_slice()..hi.as_slice(), &guard) {
    // yields entries with 10 <= key < 20, in order
}

// Full iteration
for entry in tree.iter(&guard) {
    // visits every entry in lexicographic key order
}
```
## When to Use

May work well for:

- Long keys with shared prefixes (URLs, file paths, UUIDs)
- Range scans over ordered data
- Mixed read/write workloads
- High-contention scenarios (the trie structure helps here)

Consider alternatives for:

- Unordered point lookups → `dashmap`
- Pure insert-only workloads → `scc::TreeIndex`
- Integer keys only → `congee` (ART-based)
- Read-heavy with rare writes → `RwLock<BTreeMap>`
## Variant Selection

Two variants are provided with different performance characteristics:

| Variant | Best For |
|---|---|
| `MassTree15` | Range scans, writes, shared-prefix keys, contention |
| `MassTree` (WIDTH=24) | Random-access reads, single-threaded point ops |

`MassTree15` tends to perform better in our benchmarks due to cheaper u64 atomics and better cache utilization. Consider it for most workloads unless you have uniform random-access patterns.
```rust
use masstree::{MassTree, MassTree15, MassTreeInline, MassTree15Inline};

// Default: WIDTH=24, Arc-based storage
let tree: MassTree<String> = MassTree::new();

// WIDTH=15, Arc-based storage (recommended for most workloads)
let tree15: MassTree15<String> = MassTree15::new();

// Inline storage for Copy types (no Arc overhead)
let inline: MassTreeInline<u64> = MassTreeInline::new();
let inline15: MassTree15Inline<u64> = MassTree15Inline::new();
```
## How It Works

Masstree splits keys into 8-byte chunks, creating a trie where each node is a B+tree:

```text
Key: "users/alice/profile" (19 bytes)
└─ Layer 0: "users/al" (8 bytes)
   └─ Layer 1: "ice/prof" (8 bytes)
      └─ Layer 2: "ile" (3 bytes)
```

Keys with shared prefixes share upper layers, making lookups efficient for hierarchical data.
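The layer decomposition can be written as a tiny helper (an illustrative sketch; this hypothetical `layer_slices` function is not part of the crate's API, which works on these slices internally rather than materializing them):

```rust
// Split a byte key into per-layer slices: 8-byte chunks, with the
// final layer holding whatever remains (1..=8 bytes).
fn layer_slices(key: &[u8]) -> Vec<&[u8]> {
    key.chunks(8).collect()
}
```

For `"users/alice/profile"` this yields the three layers shown above; two keys sharing a 16-byte prefix would traverse the same layer-0 and layer-1 B+trees before diverging.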
## Examples

The `examples/` directory contains comprehensive usage examples.
## Rayon Integration

MassTree works seamlessly with Rayon for parallel bulk operations:

```rust
use masstree::MassTree15Inline;
use rayon::prelude::*;
use std::sync::Arc;

let tree: Arc<MassTree15Inline<u64>> = Arc::new(MassTree15Inline::new());

// Parallel bulk insert (~10M ops/sec)
(0..1_000_000u64).into_par_iter().for_each(|i| {
    tree.insert(&i.to_be_bytes(), i).unwrap();
});

// Parallel lookups (~45M ops/sec)
let sum: u64 = (0..1_000_000u64)
    .into_par_iter()
    .map(|i| tree.get(&i.to_be_bytes()).unwrap())
    .sum();
```
## Tokio Integration

MassTree is thread-safe, but guards cannot be held across `.await` points:

```rust
use masstree::MassTree15;
use std::sync::Arc;

// (inside an async fn)
let tree: Arc<MassTree15<u64>> = Arc::new(MassTree15::new());

// Spawn async tasks that share the tree
let tree_clone = Arc::clone(&tree);
let handle = tokio::spawn(async move {
    // Acquire guards inside the task and drop them before any .await
    tree_clone.insert(b"key", 1).unwrap();
});
handle.await.unwrap();

// For CPU-intensive operations, use spawn_blocking
let tree_clone = Arc::clone(&tree);
tokio::task::spawn_blocking(move || {
    tree_clone.insert(b"heavy", 2).unwrap();
})
.await
.unwrap();
```
## Crate Features

- `mimalloc` — use mimalloc as the global allocator (recommended)
- `tracing` — enable structured logging to `logs/masstree.jsonl`
## License

MIT. See LICENSE.