masstree

A high-performance concurrent ordered map for Rust.

The Problem

Rust has dashmap for concurrent hash maps (millions of downloads/month), but no high-perm concurrent ordered map crates. The current options are (to my knowledge):

Approach	Problem
`Mutex<BTreeMap>`	Serializes all access
`RwLock<BTreeMap>`	Reader contention on lock acquisition
`crossbeam-skiplist`	Poor cache locality, slower than B+trees

MassTree fills this gap with a cache-efficient B+tree that scales.

Benchmark Highlights

Two benchmark suites: lock_comparison (vs Mutex/RwLock<BTreeMap>) and concurrent_maps (vs crossbeam-skiplist).

Benchmark Caveats (Why The Gap Can Look Huge)

Most read-throughput numbers use MassTree::get_ref(..., &guard) (no Arc clone) with a guard pinned once per thread, this is a best-case for Masstree’s design.
Mutex<BTreeMap> / RwLock<BTreeMap> benchmarks take a lock per operation (as you would with a shared map), and use the standard library locks (not parking_lot).
crossbeam-skiplist::SkipMap does not expose a reusable epoch guard in its high-level API, many operations pin/unpin internally per call. For a closer “same guard model” comparison, use crossbeam_skiplist::base::SkipList with an explicitly pinned guard.
Current node allocation tracking (SeizeAllocator) uses a mutex on allocatio/deallocation, most read-heavy benchmarks don’t allocate nodes (and many “write” mixes are updates), so this overhead won’t show up unless you benchmark split-heavy growth.
Concurrent-write results should be treated as being "WRONG" until remaining write-correctness bug (missing keys under contention) is fixed.

Where MassTree Does Well

Concurrent read throughput (32 threads):

Structure	Throughput	vs MassTree
MassTree	140.7 Mitem/s	—
RwLock	36.1 Mitem/s	3.9x slower
Mutex	4.0 Mitem/s	35x slower

Long keys with deep shared prefixes (16 common bytes), concurrent reads at 16 threads:

Structure	Median	vs MassTree
MassTree	1.46 ms	—
RwLock	3.13 ms	2.1x slower
Mutex	34.8 ms	24x slower

Single-threaded lookups scale better with key length:

Key Size	MassTree	RwLock	Winner
32B (no prefix)	88.5 µs	105.0 µs	MassTree +19%
24B (deep prefix)	87.6 µs	138.6 µs	MassTree +58%
32B (deep prefix)	97.5 µs	116.5 µs	MassTree +19%

Where MassTree Is Slower

Short keys (≤16B), single-threaded lookups:

Key Size	MassTree	RwLock	Difference
8B	70.8 µs	60.5 µs	MassTree 17% slower
16B	82.2 µs	65.2 µs	MassTree 26% slower

The trie structure adds overhead for keys that fit in a single 8-byte slice. This is inherent to the algorithm—each lookup must traverse the trie even for short keys.

Mixed read/write workloads with Zipfian distribution:

Threads	MassTree	SkipMap	Difference
1	9.75 ms	1.91 ms	MassTree 5x slower
16	20.85 ms	6.00 ms	MassTree 3.5x slower

Write-heavy hot spots expose layer creation overhead for long keys.

Concurrent Scaling Comparison

Read scaling from 1 to 32 threads:

Threads	MassTree	RwLock	Mutex
1	1.26 ms	1.20 ms	1.59 ms
8	2.19 ms	3.43 ms	26.0 ms
16	2.94 ms	5.60 ms	53.0 ms
32	5.21 ms	9.77 ms	109 ms

MassTree's optimistic reads avoid lock acquisition entirely—readers never block each other.

Summary

This should be possible, after it is properly implemented with all features and rigorously stress tested for correctness and subtle concurrency issues (there's currently one that's getting actively worked on and is blocking concurrent writes, but I have concrete plan in mind to fix it). I WOULDN'T RECOMMEND USING THIS CRATE FOR ANYTHING CURRENTLY.

Use Case	Recommendation
Short keys (≤16B), single-threaded	Use `BTreeMap`
Long keys (24B+), any thread count	Use MassTree
High read concurrency (8+ threads)	Use MassTree
Write-heavy Zipfian workloads	Use `crossbeam-skiplist`
Mixed workloads, shared key prefixes	Use MassTree

Status: Alpha

Not production ready at all. Core functionality works; APIs may change.

Feature	Status
Concurrent get	Lock-free, version-validated
Concurrent insert	Implemented (CAS fast path + locked fallback), but not yet correct under heavy contention (known “missing keys” bug; fix in progress via permutation freezing)
Split propagation	Leaf and internode splits
Memory safety	Miri strict provenance
Range scans	Planned
Deletion	Planned
Full memory reclamation	Planned (using `seize`)

305 tests passing (unit, integration, property, loom, shuttle)

Quick Start

use masstree::MassTree;
use std::sync::Arc;
use std::thread;

let tree = Arc::new(MassTree::<u64>::new());

let handles: Vec<_> = (0..8).map(|i| {
    let tree = Arc::clone(&tree);
    thread::spawn(move || {
        let guard = tree.guard();

        // Insert (fine-grained locking, not serialized)
        tree.insert_with_guard(b"user:1000", i, &guard).unwrap();

        // Read (lock-free, returns &V)
        if let Some(value) = tree.get_ref(b"user:1000", &guard) {
            println!("Got: {}", *value);
        }
    })
}).collect();

for h in handles { h.join().unwrap(); }

Use mimalloc for Best Performance

MassTree's allocation pattern benefits significantly from mimalloc:

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

Single-threaded improvement: 17% faster. The gap widens under contention.

How It Works

MassTree implements the Masstree algorithm (Mao, Kohler, Morris — EUROSYS 2012):

Trie of B+trees: Keys are split into 8-byte slices. Each slice navigates a B+tree layer. Longer keys chain through layers.

Key: "hello world!"

Layer 0: "hello wo" → B+tree lookup
                ↓
Layer 1: "rld!\0\0\0\0" → B+tree lookup → Value

Optimistic concurrency: Readers check version numbers before and after reading. If a concurrent write occurred, they retry. No locks acquired for reads.

Fine-grained writes: Writers lock only the target leaf node. CAS fast-path for simple inserts avoids locking entirely when possible.

B-link trees: Split operations use sibling pointers for lock-free traversal, allowing reads to proceed during structural modifications.

Memory Reclamation: The NodeAllocator trait abstracts memory management, designed for integration with seize's hyaline-based reclamation scheme.

Limitations

No deletion — keys cannot be removed (planned)
No range scans — ordered iteration not yet implemented (planned)
Arena reclamation — nodes freed only when tree drops
Key length — max 256 bytes, longer keys panic
Long key insert cost — multi-layer creation adds overhead vs skip lists

Running Benchmarks

# Recommended: use mimalloc
cargo bench --bench concurrent_maps --features mimalloc
cargo bench --bench lock_comparison --features mimalloc

# Compare against standard allocator
cargo bench --bench concurrent_maps

# Run tests
cargo test

# Miri (requires nightly)
cargo +nightly miri test

Contributing

This project needs help with:

Permutation Freeze - Currently working this. Please check the open issues.
Deletion + memory reclamation — integrate seize for safe concurrent deletes
Range scans — ordered iteration like BTreeMap::range()
Benchmarking — more workloads, different hardware, methodology validation
Stress testing — adversarial concurrent patterns

References

Disclaimer

Independent Rust implementation of the Masstree algorithm. Not affiliated with or endorsed by the original authors.

License

MIT

masstree 0.1.3