bplus-index 0.1.1

Arena-backed B+Tree for in-memory sorted indexes. Zero unsafe, cache-friendly, supports non-unique keys.

Designed for data-oriented architectures where you need secondary indexes on tables backed by slotmaps or ECS-style storage. Supports non-unique keys natively -- the (K, V) pair is the unique identity of each entry.

Features

  • Arena storage -- all nodes live in contiguous Vecs indexed by u32. No Box, no heap allocation per node, no pointer chasing between siblings.
  • Cache-friendly -- each leaf holds up to 64 entries in inline ArrayVecs. Range scans follow a doubly-linked leaf chain with sequential memory access.
  • Non-unique keys -- first-class support for one-to-many and many-to-many secondary indexes. Inner node separators store (K, V) for correct routing.
  • Zero unsafe -- built on arrayvec. No raw pointer manipulation.
  • Minimal API -- insert, delete, update, lookup_unique, lookup_range, range, iter. One struct covers unique, 1:N, and N:N indexes depending on the choice of K and V.

Complexity

Operation      Time
-------------  ------------
insert         O(log n)
delete         O(log n)
update         O(log n)
lookup_unique  O(log n)
lookup_range   O(log n + k)
range          O(log n + k)
iter           O(n)

Where n is the total number of entries and k is the number of results.

Usage

[dependencies]
bplus-index = "0.1"

Unique index

use bplus_index::BPlusIndex;

let mut idx = BPlusIndex::<u64, u32>::new();
idx.insert(42, 1001); // key=42, row_id=1001

assert_eq!(idx.lookup_unique(&42), Some(&1001));

Non-unique 1:N index

use bplus_index::BPlusIndex;

// One score -> many player ids
let mut idx = BPlusIndex::<i32, u64>::new();
idx.insert(1500, 1);
idx.insert(1500, 2);
idx.insert(1500, 3);

let players: Vec<_> = idx.lookup_range(&1500).copied().collect();
assert_eq!(players, vec![1, 2, 3]);

Composite key for N:N index

use bplus_index::BPlusIndex;

// (game_mode, mmr) -> ticket_id
let mut idx = BPlusIndex::<(u8, i32), u64>::new();
idx.insert((1, 1200), 100);
idx.insert((1, 1350), 101);
idx.insert((1, 1400), 102);
idx.insert((2, 1300), 103);

// Range scan: mode 1, mmr 1200..1400
let tickets: Vec<_> = idx.range((1, 1200)..(1, 1400))
    .map(|(_, id)| *id)
    .collect();
assert_eq!(tickets, vec![100, 101]);

Update (re-index)

use bplus_index::BPlusIndex;

let mut idx = BPlusIndex::<i32, u64>::new();
idx.insert(1500, 42);

// Player 42's score changed from 1500 to 1600
idx.update(&1500, 1600, 42);

assert_eq!(idx.lookup_unique(&1500), None);
assert_eq!(idx.lookup_unique(&1600), Some(&42));

Benchmarks

Compared against std::collections::BTreeMap with i64 keys and u64 values, random insertion order.

Measured on Intel Core i7-10870H (8C/16T, 2.20 GHz) running Fedora Linux 43. Run with cargo bench.

Operation   Size   BPlusIndex   BTreeMap   Speedup
----------  -----  -----------  ---------  -------
insert      1K     240 us       338 us     1.4x
insert      10K    3.3 ms       4.8 ms     1.4x
insert      100K   42 ms        68 ms      1.6x
lookup      1K     59 us        91 us      1.5x
lookup      10K    969 us       1.98 ms    2.0x
lookup      100K   16.4 ms      31.9 ms    1.9x
range scan  100K   402 us       1.32 ms    3.3x
delete      1K     228 us       232 us     ~1x
delete      10K    3.1 ms       3.2 ms     ~1x
delete      100K   43 ms        49 ms      1.1x
iter        100K   338 us       1.01 ms    3.0x

Range scans and full iteration show the largest gains (3x) thanks to the linked leaf chain and sequential memory layout.

Lookups benefit from the flat arena reducing cache misses during descent (2x at scale).

Deletes are roughly on par, since rebalancing logic dominates over memory access patterns.

Design notes

Why not BTreeMap?

BTreeMap is a B-Tree (not B+Tree) -- values are stored in both leaf and internal nodes, and there is no leaf chain. Range scans require climbing back up the tree to visit the next subtree. Each node is a separate heap allocation, so sibling traversal involves pointer chasing.

BPlusIndex stores values only in leaves, which are linked for O(1) sibling traversal. Nodes live in flat Vecs for better memory locality.

Why (K, V) separators in inner nodes?

Standard B+Trees use K-only separators, which breaks down when the same key appears in multiple leaves. A query for key K would descend into one subtree and miss entries in adjacent subtrees. By using (K, V) as the separator, the routing is always unambiguous regardless of key cardinality.

Node recycling

Deleted nodes are not deallocated. Their indices are pushed onto a free list and reused by subsequent allocations. This avoids Vec fragmentation and keeps arena indices stable.

License

Licensed under either of

at your option.