scry-index 0.1.0


scry-index

A concurrent sorted key-value map backed by learned index structures.

Uses piecewise linear models (LIPP chain method) to predict key positions, giving lock-free reads and CAS-based writes with no global lock. Based on the SALI paper (SIGMOD 2024).

Usage

use scry_index::LearnedMap;

let map = LearnedMap::new();
let m = map.pin();

m.insert(42u64, "the answer");
m.insert(1u64, "first");
m.insert(100u64, "last");

assert_eq!(m.get(&42), Some(&"the answer"));

for (k, v) in m.range(1u64..50) {
    println!("{k}: {v}");
}

All operations take &self — share across threads with Arc:

use std::sync::Arc;
use scry_index::LearnedMap;

let map = Arc::new(LearnedMap::new());

std::thread::scope(|s| {
    // Writers
    for t in 0..4 {
        let map = Arc::clone(&map);
        s.spawn(move || {
            let m = map.pin();
            for i in 0..1000u64 {
                m.insert(t * 1000 + i, i);
            }
        });
    }
    // Concurrent readers — no locks, no contention
    for _ in 0..8 {
        let map = Arc::clone(&map);
        s.spawn(move || {
            let m = map.pin();
            let _ = m.get(&42);
        });
    }
});

When to use scry-index

Read-heavy concurrent workloads with sorted keys — time-series queries, lookup tables, caches, analytics indexes, in-memory stores where multiple threads read and a few threads write. See Design trade-offs for where other structures are a better fit.

Benchmarks

100K sequential keys, bulk-loaded. Measured on the same machine; see examples/simulate.rs for the full workload.

Point lookups (single-threaded)

| Operation       | LearnedMap | BTreeMap | SkipMap | vs BTreeMap | vs SkipMap |
|-----------------|------------|----------|---------|-------------|------------|
| Sequential keys | 0.6 ms     | 2.8 ms   | 4.5 ms  | 4.7x faster | 7.6x faster |
| Random keys     | 0.9 ms     | 5.4 ms   | 15.6 ms | 6.2x faster | 18x faster  |

Concurrent (100K keys, 8 reader + 4 writer threads)

| Operation      | LearnedMap | RwLock&lt;BTreeMap&gt; | SkipMap | vs BTreeMap | vs SkipMap |
|----------------|------------|------------------|---------|-------------|------------|
| 8T point reads | 1.6 ms     | 9.8 ms           | 9.2 ms  | 6.2x faster | 5.9x faster |
| 4T writes      | 1.5 ms     | 3.4 ms           | 2.3 ms  | 2.2x faster | 1.5x faster |
| 8R + 4W mixed  | 3.6 ms     | 27.3 ms          | 11.5 ms | 7.5x faster | 3.2x faster |

Writes (single-threaded)

| Operation                          | LearnedMap | BTreeMap | vs BTreeMap |
|------------------------------------|------------|----------|-------------|
| Incremental insert (10K into 100K) | 0.2 ms     | 0.3 ms   | 1.5x faster |
| Bulk load                          | 1.4 ms     | 0.7 ms   | 1.9x slower |
Reproduce with:

cargo run --example simulate --release       # full 3-way workload comparison
cargo run --example mini_bench --release     # single-threaded breakdown
cargo bench                                  # criterion microbenchmarks

Design trade-offs

These are structural properties of concurrent learned indexes (SALI, LIPP, ALEX), not implementation gaps.

O(1) lookups vs O(log n). The model-predicted lookup advantage grows with scale and is largest under random access, where B-tree variants suffer cache misses at depth. This is the core benefit of learned indexes.
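The mechanism can be pictured with a self-contained sketch (toy code, not the crate's implementation): fit a line through the endpoints of a sorted slice, predict the target's slot, then search locally around the prediction.

```rust
/// Toy model-predicted lookup. For keys that match the linear model,
/// the local search touches O(1) slots instead of the O(log n) probes
/// (and cache misses) of a comparison-based search.
fn lookup(keys: &[u64], target: u64) -> Option<usize> {
    let n = keys.len();
    if n == 0 {
        return None;
    }
    // Linear model through the endpoints: slot ≈ slope * (key - min).
    let (min, max) = (keys[0], keys[n - 1]);
    let slope = (n - 1) as f64 / (max - min).max(1) as f64;
    let guess = ((target.saturating_sub(min) as f64) * slope) as usize;
    // Search outward from the predicted slot to bracket the target.
    let mut lo = guess.min(n - 1);
    let mut hi = lo;
    while lo > 0 && keys[lo] > target {
        lo -= 1;
    }
    while hi + 1 < n && keys[hi] < target {
        hi += 1;
    }
    (lo..=hi).find(|&i| keys[i] == target)
}

fn main() {
    let seq: Vec<u64> = (0..100).collect();
    assert_eq!(lookup(&seq, 42), Some(42)); // exact model: one probe
    assert_eq!(lookup(&[1, 5, 9, 20, 100], 9), Some(2));
    assert_eq!(lookup(&[1, 5, 9, 20, 100], 50), None);
    println!("ok");
}
```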

Range scans are slower than BTreeMap (~30x for pure sequential scans). The LIPP chain method places keys at model-predicted slots across a tree of nodes; in-order traversal is a DFS with pointer chasing. BTreeMap's contiguous sorted leaves make range scans essentially sequential memory reads. This is the fundamental trade-off of the chain method — it is what enables lock-free concurrent inserts without data shifting. Under concurrent read/write workloads the gap narrows significantly because BTreeMap requires a write lock while LearnedMap readers never block.

Random-key insert is slower than BTreeMap (~3-5x from empty). Keys that don't match the node's linear model collide into the same slot, creating deeper chains until a localized rebuild fires. All learned indexes assume some structure in the key distribution — sequential, time-series, and bulk-loaded keys fit the model well. Use bulk_load when data is pre-sorted.
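A toy way to see the distribution sensitivity (self-contained sketch, not crate code): predict a slot for each sorted key with a single linear model and count slots that receive more than one key. Sequential keys fit the line exactly; skewed keys pile up, which in a LIPP-style node means deeper chains and eventual rebuilds.

```rust
/// Count slots that receive more than one key when each key in a sorted
/// slice is placed at its model-predicted position.
fn collisions(keys: &[u64]) -> usize {
    let n = keys.len();
    let (min, max) = (keys[0], keys[n - 1]);
    let slope = (n - 1) as f64 / (max - min).max(1) as f64;
    let mut slots = vec![0usize; n];
    for &k in keys {
        let s = (((k - min) as f64) * slope) as usize;
        slots[s.min(n - 1)] += 1;
    }
    slots.iter().filter(|&&c| c > 1).count()
}

fn main() {
    // Sequential keys fit a line exactly: zero collisions.
    let seq: Vec<u64> = (0..1000).collect();
    assert_eq!(collisions(&seq), 0);

    // Skewed keys (clustered low values) collide into the same slots.
    let mut skew: Vec<u64> = (0..999u64).map(|i| i * i).collect();
    skew.push(1_000_000);
    assert!(collisions(&skew) > 0);
    println!("skewed collisions: {}", collisions(&skew));
}
```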

Per-operation epoch overhead. Lock-free reclamation (crossbeam-epoch) adds a small fixed cost to every operation. This is shared by all epoch-based concurrent structures (including SkipMap). The cost is amortized under actual concurrency — single-threaded insert is ~1.3x slower than BTreeMap, but concurrent insert is 2.2x faster.

Features

  • Lock-free reads via epoch-based reclamation
  • CAS-based writes, per-slot contention only
  • Sorted iteration and range queries (a..b, a..=b, a.., ..b, ..)
  • Bulk loading from sorted data (bulk_load, bulk_load_dedup)
  • Depth-triggered localized subtree rebuilds
  • Tombstone compaction for remove-heavy workloads
  • get_or_insert / get_or_insert_with entry API
  • clear() / drain() for bulk removal
  • allocated_bytes() for memory introspection
  • LearnedSet wrapper for key-only use
  • Optional serde support (features = ["serde"])
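Tombstone compaction can be pictured with a toy model (assumed mechanics, not crate internals): removals leave holes in place so concurrent readers never observe shifted entries, and once the dead-entry ratio passes a threshold, the slots are rebuilt densely.

```rust
/// Rebuild a slot array without its tombstones.
fn compact(slots: &[Option<u64>]) -> Vec<u64> {
    slots.iter().flatten().copied().collect()
}

fn main() {
    const THRESHOLD: f64 = 0.3; // mirrors the configurable tombstone_ratio_threshold

    let mut slots: Vec<Option<u64>> = (0..100u64).map(Some).collect();

    // Remove-heavy phase: every third key becomes a tombstone,
    // left in place so readers never see shifted data.
    for i in (0..slots.len()).step_by(3) {
        slots[i] = None;
    }

    let dead = slots.iter().filter(|s| s.is_none()).count();
    if dead as f64 / slots.len() as f64 > THRESHOLD {
        // Ratio is 0.34 here, so compaction fires.
        let live = compact(&slots);
        assert_eq!(live.len(), 66);
        println!("compacted 100 slots -> {} live entries", live.len());
    }
}
```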

Key types

Integers, [u8; N], String, and Vec<u8> implement Key out of the box. Implement the Key trait for custom types.

Configuration

use scry_index::{Config, LearnedMap};

let config = Config::new()
    .expansion_factor(2.5)
    .rebuild_depth_threshold(10)
    .tombstone_ratio_threshold(0.3);

let map: LearnedMap<u64, String> = LearnedMap::with_config(config);

See Config docs for defaults and details.

Minimum supported Rust version

1.83.0

License

MIT OR Apache-2.0