fastskip

Lock-free arena-backed skip list for LSM-tree memtables. Designed for high-throughput write workloads in storage engines like RocksDB, LevelDB, and CockroachDB.

Why fastskip?

Feature	Description
Lock-free writes	Multiple threads insert/delete concurrently via single CAS at level 0
Lock-free reads	Point lookups walk the skip list without any locks
O(1) allocation	Per-thread arena shards — zero-contention bump allocation
Snapshot isolation	Point-in-time snapshots remain consistent under concurrent writes
Range cursors	Cursor with lower-bound seek for prefix/range iteration
Seal/freeze lifecycle	`seal()` freezes memtable for flushing, returns fresh one
Bulk deallocation	Dropping memtable reclaims all arena memory at once
Auto-seal thresholds	Automatic sealing based on memory usage or entry count
Backpressure detection	Detect when nearing capacity limits
Batch operations	Efficient bulk insert and multi-get operations

Quick Start

Basic Operations

use fastskip::ConcurrentSkipList;

let sl = ConcurrentSkipList::new();

sl.insert(b"user:1001", b"alice");
sl.insert(b"user:1002", b"bob");

let (val, tombstone) = sl.get(b"user:1001").unwrap();
assert_eq!(val, b"alice");
assert!(!tombstone);

With Auto-Seal Thresholds

use fastskip::ConcurrentSkipList;

// Create with auto-seal at 1MB memory or 100k entries
let sl = ConcurrentSkipList::with_capacity_and_shards(
    64 * 1024,      // 64KB initial arena per shard
    4,              // 4 shards (should match writer thread count)
    1 * 1024 * 1024, // Auto-seal at 1MB total memory
    100_000         // Auto-seal at 100k entries
);

// Check if nearing limits
if !sl.is_under_backpressure() {
    // Safe to insert batch
    let batch = vec![(b"key1", b"val1"), (b"key2", b"val2")];
    sl.insert_batch(&batch).unwrap();
}

// Get memory statistics
println!("Used: {} bytes", sl.memory_usage());
println!("Utilization: {:.1}%", sl.memory_utilization() * 100.0);
println!("Avg key size: {} bytes", sl.avg_key_size() as usize);

Delete / Tombstones

Delete marks a key with a tombstone. Deleted keys remain in the skip list until flushed to SSTable.

use fastskip::ConcurrentSkipList;

let sl = ConcurrentSkipList::new();
sl.insert(b"key", b"value");

// Delete returns true if key was found
assert!(sl.delete(b"key"));

// get still returns the key, but tombstone = true
let (_, tombstone) = sl.get(b"key").unwrap();
assert!(tombstone);

// get_live filters out tombstones
assert_eq!(sl.get_live(b"key"), None);

LSM Memtable Lifecycle

The intended lifecycle for an LSM-tree storage engine:

use fastskip::ConcurrentSkipList;

// 1. Create active memtable with auto-seal thresholds
let memtable = ConcurrentSkipList::with_capacity_and_shards(
    10 * 1024 * 1024, // 10 MB per shard
    4,                // 4 shards
    50 * 1024 * 1024, // Auto-seal at 50 MB total
    1_000_000         // Auto-seal at 1M entries
);

// 2. Concurrent writers insert/delete
memtable.insert(b"key1", b"val1");
memtable.delete(b"key2");

// 3. Check for backpressure before inserting batch
if !memtable.is_under_backpressure() {
    let batch = vec![(b"key3", b"val3"), (b"key4", b"val4")];
    memtable.insert_batch(&batch).unwrap();
}

// 4. When full (auto-sealed or manual), seal it — returns frozen (for flush) + fresh (for writes)
let (frozen, fresh) = memtable.seal().unwrap();

// 5. Flush frozen to SSTable (iterate snapshot)
for entry in frozen.iter() {
    if entry.is_tombstone {
        // write tombstone marker to SSTable
    } else {
        // write key-value to SSTable
    }
}

// 6. Drop frozen (reclaims arena memory)
std::mem::drop(frozen);

// 7. Fresh memtable ready for new writes
fresh.insert(b"key5", b"val5");

// Monitor memory usage
println!("Memory usage: {} bytes", fresh.memory_usage());
println!("Utilization: {:.1}%", fresh.memory_utilization() * 100.0);

Snapshots — Point-in-Time Iteration

Readers are fully lock-free. For consistent reads under concurrent writes, use a snapshot:

use fastskip::ConcurrentSkipList;

let sl = ConcurrentSkipList::new();
sl.insert(b"a", b"1");
sl.insert(b"b", b"2");

// Snapshot captures sequence number — skips post-snapshot inserts
let snap = sl.snapshot();

// Insert more after snapshot (won't appear in snapshot)
sl.insert(b"c", b"3");

// Snapshot sees only "a" and "b"
assert_eq!(snap.iter().count(), 2);
// Live iterator sees all three
assert_eq!(sl.iter().count(), 3);

Range Scans

use fastskip::ConcurrentSkipList;

let sl = ConcurrentSkipList::new();
sl.insert(b"apple", b"1");
sl.insert(b"banana", b"2");
sl.insert(b"cherry", b"3");
sl.insert(b"date", b"4");

// Seek to first key >= "banana"
let mut cursor = sl.cursor_at(b"banana").unwrap();
while let Some(entry) = cursor.next_entry() {
    println!("{:?}", entry.key);
}
// Output: banana, cherry, date

Memory Stats

use fastskip::ConcurrentSkipList;

let sl = ConcurrentSkipList::new();
sl.insert(b"key", b"value");

println!("allocated: {} bytes", sl.memory_usage());
println!("reserved: {} bytes", sl.memory_reserved());
println!("utilization: {:.1}%", sl.memory_utilization() * 100.0);
println!("idle: {} bytes", sl.memory_idle());
println!("avg key size: {} bytes", sl.avg_key_size() as usize);
println!("avg value size: {} bytes", sl.avg_value_size() as usize);
println!("total insert attempts: {}", sl.total_inserts());
println!("under backpressure: {}", sl.is_under_backpressure());

Use Cases

LSM-tree memtables — RocksDB, LevelDB, CockroachDB style storage engines
Write-heavy key-value stores — high-throughput ingestion pipelines
Time-series append — sequential inserts, periodic flush to disk
Request-scoped caches — per-request memtable, bulk reclaim on completion

Performance

Benchmarks on AMD Ryzen 5 3600 (6-core, 3.6GHz). All numbers from cargo bench.

Concurrent writes (10K entries/thread, 64-byte keys)

Threads	Throughput	Latency/insert
4	9.4M ops/s	106ns
8	7.8M ops/s	128ns

Single-threaded latency

Operation	fastskip	BTreeMap	HashMap
insert (seq, 10K)	136µs	1.37ms	171µs
insert (rand, 10K)	2.2ms	1.5ms	212µs
get hit	80ns	43ns	20ns
get miss	38ns	91ns	16ns
cursor seek (1K)	36ns	397ns	—
cursor seek (10K)	60ns	5µs	—

fastskip trades single-threaded speed for lock-free concurrent writes — BTreeMap and HashMap require external locking for multi-threaded access, which destroys throughput.

v0.1.1 Optimizations

27% faster concurrent inserts (4 threads: 14.5ms → 10.6ms)
20% faster sequential inserts (10K: 1.7ms → 1.36ms)
21% faster random inserts (10K: 2.8ms → 2.2ms)
33% faster cursor seek (1K: 54ns → 36ns)
19% faster seal+iterate (1K: 155µs → 126µs)
O(1) memory_usage() — atomic running total instead of iterating all shards
Conditional allocation tracking — zero overhead when memory limits aren't configured
Bulk u64 header writes in init_node() (3 stores vs 8 individual stores)
Adaptive backoff on CAS failure reduces cache-line bouncing under contention
Fat LTO + strip in release profile for maximum codegen quality

See USAGE.md for complete API reference.

Configuration

use fastskip::ConcurrentSkipList;

// Default: CPU count shards, 64KB initial arena per shard
let sl = ConcurrentSkipList::new();

// Custom shard count
let sl = ConcurrentSkipList::with_shards(8);

// Custom capacity and shards
let sl = ConcurrentSkipList::with_capacity_and_shards(1024 * 1024, 8);

// With auto-seal thresholds
let sl = ConcurrentSkipList::with_capacity_and_shards(
    1024 * 1024,    // 1MB initial arena per shard
    8,              // 8 shards
    100 * 1024 * 1024, // Auto-seal at 100MB total memory
    10_000_000      // Auto-seal at 10M entries
);

Thread Safety

ConcurrentSkipList is Send + Sync — can be shared across threads:

use fastskip::ConcurrentSkipList;
use std::sync::Arc;
use std::thread;

let sl = Arc::new(ConcurrentSkipList::new());

let sl1 = sl.clone();
let t1 = thread::spawn(move || {
    for i in 0..1000 {
        sl1.insert(format!("key:{}", i).as_bytes(), b"value");
    }
});

let sl2 = sl.clone();
let t2 = thread::spawn(move || {
    for i in 1000..2000 {
        sl2.insert(format!("key:{}", i).as_bytes(), b"value");
    }
});

t1.join().unwrap();
t2.join().unwrap();

assert_eq!(sl.len(), 2000);

Design

Property	Detail
Algorithm	Lock-free skip list (RocksDB InlineSkipList / CockroachDB arenaskl)
Concurrency	Single CAS at level 0, best-effort CAS at upper levels
Memory	Per-thread arena shards, no per-node free
Ordering	Lexicographic byte ordering
Duplicates	Rejected (first value wins)
Delete	Tombstone (entry stays, marked deleted)
Re-insert after delete	Seal memtable, write to fresh one
Auto-seal	Size or count based thresholds
Backpressure	90% threshold warning
Batch ops	insert_batch, get_many

Testing

cargo test              # Unit, concurrent, stress, correctness tests
cargo clippy            # Zero warnings
cargo run --example basic
cargo run --example lsm_memtable

Documentation

See USAGE.md for complete API reference and examples.

fastskip 0.1.1