Skip to main content

Module version_set

Module version_set 

Source
Expand description

SuperVersion Metadata + Copy-on-Write Version Set

This module implements RocksDB-style SuperVersion metadata management for near lock-free reads. The key insight is that read paths only need a consistent snapshot of metadata (memtable, immutable memtables, SSTable levels), not exclusive access to the underlying data structures.

§Problem Analysis

Previous implementation suffered from lock contention:

  • Foreground reads must traverse memtable (RwLock) → immutables → levels
  • Each level switch may acquire additional locks
  • Compaction/flush modifies metadata while readers hold references
  • Lock-order complexity leads to deadlock potential

§Solution: SuperVersion + ArcSwap

┌─────────────────────────────────────────────────────────────────┐
│                       VersionSet                                 │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ current: ArcSwap<SuperVersion>  ◄─── Atomic swap (O(1))     ││
│  └─────────────────────────────────────────────────────────────┘│
│                            │                                     │
│                            ▼                                     │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    SuperVersion                              ││
│  │  ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐  ││
│  │  │ memtable    │  │ immutables   │  │ level_files       │  ││
│  │  │ Arc<...>    │  │ Arc<Vec<..>> │  │ Arc<Vec<Level>>   │  ││
│  │  └─────────────┘  └──────────────┘  └───────────────────┘  ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

§Read Path (Lock-Free)

  1. Load current SuperVersion via atomic: let sv = version_set.get(); // O(1)
  2. Search memtable → immutables → levels using sv references
  3. Release sv when done (Arc drop)

No locks acquired. Readers are completely decoupled from writers.

§Write Path (Copy-on-Write)

  1. Clone current SuperVersion’s inner Arcs
  2. Modify the new copy (e.g., add new immutable memtable)
  3. Atomically swap new SuperVersion into place
  4. Old SuperVersion kept alive by existing readers (Arc)

§Complexity Analysis

OperationOld (with locks)New (SuperVersion)
Read acquireO(1) but contendedO(1) atomic load
Read releaseO(1) unlockO(1) Arc decrement
Metadata updateO(1) + lock waitO(changed_metadata) clone
Concurrent readsSerialized on RwLockTruly parallel

For 8 threads, expected speedup: ~6-8x on read-heavy workloads.

Structs§

BloomFilterHandle
Handle to a bloom filter (may be memory-mapped or in-memory)
FileMetadata
Metadata for an SSTable file
ImmutableMemTableRef
Reference to an immutable memtable (sealed, pending flush)
LevelMetadata
Metadata for a single level in the LSM tree
SuperVersion
SuperVersion: A consistent snapshot of all storage metadata
SuperVersionHandle
RAII handle for SuperVersion access
VersionSet
VersionSet manages the current SuperVersion and provides atomic updates

Traits§

ImmutableMemTable
Trait for immutable memtable operations