LSM-Tree Style Streaming Index Compaction
Implements a Log-Structured Merge-tree (LSM-tree) index for write-heavy
vector workloads. Writes are absorbed by an in-memory MemTable and
flushed into immutable, sorted Segments across tiered levels.
Compaction merges segments to bound read amplification.
LSM-trees turn random writes into sequential appends, which suits high-throughput ingestion, streaming embedding updates, and frequent deletes (handled with tombstones).
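The write path described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `SketchMemTable`, `SketchSegment`, `Entry`, and `compact` are hypothetical names, vectors are keyed by a `u64` id, and compaction is shown as a simple two-segment merge in which the newer segment wins.

```rust
use std::collections::BTreeMap;

/// A stored value: a live vector, or a tombstone recording a delete.
#[derive(Clone, Debug, PartialEq)]
enum Entry {
    Live(Vec<f32>),
    Tombstone,
}

/// Hypothetical in-memory write buffer (not the crate's actual `MemTable`).
#[derive(Default)]
struct SketchMemTable {
    entries: BTreeMap<u64, Entry>,
}

/// Hypothetical immutable sorted run produced by a flush.
struct SketchSegment {
    entries: Vec<(u64, Entry)>, // sorted by id, never mutated after creation
}

impl SketchMemTable {
    fn insert(&mut self, id: u64, vector: Vec<f32>) {
        self.entries.insert(id, Entry::Live(vector));
    }

    /// Deletes are writes too: they append a tombstone instead of
    /// touching older segments.
    fn delete(&mut self, id: u64) {
        self.entries.insert(id, Entry::Tombstone);
    }

    /// Drain the buffer into a sorted segment. `BTreeMap` iterates in key
    /// order, so the resulting run is already sorted.
    fn flush(&mut self) -> SketchSegment {
        let entries = std::mem::take(&mut self.entries).into_iter().collect();
        SketchSegment { entries }
    }
}

impl SketchSegment {
    /// Point lookup by binary search over the sorted run.
    fn get(&self, id: u64) -> Option<&Entry> {
        self.entries
            .binary_search_by_key(&id, |(k, _)| *k)
            .ok()
            .map(|i| &self.entries[i].1)
    }
}

/// Merge two segments into one. Entries from `newer` override `older`.
/// Tombstones are dropped only when `drop_tombstones` is set, which is
/// safe only if no older segment could still hold the deleted id.
fn compact(
    older: &SketchSegment,
    newer: &SketchSegment,
    drop_tombstones: bool,
) -> SketchSegment {
    let mut merged: BTreeMap<u64, Entry> = BTreeMap::new();
    for (k, v) in older.entries.iter().chain(newer.entries.iter()) {
        merged.insert(*k, v.clone());
    }
    let entries = merged
        .into_iter()
        .filter(|(_, v)| !(drop_tombstones && *v == Entry::Tombstone))
        .collect();
    SketchSegment { entries }
}
```

After compaction a read consults one segment instead of two, which is how merging bounds read amplification. A production merge would stream both sorted runs rather than rebuild a map; the map is used here only to keep the sketch short.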
Structs
- BloomFilter - Probabilistic set using double-hashing: h_i(x) = h1(x) + i * h2(x).
- CompactionConfig - Configuration for the LSM-tree index.
- LSMIndex - Write-optimised vector index using LSM-tree tiered compaction.
- LSMStats - Runtime statistics.
- MemTable - In-memory sorted write buffer backed by BTreeMap.
- Segment - Immutable sorted run with bloom filter for point lookups.
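To illustrate the double-hashing formula h_i(x) = h1(x) + i * h2(x), here is a minimal sketch of a Bloom filter that derives all probe positions from two base hashes. The type `SketchBloom` and its methods are hypothetical and do not reflect the crate's `BloomFilter` API. Deriving h2 from a salted `DefaultHasher` and forcing it to be odd are assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical Bloom filter using double hashing.
struct SketchBloom {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl SketchBloom {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        Self { bits: vec![false; num_bits], num_hashes }
    }

    /// Compute the two base hashes h1 and h2 for an item.
    fn base_hashes<T: Hash>(item: &T) -> (u64, u64) {
        let mut a = DefaultHasher::new();
        item.hash(&mut a);
        let h1 = a.finish();

        // Assumption: an arbitrary salt makes the second hash differ
        // from the first.
        let mut b = DefaultHasher::new();
        0x9e37_79b9_7f4a_7c15u64.hash(&mut b);
        item.hash(&mut b);
        // Force h2 odd so the stride is never zero.
        let h2 = b.finish() | 1;
        (h1, h2)
    }

    /// Probe positions: (h1 + i * h2) mod m for i in 0..k.
    fn positions<T: Hash>(&self, item: &T) -> Vec<usize> {
        let (h1, h2) = Self::base_hashes(item);
        let m = self.bits.len() as u64;
        (0..self.num_hashes)
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % m) as usize)
            .collect()
    }

    fn insert<T: Hash>(&mut self, item: &T) {
        for p in self.positions(item) {
            self.bits[p] = true;
        }
    }

    /// `false` means definitely absent; `true` means possibly present.
    fn may_contain<T: Hash>(&self, item: &T) -> bool {
        self.positions(item).into_iter().all(|p| self.bits[p])
    }
}
```

In a segment, a filter like this lets a point lookup skip the sorted run entirely when `may_contain` returns `false`, at the cost of occasional false positives that fall through to the actual search.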