Expand description
Streaming updates for vector indices.
§The Problem
Real-world vector databases face continuous updates:
- Documents added as they’re created
- Documents deleted for GDPR/legal compliance
- Embeddings updated when models change
Naive approaches rebuild the entire index, which is expensive. This module provides efficient streaming update primitives.
§Architecture
Stream of Updates
│
▼
┌──────────────┐
│ StreamBuffer │ ◄── Batches small updates
└──────┬───────┘
│
▼ (periodic merge)
┌──────────────┐
│ Main Index │ ◄── HNSW, DiskANN, etc.
└──────────────┘§Update Types
| Type | Complexity | Notes |
|---|---|---|
| Insert | O(log n) amortized | Buffered, merged periodically |
| Delete | O(degree) | Mark-and-repair (IP-DiskANN pattern) |
| Update | Delete + Insert | Atomic replacement |
§Research Context
-
FreshDiskANN (Singh et al. 2021): StreamingMerge for efficient insert/delete
- Key insight: Maintain a small “fresh” index alongside main index
- Insertions go to fresh index; periodic merge into main
- Deletions: tombstone + lazy cleanup during merge
-
IP-DiskANN (2025): In-neighbor tracking for O(1) delete preparation
- Each node tracks its in-neighbors (who points to it)
- Delete: O(degree) edge repairs instead of O(n) search
-
SPFresh (2024): Streaming vector search with LIRE (Lazy In-place REfresh)
- Lazy updates: defer graph repairs until node is visited
- Achieves 2-5x throughput over eager updates
-
LSM-VEC (2025): LSM-tree approach for streaming vectors
- Multiple levels of indices, periodic compaction
- Write-optimized: O(1) amortized insert
§Our Approach
We implement a hybrid:
- Write buffer: Absorbs burst writes (like LSM memtable)
- Tombstone deletes: Mark-and-sweep during compaction
- Merged search: Query both buffer and main index
- Periodic compaction: Merge buffer into main index
§Example
ⓘ
use vicinity::streaming::StreamingCoordinator;
use vicinity::hnsw::{InPlaceIndex, InPlaceConfig};
let index = InPlaceIndex::new(128, InPlaceConfig::default());
let mut coord = StreamingCoordinator::new(index);
coord.insert(0, vec![0.1; 128])?;
coord.insert(1, vec![0.2; 128])?;
coord.delete(0)?;
let results = coord.search(&vec![0.15; 128], 10)?;
coord.compact()?;Re-exports§
Modules§
- lsm
- LSM-tiered streaming vector index.
Structs§
- Stream
Buffer - A buffer for streaming vector updates.
- Stream
Buffer Config - Configuration for the stream buffer.
- Streaming
Config - Configuration for streaming coordinator.
- Streaming
Coordinator - Coordinator that wraps an index with streaming update support.
- Streaming
Stats - Statistics about streaming index state.
- Update
Stats - Statistics from applying updates.
Traits§
- Index
Ops - Trait for index operations needed by streaming coordinator.