Expand description
Streaming updates for vector indices.
§The Problem
Real-world vector databases face continuous updates:
- Documents added as they’re created
- Documents deleted for GDPR/legal compliance
- Embeddings updated when models change
Naive approaches rebuild the entire index, which is expensive. This module provides efficient streaming update primitives.
§Architecture
Stream of Updates
│
▼
┌──────────────┐
│ StreamBuffer │ ◄── Batches small updates
└──────┬───────┘
│
▼ (periodic merge)
┌──────────────┐
│ Main Index │ ◄── HNSW, DiskANN, etc.
└──────────────┘§Update Types
| Type | Complexity | Notes |
|---|---|---|
| Insert | O(log n) amortized | Buffered, merged periodically |
| Delete | O(degree) | Mark-and-repair (IP-DiskANN pattern) |
| Update | Delete + Insert | Atomic replacement |
§Research Context
-
FreshDiskANN (Singh et al. 2021): StreamingMerge for efficient insert/delete
- Key insight: Maintain a small “fresh” index alongside main index
- Insertions go to fresh index; periodic merge into main
- Deletions: tombstone + lazy cleanup during merge
-
IP-DiskANN (2025): In-neighbor tracking for O(1) delete preparation
- Each node tracks its in-neighbors (who points to it)
- Delete: O(degree) edge repairs instead of O(n) search
-
SPFresh (2024): Streaming vector search with LIRE (Lazy In-place REfresh)
- Lazy updates: defer graph repairs until node is visited
- Achieves 2-5x throughput over eager updates
-
LSM-VEC (2025): LSM-tree approach for streaming vectors
- Multiple levels of indices, periodic compaction
- Write-optimized: O(1) amortized insert
§Our Approach
We implement a hybrid:
- Write buffer: Absorbs burst writes (like LSM memtable)
- Tombstone deletes: Mark-and-sweep during compaction
- Merged search: Query both buffer and main index
- Periodic compaction: Merge buffer into main index
§Example
ⓘ
use vicinity::streaming::{StreamingIndex, UpdateOp};
use vicinity::hnsw::HNSWIndex;
let mut index = StreamingIndex::new(HNSWIndex::new(128)?);
// Stream of updates
index.apply(UpdateOp::Insert { id: 0, vector: vec![0.1; 128] })?;
index.apply(UpdateOp::Insert { id: 1, vector: vec![0.2; 128] })?;
index.apply(UpdateOp::Delete { id: 0 })?;
// Search works across buffered and merged data
let results = index.search(&query, 10)?;
// Periodic compaction
index.compact()?;Structs§
- Stream
Buffer - A buffer for streaming vector updates.
- Stream
Buffer Config - Configuration for the stream buffer.
- Streaming
Config - Configuration for streaming coordinator.
- Streaming
Coordinator - Coordinator that wraps an index with streaming update support.
- Streaming
Stats - Statistics about streaming index state.
- Update
Batch - A batch of update operations.
- Update
Stats - Statistics from applying updates.
Enums§
- Update
Op - A single update operation.
Traits§
- Index
Ops - Trait for index operations needed by streaming coordinator.
- Streaming
Index - Trait for indices that support streaming updates.