Skip to main content

Module streaming

Module streaming 

Source
Expand description

Streaming updates for vector indices.

§The Problem

Real-world vector databases face continuous updates:

  • Documents added as they’re created
  • Documents deleted for GDPR/legal compliance
  • Embeddings updated when models change

Naive approaches rebuild the entire index, which is expensive. This module provides efficient streaming update primitives.

§Architecture

Stream of Updates
    │
    ▼
┌──────────────┐
│ StreamBuffer │ ◄── Batches small updates
└──────┬───────┘
       │
       ▼ (periodic merge)
┌──────────────┐
│  Main Index  │ ◄── HNSW, DiskANN, etc.
└──────────────┘

§Update Types

TypeComplexityNotes
InsertO(log n) amortizedBuffered, merged periodically
DeleteO(degree)Mark-and-repair (IP-DiskANN pattern)
UpdateDelete + InsertAtomic replacement

§Research Context

  • FreshDiskANN (Singh et al. 2021): StreamingMerge for efficient insert/delete

    • Key insight: Maintain a small “fresh” index alongside main index
    • Insertions go to fresh index; periodic merge into main
    • Deletions: tombstone + lazy cleanup during merge
  • IP-DiskANN (2025): In-neighbor tracking for O(1) delete preparation

    • Each node tracks its in-neighbors (who points to it)
    • Delete: O(degree) edge repairs instead of O(n) search
  • SPFresh (2024): Streaming vector search with LIRE (Lazy In-place REfresh)

    • Lazy updates: defer graph repairs until node is visited
    • Achieves 2-5x throughput over eager updates
  • LSM-VEC (2025): LSM-tree approach for streaming vectors

    • Multiple levels of indices, periodic compaction
    • Write-optimized: O(1) amortized insert

§Our Approach

We implement a hybrid:

  1. Write buffer: Absorbs burst writes (like LSM memtable)
  2. Tombstone deletes: Mark-and-sweep during compaction
  3. Merged search: Query both buffer and main index
  4. Periodic compaction: Merge buffer into main index

§Example

use vicinity::streaming::{StreamingIndex, UpdateOp};
use vicinity::hnsw::HNSWIndex;

let mut index = StreamingIndex::new(HNSWIndex::new(128)?);

// Stream of updates
index.apply(UpdateOp::Insert { id: 0, vector: vec![0.1; 128] })?;
index.apply(UpdateOp::Insert { id: 1, vector: vec![0.2; 128] })?;
index.apply(UpdateOp::Delete { id: 0 })?;

// Search works across buffered and merged data
let results = index.search(&query, 10)?;

// Periodic compaction
index.compact()?;

Structs§

StreamBuffer
A buffer for streaming vector updates.
StreamBufferConfig
Configuration for the stream buffer.
StreamingConfig
Configuration for streaming coordinator.
StreamingCoordinator
Coordinator that wraps an index with streaming update support.
StreamingStats
Statistics about streaming index state.
UpdateBatch
A batch of update operations.
UpdateStats
Statistics from applying updates.

Enums§

UpdateOp
A single update operation.

Traits§

IndexOps
Trait for index operations needed by streaming coordinator.
StreamingIndex
Trait for indices that support streaming updates.