Lazy BPS/RDF/Rerank Construction (Build-on-First-Query)
Deferred index construction using OnceLock for on-demand building, eliminating upfront build cost for indexes that may never be queried.
§Problem
Eager index building at seal time:
- BPS construction: O(N × D) - ~50ms for 10K vectors
- RDF quantization: O(N × D) - ~30ms for 10K vectors
- Graph building: O(N × ef × log N) - ~200ms for 10K vectors
- All of this blocks the seal path, even if the segment is rarely queried
§Solution
Lazy construction with OnceLock:
- Seal only writes raw vectors (fast)
- Index structures built on first query
- Subsequent queries use cached index
- Background pre-warming optional
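The bullets above can be sketched with `std::sync::OnceLock` from the standard library. This is a minimal illustration, not the crate's implementation: `ToyIndex`, `ToyLazySegment`, and `prewarm` are hypothetical names, and the "index" is only a vector of cached squared norms standing in for the real BPS/RDF/graph structures.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the real index structures: cached squared norms.
struct ToyIndex {
    sq_norms: Vec<f32>,
}

/// Hypothetical segment: raw vectors plus an index slot that starts empty.
struct ToyLazySegment {
    vectors: Vec<Vec<f32>>,
    index: OnceLock<ToyIndex>,
    builds: AtomicUsize, // how many times the build closure has run
}

impl ToyLazySegment {
    /// "Seal": keep the raw vectors only; no index work happens here.
    fn new(vectors: Vec<Vec<f32>>) -> Self {
        Self { vectors, index: OnceLock::new(), builds: AtomicUsize::new(0) }
    }

    /// Return the cached index, building it on the first call.
    /// Concurrent callers block until the single build finishes.
    fn index(&self) -> &ToyIndex {
        self.index.get_or_init(|| {
            self.builds.fetch_add(1, Ordering::Relaxed);
            let sq_norms: Vec<f32> = self
                .vectors
                .iter()
                .map(|v| v.iter().map(|x| x * x).sum::<f32>())
                .collect();
            ToyIndex { sq_norms }
        })
    }

    fn is_built(&self) -> bool {
        self.index.get().is_some()
    }

    fn build_count(&self) -> usize {
        self.builds.load(Ordering::Relaxed)
    }

    /// Ids of the `k` nearest vectors by squared Euclidean distance,
    /// computed as |v|^2 - 2 v.q + |q|^2 using the cached norms.
    fn search(&self, query: &[f32], k: usize) -> Vec<usize> {
        let index = self.index(); // the first call pays the build cost
        let q_norm: f32 = query.iter().map(|x| x * x).sum();
        let mut scored: Vec<(f32, usize)> = self
            .vectors
            .iter()
            .enumerate()
            .map(|(id, v)| {
                let dot: f32 = v.iter().zip(query).map(|(a, b)| a * b).sum();
                (index.sq_norms[id] - 2.0 * dot + q_norm, id)
            })
            .collect();
        scored.sort_by(|a, b| a.0.total_cmp(&b.0));
        scored.into_iter().take(k).map(|(_, id)| id).collect()
    }
}

/// Optional pre-warming: build the index on a background thread.
fn prewarm(segment: Arc<ToyLazySegment>) -> JoinHandle<()> {
    thread::spawn(move || {
        segment.index();
    })
}
```

Because `get_or_init` runs its closure at most once, a query that arrives while a pre-warm thread is still building waits for that build instead of starting a second one.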
§Architecture
Segment State Machine:
┌──────────┐     ┌──────────────┐     ┌────────────────┐
│   Raw    │ ──► │ First Query  │ ──► │  Index Built   │
│ Vectors  │     │  (triggers)  │     │    (cached)    │
└──────────┘     └──────────────┘     └────────────────┘
                        │
                        ▼
                 ┌──────────────┐
                 │ Build Index  │
                 │  (one-time)  │
                 └──────────────┘

§Performance
| Metric | Eager Build | Lazy Build | Improvement |
|---|---|---|---|
| Seal latency | 280ms | 5ms | 56× |
| First query | 2ms | 285ms* | (deferred) |
| Subsequent | 2ms | 2ms | Same |
*One-time index build cost, paid by the first query and amortized over the segment's later queries
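To make the amortization concrete, the sketch below plugs the table's figures into a simple cost model. The function names are hypothetical and the model is an assumption: it takes the table's numbers at face value and ignores concurrency and pre-warming.

```rust
// Figures in milliseconds, copied from the table above (10K vectors).
const EAGER_SEAL_MS: u64 = 280;
const LAZY_SEAL_MS: u64 = 5;
const LAZY_FIRST_QUERY_MS: u64 = 285;
const STEADY_QUERY_MS: u64 = 2;

/// Total time for a segment that is sealed eagerly and then queried `n` times.
fn eager_total_ms(n: u64) -> u64 {
    EAGER_SEAL_MS + STEADY_QUERY_MS * n
}

/// Same for lazy sealing: the build cost appears only if a query arrives.
fn lazy_total_ms(n: u64) -> u64 {
    match n {
        0 => LAZY_SEAL_MS,
        _ => LAZY_SEAL_MS + LAZY_FIRST_QUERY_MS + STEADY_QUERY_MS * (n - 1),
    }
}

/// Mean per-query latency under lazy construction over `n >= 1` queries.
fn lazy_mean_query_ms(n: u64) -> f64 {
    assert!(n >= 1, "need at least one query");
    (LAZY_FIRST_QUERY_MS + STEADY_QUERY_MS * (n - 1)) as f64 / n as f64
}
```

Under these figures a never-queried segment costs 5ms instead of 280ms, while a queried segment costs about the same in total either way (290ms vs 282ms for one query). The gain is that the build moves off the seal path, and the mean query latency approaches 2ms as queries accumulate (about 2.28ms after 1,000 queries).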
§Usage
use sochdb_vector::lazy_segment::{LazySegment, LazyConfig};

let config = LazyConfig::default();
let segment = LazySegment::new(vectors, config);
// Seal is fast (~5ms) - no index built yet

// First search triggers the index build
let results = segment.search(&query, k); // ~285ms (one-time build)

// Subsequent searches are fast
let results = segment.search(&query, k); // ~2ms

Structs§
- BuildStats - Build timing statistics
- IndexStatus - Index build status
- LazyConfig - Configuration for lazy segment
- LazySegment - Segment with lazy index construction
- LazySegmentManager - Manager for multiple lazy segments
Type Aliases§
- VectorKey - Vector key type