Module lazy_segment

Lazy BPS/RDF/Rerank Construction (Build-on-First-Query)

Deferred index construction using OnceLock for on-demand building, eliminating upfront build cost for indexes that may never be queried.

§Problem

Eager index building at seal time:

  • BPS construction: O(N × D) - ~50ms for 10K vectors
  • RDF quantization: O(N × D) - ~30ms for 10K vectors
  • Graph building: O(N × ef × log N) - ~200ms for 10K vectors
  • All of this blocks the seal path even if the segment is rarely queried

§Solution

Lazy construction with OnceLock:

  • Seal only writes raw vectors (fast)
  • Index structures built on first query
  • Subsequent queries use cached index
  • Background pre-warming is optional
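
The build-on-first-query idea above can be sketched with `std::sync::OnceLock` alone. This is a minimal illustration, not this module's implementation: `ToySegment`, `ToyIndex`, and the `builds` counter are hypothetical names, and a table of squared norms stands in for the real BPS/RDF/graph structures.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the built index: precomputed squared norms.
struct ToyIndex {
    sq_norms: Vec<f32>,
}

/// Hypothetical segment type; NOT the crate's `LazySegment`.
struct ToySegment {
    vectors: Vec<Vec<f32>>,
    index: OnceLock<ToyIndex>,
    builds: AtomicUsize, // counts how often the build closure ran
}

impl ToySegment {
    /// "Seal": keep the raw vectors only, do no index work.
    fn new(vectors: Vec<Vec<f32>>) -> Self {
        Self { vectors, index: OnceLock::new(), builds: AtomicUsize::new(0) }
    }

    /// True once the index has been built and cached.
    fn is_built(&self) -> bool {
        self.index.get().is_some()
    }

    /// Returns the cached index, building it on the first call.
    fn index(&self) -> &ToyIndex {
        self.index.get_or_init(|| {
            self.builds.fetch_add(1, Ordering::Relaxed);
            let sq_norms: Vec<f32> = self
                .vectors
                .iter()
                .map(|v| v.iter().map(|x| x * x).sum::<f32>())
                .collect();
            ToyIndex { sq_norms }
        })
    }

    /// Brute-force k-nearest by L2 distance, ranking on |v|^2 - 2 v.q
    /// (the |q|^2 term is the same for every vector and is dropped).
    fn search(&self, query: &[f32], k: usize) -> Vec<usize> {
        let index = self.index(); // first call pays the build cost
        let mut scored: Vec<(usize, f32)> = self
            .vectors
            .iter()
            .enumerate()
            .map(|(i, v)| {
                let dot: f32 = v.iter().zip(query).map(|(a, b)| a * b).sum();
                (i, index.sq_norms[i] - 2.0 * dot)
            })
            .collect();
        scored.sort_by(|a, b| a.1.total_cmp(&b.1));
        scored.into_iter().take(k).map(|(i, _)| i).collect()
    }
}
```

In this sketch `new` does no index work, the first `search` runs the build closure, and later searches reuse the cached value, so `builds` stays at 1.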

§Architecture

Segment State Machine:

┌──────────┐     ┌──────────────┐     ┌────────────────┐
│  Raw     │ ──► │  First Query │ ──► │  Index Built   │
│  Vectors │     │  (triggers)  │     │  (cached)      │
└──────────┘     └──────────────┘     └────────────────┘
                        │
                        ▼
                 ┌──────────────┐
                 │  Build Index │
                 │  (one-time)  │
                 └──────────────┘
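
The "one-time" transition in the diagram also has to hold when several first queries arrive at once. The sketch below shows the guarantee `OnceLock::get_or_init` gives in that case: only one initializer runs, and the other callers wait for its result. The function name `racing_first_queries` and the placeholder `Vec<u32>` index are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Spawns `threads` concurrent "first queries" against one lazily built
/// index and returns how many times the build closure actually ran.
fn racing_first_queries(threads: usize) -> usize {
    let builds = AtomicUsize::new(0);
    let index: OnceLock<Vec<u32>> = OnceLock::new();

    // Scoped threads may borrow `index` and `builds` from this stack frame.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                let idx = index.get_or_init(|| {
                    builds.fetch_add(1, Ordering::SeqCst);
                    vec![1, 2, 3] // placeholder for the real build work
                });
                assert_eq!(idx.len(), 3); // every thread sees the same index
            });
        }
    });

    builds.load(Ordering::SeqCst)
}
```

Callers that lose the race block until the winner finishes, so under this pattern a burst of first queries would each observe roughly the full build latency once.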

§Performance

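| Metric       | Eager Build | Lazy Build | Improvement |
|--------------|-------------|------------|-------------|
| Seal latency | 280ms       | 5ms        | 56×         |
| First query  | 2ms         | 285ms*     | (deferred)  |
| Subsequent   | 2ms         | 2ms        | Same        |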

*One-time cost, amortized over query lifetime
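
As a rough worked example of the amortization, using only the figures quoted above and assuming they hold for a segment that receives $n \ge 1$ queries:

```latex
\begin{align*}
T_{\text{eager}}(n) &= 280 + 2n \ \text{ms} \\
T_{\text{lazy}}(n)  &= 5 + 285 + 2(n - 1) = 288 + 2n \ \text{ms} \\
\bar{t}_{\text{lazy}}(n) &= \frac{285 + 2(n - 1)}{n} = 2 + \frac{283}{n} \ \text{ms per query}
\end{align*}
```

Under these numbers a queried segment costs about 8ms more in total with lazy construction, and the mean query latency approaches 2ms as $n$ grows (for example, 4.83ms at $n = 100$). A segment that is never queried costs 5ms instead of 280ms.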

§Usage

use sochdb_vector::lazy_segment::{LazySegment, LazyConfig};

let config = LazyConfig::default();
let segment = LazySegment::new(vectors, config);

// Seal is instant - no index built yet

// First search triggers index build
let results = segment.search(&query, k);  // ~285ms (one-time index build)

// Subsequent searches are fast
let results = segment.search(&query, k);  // ~2ms
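
The usage above pays the build cost on the first search. The optional background pre-warming mentioned under Solution could move that cost off the query path. The following is a sketch under assumptions: `WarmableSegment` and `prewarm` are hypothetical names, a sorted copy of the data stands in for the real index, and this module's actual pre-warming API is not shown here.

```rust
use std::sync::{Arc, OnceLock};
use std::thread;

/// Hypothetical segment whose "index" is just a sorted copy of the raw data.
struct WarmableSegment {
    raw: Vec<f32>,
    sorted: OnceLock<Vec<f32>>,
}

impl WarmableSegment {
    fn new(raw: Vec<f32>) -> Self {
        Self { raw, sorted: OnceLock::new() }
    }

    /// Builds the index on first use, then returns the cached copy.
    fn index(&self) -> &Vec<f32> {
        self.sorted.get_or_init(|| {
            let mut v = self.raw.clone();
            v.sort_by(f32::total_cmp);
            v
        })
    }
}

/// Starts the build on a background thread. A query that arrives while the
/// build is in progress waits for it instead of starting a second build.
fn prewarm(seg: Arc<WarmableSegment>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        seg.index();
    })
}
```

Whether pre-warming pays off depends on how likely the segment is to be queried; for segments that are never queried it reintroduces the build cost that lazy construction was meant to avoid.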

Structs§

BuildStats
Build timing statistics
IndexStatus
Index build status
LazyConfig
Configuration for lazy segment
LazySegment
Segment with lazy index construction
LazySegmentManager
Manager for multiple lazy segments

Type Aliases§

VectorKey
Vector key type