
Module compressed_routing

§Cache-Resident Routing Layer (Task 4)

Keeps the routing working set small enough to fit in the last-level cache (LLC), which bounds latency variance.

§Architecture

Two-stage routing with compressed centroids:

  1. Coarse stage: score every centroid in a compressed representation (FP16 or int8) that fits in LLC
  2. Fine stage: re-rank the top candidates from the coarse stage, optionally in full precision
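
The two stages above can be sketched as follows. This is a minimal illustration, not the crate's `RoutingLayer`: the type `TwoStageRouter`, its fields, the single global int8 scale, and the squared-L2 metric are all assumptions made for the example, and every centroid is assumed to have the same dimension.

```rust
/// Hypothetical two-stage router: int8 codes for the coarse pass,
/// full-precision f32 rows for the refine pass.
pub struct TwoStageRouter {
    dim: usize,
    full: Vec<f32>, // C * dim values, row-major
    codes: Vec<i8>, // C * dim quantized values, row-major
    scale: f32,     // assumed: one symmetric scale shared by all centroids
}

/// Squared L2 distance between a query and one full-precision row.
fn l2_sq(row: &[f32], query: &[f32]) -> f32 {
    row.iter().zip(query).map(|(&c, &q)| (q - c) * (q - c)).sum()
}

impl TwoStageRouter {
    /// Quantize all centroids to int8 with a shared scale of max|x| / 127.
    pub fn build(centroids: &[Vec<f32>]) -> Self {
        let dim = centroids.first().map_or(0, |c| c.len());
        let full: Vec<f32> = centroids.iter().flatten().copied().collect();
        let max_abs = full.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
        let codes = full.iter().map(|v| (v / scale).round() as i8).collect();
        Self { dim, full, codes, scale }
    }

    /// Return the ids of the `n` nearest centroids, refining only the
    /// `refine_top_k` best candidates from the coarse pass.
    pub fn route(&self, query: &[f32], refine_top_k: usize, n: usize) -> Vec<usize> {
        if self.dim == 0 {
            return Vec::new();
        }
        // Coarse stage: scan every centroid using the dequantized int8 codes.
        let mut coarse: Vec<(f32, usize)> = self
            .codes
            .chunks_exact(self.dim)
            .enumerate()
            .map(|(i, row)| {
                let dist: f32 = row
                    .iter()
                    .zip(query)
                    .map(|(&c, &q)| {
                        let diff = q - f32::from(c) * self.scale;
                        diff * diff
                    })
                    .sum();
                (dist, i)
            })
            .collect();
        coarse.sort_by(|a, b| a.0.total_cmp(&b.0));
        coarse.truncate(refine_top_k);

        // Fine stage: re-rank the survivors against full-precision rows.
        let mut fine: Vec<(f32, usize)> = coarse
            .iter()
            .map(|&(_, i)| {
                let row = &self.full[i * self.dim..(i + 1) * self.dim];
                (l2_sq(row, query), i)
            })
            .collect();
        fine.sort_by(|a, b| a.0.total_cmp(&b.0));
        fine.truncate(n);
        fine.into_iter().map(|(_, i)| i).collect()
    }
}
```

In this sketch only the `codes` array is scanned for every query, so it is the part that needs to stay cache-resident; the `full` array is touched for at most `refine_top_k` rows.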

§Math/Algorithm

Cache complexity constraint: ensure routing working set W ≤ LLC_size

For C centroids of dimension d:

  • FP32: 4·C·d bytes (often exceeds LLC)
  • FP16: 2·C·d bytes (50% reduction vs. FP32)
  • Int8: C·d bytes (75% reduction vs. FP32)
  • PQ: C·(d/m)·1 bytes, i.e. one byte per sub-vector code, assuming m is the sub-vector width (even smaller for high-dimensional centroids)
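
As a worked check of the byte counts above, the following sketch computes each footprint and compares it with a cache budget. The names `Encoding`, `centroid_bytes`, and `fits_in_llc` are hypothetical and not part of this module. The PQ arm reads m as the sub-vector width (so d/m one-byte codes per centroid), which is an assumption. Codebooks and quantization scales are ignored.

```rust
/// Hypothetical enumeration of the encodings listed above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Encoding {
    Fp32,
    Fp16,
    Int8,
    /// Product quantization; `sub_dim` is the assumed sub-vector width m.
    Pq { sub_dim: usize },
}

/// Bytes needed to store `c` centroids of dimension `d`.
/// Ignores PQ codebooks and any per-centroid scale or offset.
pub fn centroid_bytes(enc: Encoding, c: usize, d: usize) -> usize {
    match enc {
        Encoding::Fp32 => 4 * c * d,
        Encoding::Fp16 => 2 * c * d,
        Encoding::Int8 => c * d,
        // One byte per sub-vector code, d / sub_dim codes per centroid.
        Encoding::Pq { sub_dim } => c * (d / sub_dim),
    }
}

/// The constraint W <= LLC_size for the centroid table alone.
pub fn fits_in_llc(enc: Encoding, c: usize, d: usize, llc_bytes: usize) -> bool {
    centroid_bytes(enc, c, d) <= llc_bytes
}
```

For example, with C = 65,536 and d = 128, FP32 needs 32 MiB, FP16 16 MiB, int8 8 MiB, and PQ with 8-wide sub-vectors 1 MiB. Against an assumed 16 MiB LLC, the FP32 table does not fit while the others do.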

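Multi-stage ranking cost: O(C·d_compressed + k·d_full), where the first term scores all C centroids in compressed form and the second re-ranks k candidates in full precision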
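
To see how the two terms of this bound compare, here is a small cost-model sketch. The function names are hypothetical, and treating the bound as an exact operation count with constant factor 1 is an assumption made only for illustration.

```rust
/// Approximate operation count for one routed query, taking the
/// O(C·d_compressed + k·d_full) bound with a constant factor of 1.
pub fn routing_ops(c: usize, d_compressed: usize, k: usize, d_full: usize) -> usize {
    c * d_compressed + k * d_full
}

/// Share of the total work spent in the full-precision refine step.
pub fn refine_fraction(c: usize, d_compressed: usize, k: usize, d_full: usize) -> f64 {
    let total = routing_ops(c, d_compressed, k, d_full);
    if total == 0 {
        return 0.0;
    }
    (k * d_full) as f64 / total as f64
}
```

With C = 65,536, d_compressed = d_full = 128, and k = 32, the coarse scan accounts for 8,388,608 of the 8,392,704 operations, so under this model the refine step is well under 0.1% of the work.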

§Usage

use sochdb_vector::compressed_routing::{RoutingLayer, RoutingConfig, CentroidCompression};

let config = RoutingConfig::default()
    .compression(CentroidCompression::Fp16)
    .refine_top_k(32);

let routing = RoutingLayer::build(&centroids, config);
let top_lists = routing.route(&query, 16);

Structs§

  • Fp16Centroids: FP16 encoded centroid storage
  • Int8Centroids: Int8 quantized centroid storage
  • ListCandidate: Candidate list from routing
  • RoutingConfig: Configuration for routing layer
  • RoutingLayer: Compressed routing layer for cache-resident operations
  • RoutingStats: Routing statistics

Enums§

  • CentroidCompression: Compression method for centroids