§Cache-Resident Routing Layer (Task 4)
Keeps the routing working set resident in the last-level cache (LLC) so that routing latency variance stays bounded.
§Architecture
Two-stage routing over compressed centroids:
- Coarse stage: score the query against all centroids in compressed form (FP16/int8), which is small enough to fit in LLC
- Fine stage: re-rank the top coarse candidates (optionally in full precision)
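The two stages above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Int8Table`, `quantize`, and `route` are hypothetical names, the coarse stage assumes symmetric per-centroid int8 quantization, and both stages assume squared L2 distance over a non-empty centroid set.

```rust
/// Hypothetical int8 centroid table (not the crate's `Int8Centroids`).
struct Int8Table {
    dim: usize,
    codes: Vec<i8>,   // C * dim quantized values, row-major
    scales: Vec<f32>, // one dequantization scale per centroid
}

/// Symmetric per-centroid quantization: value ≈ code * scale.
fn quantize(centroids: &[Vec<f32>]) -> Int8Table {
    let dim = centroids[0].len();
    let mut codes = Vec::with_capacity(centroids.len() * dim);
    let mut scales = Vec::with_capacity(centroids.len());
    for c in centroids {
        let max_abs = c.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
        scales.push(scale);
        codes.extend(c.iter().map(|v| (v / scale).round() as i8));
    }
    Int8Table { dim, codes, scales }
}

/// Squared L2 distance in full precision.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Coarse stage: approximate distance to every int8 centroid.
/// Fine stage: re-rank the best `refine_k` candidates in full precision
/// and return the indices of the closest `n`.
fn route(
    table: &Int8Table,
    full: &[Vec<f32>],
    query: &[f32],
    refine_k: usize,
    n: usize,
) -> Vec<usize> {
    let mut coarse: Vec<(f32, usize)> = table
        .codes
        .chunks(table.dim)
        .zip(&table.scales)
        .enumerate()
        .map(|(i, (code, &s))| {
            let d: f32 = code
                .iter()
                .zip(query)
                .map(|(&c8, &q)| {
                    let diff = c8 as f32 * s - q;
                    diff * diff
                })
                .sum();
            (d, i)
        })
        .collect();
    coarse.sort_by(|a, b| a.0.total_cmp(&b.0));
    coarse.truncate(refine_k);

    let mut fine: Vec<(f32, usize)> = coarse
        .iter()
        .map(|&(_, i)| (l2_sq(&full[i], query), i))
        .collect();
    fine.sort_by(|a, b| a.0.total_cmp(&b.0));
    fine.into_iter().take(n).map(|(_, i)| i).collect()
}
```

Only `codes` and `scales` are touched for every centroid; the full-precision vectors are read for at most `refine_k` candidates, which is what keeps the hot working set small.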
§Math/Algorithm
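Cache constraint: the routing working set W must satisfy W ≤ LLC_size.
For C centroids of dimension d, the centroid storage is:
- FP32: 4·C·d bytes (often exceeds LLC)
- FP16: 2·C·d bytes (50% reduction vs. FP32)
- Int8: C·d bytes (75% reduction vs. FP32)
- PQ: C·(d/m)·1 bytes (reading m as the sub-vector dimension, with one 1-byte code per sub-vector; smaller still for high-dimensional data)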
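As a worked check of these sizes, the sketch below evaluates each formula and tests it against an LLC budget. `Encoding`, `working_set_bytes`, and `fits_in_llc` are hypothetical helpers rather than part of the crate, and the PQ arm assumes that `m` is the sub-vector dimension, that it divides `d`, and that codebook storage is not counted.

```rust
/// Hypothetical encoding selector, used only for this size calculation.
#[derive(Clone, Copy)]
enum Encoding {
    Fp32,
    Fp16,
    Int8,
    /// Product quantization: `m` dimensions per sub-vector, one byte per code.
    Pq { m: usize },
}

/// Bytes needed to store `c` centroids of dimension `d`.
fn working_set_bytes(c: usize, d: usize, enc: Encoding) -> usize {
    match enc {
        Encoding::Fp32 => 4 * c * d,
        Encoding::Fp16 => 2 * c * d,
        Encoding::Int8 => c * d,
        // `m` must be nonzero; codebook storage is not included.
        Encoding::Pq { m } => c * (d / m),
    }
}

/// The constraint W ≤ LLC_size for a given encoding.
fn fits_in_llc(c: usize, d: usize, enc: Encoding, llc_bytes: usize) -> bool {
    working_set_bytes(c, d, enc) <= llc_bytes
}
```

For example, with C = 65,536 and d = 128, FP32 needs 32 MiB, FP16 16 MiB, int8 8 MiB, and PQ with m = 8 needs 1 MiB. Against an assumed 24 MiB LLC, FP32 does not fit while FP16 does.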
Multi-stage ranking cost: O(C·d_compressed + k·d_full), where k is the number of coarse candidates re-ranked in the fine stage.
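To make the two terms concrete, the hypothetical helper below counts scalar distance operations per stage. It is a rough cost model under the assumption of one operation per dimension per centroid, not a measurement of the crate.

```rust
/// Returns (coarse_ops, fine_ops) for C·d_compressed + k·d_full.
fn routing_ops(c: usize, d_compressed: usize, k: usize, d_full: usize) -> (usize, usize) {
    (c * d_compressed, k * d_full)
}
```

With C = 65,536, d_compressed = d_full = 128, and k = 32, the coarse stage costs 8,388,608 operations and the fine stage 4,096, so almost all of the work runs over the compressed, cache-resident data.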
§Usage
use sochdb_vector::compressed_routing::{RoutingLayer, RoutingConfig, CentroidCompression};

// `centroids` and `query` are assumed to be supplied by the caller.
let config = RoutingConfig::default()
    .compression(CentroidCompression::Fp16)
    .refine_top_k(32);
let routing = RoutingLayer::build(&centroids, config);
let top_lists = routing.route(&query, 16);
Structs§
- Fp16Centroids - FP16 encoded centroid storage
- Int8Centroids - Int8 quantized centroid storage
- ListCandidate - Candidate list from routing
- RoutingConfig - Configuration for routing layer
- RoutingLayer - Compressed routing layer for cache-resident operations
- RoutingStats - Routing statistics
Enums§
- CentroidCompression - Compression method for centroids