Expand description
Optimized Product Quantization (OPQ) — Non-Parametric OPQ via an iterative SVD-Procrustes rotation that minimizes PQ reconstruction error, typically yielding 10–20% recall improvement over vanilla PQ at equal memory.
§Algorithm
OPQ wraps standard PQ with a learned rotation matrix R (dim × dim,
row-major) applied before codebook training and at query time:

encode(v) = PQ_encode(R · v)
distance(q, v) = ADC(R · q, PQ_code(v))

§Non-Para OPQ (Ge et al., CVPR 2013)
The rotation is learned by alternating between two steps until convergence:
- Codebook step — hold R fixed, train PQ codebooks on the rotated
  training set R · X via Lloyd's k-means.
- Procrustes step — hold codebooks fixed, update R to minimize the
  Frobenius reconstruction error ‖R·X − reconstruct(quantize(R·X))‖_F
  via closed-form SVD:
  - Let Y = dequantized reconstruction of R·X (dim × N matrix).
  - Compute cross-correlation M = X · Yᵀ (dim × dim).
  - SVD: M = U · Σ · Vᵀ.
  - New rotation: R = V · Uᵀ.
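The Procrustes update admits a closed form: maximizing tr(R·M) over orthogonal R with M = U·Σ·Vᵀ gives R = V·Uᵀ. In 2-D the SVD collapses to a single rotation angle, which makes the step easy to sanity-check; a minimal sketch (the 2×2 case and the function name are assumptions for illustration, the crate operates on full dim × dim matrices):

```rust
/// Closed-form 2-D orthogonal Procrustes sketch: the pure rotation R
/// minimizing ‖R·X − Y‖_F, where each point is one column of X / Y.
/// With M = X · Yᵀ, tr(R(θ)·M) = cosθ·(m00 + m11) + sinθ·(m01 − m10),
/// so the optimal angle falls out of a single atan2.
fn procrustes_2d(x: &[[f32; 2]], y: &[[f32; 2]]) -> [[f32; 2]; 2] {
    // Cross-correlation M = X · Yᵀ, accumulated column by column.
    let mut m = [[0.0f32; 2]; 2];
    for (xc, yc) in x.iter().zip(y) {
        for i in 0..2 {
            for j in 0..2 {
                m[i][j] += xc[i] * yc[j];
            }
        }
    }
    // Maximize tr(R(θ)·M) over the rotation angle θ.
    let theta = (m[0][1] - m[1][0]).atan2(m[0][0] + m[1][1]);
    let (s, c) = theta.sin_cos();
    [[c, -s], [s, c]] // row-major R(θ)
}
```

Rotating a point set by a known angle and fitting recovers that angle, which is a convenient unit-test for the full SVD path as well.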
This alternation is repeated for opq_iters iterations (default 5).
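The codebook step is plain Lloyd's k-means on the rotated training set, run independently per PQ subspace. A minimal single-subspace sketch (the `Vec<Vec<f32>>` point layout and function name are assumptions, not the crate's API):

```rust
/// One Lloyd's k-means iteration: assign each point to its nearest
/// centroid by squared Euclidean distance, then recompute every
/// non-empty centroid as the mean of its assigned points.
fn lloyd_step(points: &[Vec<f32>], centroids: &mut [Vec<f32>]) {
    let dim = points[0].len();
    let k = centroids.len();
    let mut sums = vec![vec![0.0f32; dim]; k];
    let mut counts = vec![0usize; k];
    for p in points {
        // Assignment step: index of the nearest centroid.
        let (best, _) = centroids
            .iter()
            .enumerate()
            .map(|(i, c)| {
                let d: f32 = p.iter().zip(c).map(|(a, b)| (a - b) * (a - b)).sum();
                (i, d)
            })
            .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
            .unwrap();
        for (s, v) in sums[best].iter_mut().zip(p) {
            *s += v;
        }
        counts[best] += 1;
    }
    // Update step: empty clusters keep their previous centroid.
    for i in 0..k {
        if counts[i] > 0 {
            for (c, s) in centroids[i].iter_mut().zip(&sums[i]) {
                *c = s / counts[i] as f32;
            }
        }
    }
}
```

In the OPQ loop this runs on R·X each codebook step, so the codebooks track the current rotation.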
§Storage format
QuantMode::Pq is reused in UnifiedQuantizedVector headers — OPQ is
structurally PQ post-rotation and requires no new on-disk discriminant.
The rotation matrix is stored in OpqCodec and applied transparently.
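The transparent apply is a single row-major matrix–vector product in front of the usual PQ path; a sketch (the helper name is an assumption, the layout follows the row-major description above):

```rust
/// Apply a row-major dim × dim rotation to a vector: out = R · v.
/// This multiply is the only extra work OPQ adds over plain PQ at
/// encode time and once per query at search time.
fn rotate(r: &[f32], v: &[f32]) -> Vec<f32> {
    let dim = v.len();
    assert_eq!(r.len(), dim * dim, "R must be dim × dim, row-major");
    (0..dim)
        .map(|row| {
            // Dot product of row `row` of R with v.
            r[row * dim..(row + 1) * dim]
                .iter()
                .zip(v)
                .map(|(a, b)| a * b)
                .sum()
        })
        .collect()
}
```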
Structs§
- OpqCodec
- Optimized Product Quantization codec.
- OpqQuantized
- Quantized form returned by OpqCodec::encode.
- OpqQuery
- Prepared query: rotated vector + flat ADC distance table (M×K, row-major).
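With the table precomputed from the rotated query, each candidate's distance is just M table lookups; a sketch of the ADC scan (the function name is an assumption, the flat M×K row-major layout follows the description above):

```rust
/// Asymmetric distance computation over a flat M × K lookup table:
/// entry table[m * k + c] holds the precomputed partial distance from
/// the rotated query's m-th subvector to centroid c of subspace m,
/// so the distance to a code is a sum of M lookups.
fn adc_distance(table: &[f32], k: usize, code: &[u8]) -> f32 {
    code.iter()
        .enumerate()
        .map(|(m, &c)| table[m * k + c as usize])
        .sum()
}
```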