Expand description
Optimized Product Quantization (OPQ) — Non-Parametric OPQ via an iterative SVD-Procrustes rotation that minimizes PQ reconstruction error, typically yielding 10–20% recall improvement over vanilla PQ at equal memory.
§Algorithm
OPQ wraps standard PQ with a learned rotation matrix R (dim × dim,
row-major) applied before codebook training and at query time:

encode(v) = PQ_encode(R · v)
distance(q, v) = ADC(R · q, PQ_code(v))

§Non-Para OPQ (Ge et al., CVPR 2013)
The rotation is learned by alternating between two steps until convergence:
- Codebook step — hold R fixed, train PQ codebooks on the rotated
  training set R · X via Lloyd's k-means.
- Procrustes step — hold codebooks fixed, update R to minimize the
  Frobenius reconstruction error ‖R·X − reconstruct(quantize(R·X))‖_F
  via closed-form SVD:
  - Let Y = dequantized reconstruction of R·X (dim × N matrix).
  - Compute cross-correlation M = X · Yᵀ (dim × dim).
  - SVD: M = U · Σ · Vᵀ.
  - New rotation: R = V · Uᵀ.
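The Procrustes update admits a closed form: maximizing tr(R·M) over orthogonal R with M = U·Σ·Vᵀ gives R = V·Uᵀ. In 2-D the SVD collapses to a single rotation angle, which makes the step easy to sanity-check; a minimal sketch (the 2×2 case and the function name are assumptions for illustration, the crate operates on full dim × dim matrices):

```rust
/// Closed-form 2-D orthogonal Procrustes sketch: the pure rotation R
/// minimizing ‖R·X − Y‖_F, where each point is one column of X / Y.
/// With M = X · Yᵀ, tr(R(θ)·M) = cosθ·(m00 + m11) + sinθ·(m01 − m10),
/// so the optimal angle falls out of a single atan2.
fn procrustes_2d(x: &[[f32; 2]], y: &[[f32; 2]]) -> [[f32; 2]; 2] {
    // Cross-correlation M = X · Yᵀ, accumulated column by column.
    let mut m = [[0.0f32; 2]; 2];
    for (xc, yc) in x.iter().zip(y) {
        for i in 0..2 {
            for j in 0..2 {
                m[i][j] += xc[i] * yc[j];
            }
        }
    }
    // Maximize tr(R(θ)·M) over the rotation angle θ.
    let theta = (m[0][1] - m[1][0]).atan2(m[0][0] + m[1][1]);
    let (s, c) = theta.sin_cos();
    [[c, -s], [s, c]] // row-major R(θ)
}
```

Rotating a point set by a known angle and fitting recovers that angle, which is a convenient unit-test for the full SVD path as well.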
This alternation is repeated for opq_iters iterations (default 5).
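The codebook step is plain Lloyd's k-means on the rotated training set, run independently per PQ subspace. A minimal single-subspace sketch (the `Vec<Vec<f32>>` point layout and function name are assumptions, not the crate's API):

```rust
/// One Lloyd's k-means iteration: assign each point to its nearest
/// centroid by squared Euclidean distance, then recompute every
/// non-empty centroid as the mean of its assigned points.
fn lloyd_step(points: &[Vec<f32>], centroids: &mut [Vec<f32>]) {
    let dim = points[0].len();
    let k = centroids.len();
    let mut sums = vec![vec![0.0f32; dim]; k];
    let mut counts = vec![0usize; k];
    for p in points {
        // Assignment step: index of the nearest centroid.
        let (best, _) = centroids
            .iter()
            .enumerate()
            .map(|(i, c)| {
                let d: f32 = p.iter().zip(c).map(|(a, b)| (a - b) * (a - b)).sum();
                (i, d)
            })
            .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
            .unwrap();
        for (s, v) in sums[best].iter_mut().zip(p) {
            *s += v;
        }
        counts[best] += 1;
    }
    // Update step: empty clusters keep their previous centroid.
    for i in 0..k {
        if counts[i] > 0 {
            for (c, s) in centroids[i].iter_mut().zip(&sums[i]) {
                *c = s / counts[i] as f32;
            }
        }
    }
}
```

In the OPQ loop this runs on R·X each codebook step, so the codebooks track the current rotation.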
§Storage format
QuantMode::Pq is reused in UnifiedQuantizedVector headers — OPQ is
structurally PQ post-rotation and requires no new on-disk discriminant.
The rotation matrix is stored in OpqCodec and applied transparently.
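The transparent apply is a single row-major matrix–vector product in front of the usual PQ path; a sketch (the helper name is an assumption, the layout follows the row-major description above):

```rust
/// Apply a row-major dim × dim rotation to a vector: out = R · v.
/// This multiply is the only extra work OPQ adds over plain PQ at
/// encode time and once per query at search time.
fn rotate(r: &[f32], v: &[f32]) -> Vec<f32> {
    let dim = v.len();
    assert_eq!(r.len(), dim * dim, "R must be dim × dim, row-major");
    (0..dim)
        .map(|row| {
            // Dot product of row `row` of R with v.
            r[row * dim..(row + 1) * dim]
                .iter()
                .zip(v)
                .map(|(a, b)| a * b)
                .sum()
        })
        .collect()
}
```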
Structs§
- OpqCodec
- Optimized Product Quantization codec.
- OpqQuantized
- Quantized form returned by OpqCodec::encode.
- OpqQuery
- Prepared query: rotated vector + flat ADC distance table (M×K, row-major).
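With the table precomputed from the rotated query, each candidate's distance is just M table lookups; a sketch of the ADC scan (the function name is an assumption, the flat M×K row-major layout follows the description above):

```rust
/// Asymmetric distance computation over a flat M × K lookup table:
/// entry table[m * k + c] holds the precomputed partial distance from
/// the rotated query's m-th subvector to centroid c of subspace m,
/// so the distance to a code is a sum of M lookups.
fn adc_distance(table: &[f32], k: usize, code: &[u8]) -> f32 {
    code.iter()
        .enumerate()
        .map(|(m, &c)| table[m * k + c as usize])
        .sum()
}
```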