Residual quantization codec for WARP
This module implements the residual quantization scheme used to compress token embeddings in the WARP algorithm. Each vector is decomposed into:
- A centroid (learned via k-means)
- A residual (difference from centroid), quantized to 2-4 bits per dimension
The codec enables efficient scoring without full decompression by using precomputed centroid scores and bucket weights.
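The decomposition and the decompression-free scoring described above can be sketched as follows. This is a minimal illustration, not this crate's implementation: `ToyCodec`, `Encoded`, `cutoffs`, `weights`, and `centroid_scores` are hypothetical names, the centroids are assumed to be given rather than trained, and nearest-centroid selection by squared Euclidean distance is an assumption. The score of a query `q` against a compressed vector is approximated as the precomputed `q . centroid` plus, per dimension, `q[d]` times the weight of that dimension's bucket.

```rust
/// Toy residual codec (hypothetical names; not the crate's actual API).
struct ToyCodec {
    centroids: Vec<Vec<f32>>, // stand-ins for k-means centroids
    cutoffs: Vec<f32>,        // ascending bucket boundaries, 2^bits - 1 entries
    weights: Vec<f32>,        // representative residual value per bucket, 2^bits entries
}

/// Compressed vector: a centroid id plus one bucket index per dimension.
struct Encoded {
    centroid: usize,
    codes: Vec<u8>,
}

/// Plain dot product, used to precompute one score per centroid for a query.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

impl ToyCodec {
    fn encode(&self, v: &[f32]) -> Encoded {
        // Pick the nearest centroid by squared Euclidean distance (an assumption).
        let mut centroid = 0;
        let mut best = f32::INFINITY;
        for (i, c) in self.centroids.iter().enumerate() {
            let d: f32 = c.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum();
            if d < best {
                best = d;
                centroid = i;
            }
        }
        // Quantize each residual component to the index of the bucket it falls in.
        let codes = v
            .iter()
            .zip(&self.centroids[centroid])
            .map(|(x, c)| {
                let r = x - c;
                self.cutoffs.iter().filter(|&&cut| r > cut).count() as u8
            })
            .collect();
        Encoded { centroid, codes }
    }

    /// Approximate q . v without rebuilding v:
    /// centroid_scores[centroid] + sum over d of q[d] * weights[codes[d]].
    fn score(&self, q: &[f32], centroid_scores: &[f32], e: &Encoded) -> f32 {
        let residual: f32 = q
            .iter()
            .zip(&e.codes)
            .map(|(qd, &code)| qd * self.weights[code as usize])
            .sum();
        centroid_scores[e.centroid] + residual
    }
}
```

In this sketch `centroid_scores` would be computed once per query (one `dot(q, c)` per centroid) and reused for every compressed vector assigned to that centroid, so only the small per-dimension bucket lookup remains per vector. With 2 bits per dimension there are 4 buckets, hence 3 cutoffs and 4 weights.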
Structs
- ResidualCodec - Residual quantization codec for compressing token embeddings.
- ResidualCodecBuilder - Builder for creating a ResidualCodec from a WarpIndexConfig.