Binary Quantization (BQ): sign-bit encoding + Hamming distance.
Each dimension is encoded as a single bit: 1 if positive, 0 if negative. This gives 32x compression (D/8 bytes vs 4D bytes for FP32, where D is the dimensionality). Best used as a coarse pre-filter: use Hamming distance to quickly eliminate far candidates, then compute exact distances on the survivors.
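As an illustration of the encoding described above, here is a minimal sketch, not this module's actual implementation. The names `sketch_binary_size`, `sketch_encode`, and `sketch_hamming` are hypothetical, and the bit layout (bit `i` stored in byte `i / 8` at position `i % 8`) and the treatment of zero as a 0 bit are assumptions.

```rust
/// Bytes needed to hold one sign bit per dimension (hypothetical helper).
fn sketch_binary_size(dim: usize) -> usize {
    (dim + 7) / 8
}

/// Pack one sign bit per dimension.
/// Assumption: strictly positive values map to 1; zero, negatives and NaN map to 0.
/// Assumption: bit `i` lives in byte `i / 8` at bit position `i % 8`.
fn sketch_encode(v: &[f32]) -> Vec<u8> {
    let mut out = vec![0u8; sketch_binary_size(v.len())];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            out[i / 8] |= 1u8 << (i % 8);
        }
    }
    out
}

/// Hamming distance: the number of bit positions where the two codes differ.
fn sketch_hamming(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "codes must have equal length");
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Under these assumptions a 768-dimensional FP32 vector (3072 bytes) packs into 96 bytes, which is the 32x ratio quoted above.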
Recall loss: 5-10% as a standalone index. Combined with a reranking step over the top K×10 candidates, effective recall loss drops to 1-3%.
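The pre-filter plus rerank flow can be sketched roughly as follows. This is an illustrative outline and not code from this module: `prefilter_then_rerank`, `hamming_bytes`, and `squared_l2` are hypothetical names, squared L2 is assumed as the exact distance, and the binary codes are assumed to be precomputed and passed in.

```rust
/// Hamming distance between two equal-length byte codes (hypothetical helper).
fn hamming_bytes(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Squared L2 distance, assumed here as the "exact" metric.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Two-stage search: Hamming pre-filter, then exact rerank.
/// `codes[i]` must be the binary code of `vectors[i]`, and `query_code`
/// the binary code of `query`. Returns up to `k` ids, nearest first.
fn prefilter_then_rerank(
    query: &[f32],
    query_code: &[u8],
    vectors: &[Vec<f32>],
    codes: &[Vec<u8>],
    k: usize,
) -> Vec<usize> {
    // Stage 1: rank all ids by Hamming distance and keep the closest 10 * k.
    let mut shortlist: Vec<(u32, usize)> = codes
        .iter()
        .enumerate()
        .map(|(id, c)| (hamming_bytes(query_code, c), id))
        .collect();
    shortlist.sort_unstable();
    shortlist.truncate(k.saturating_mul(10));

    // Stage 2: compute exact distances only for the survivors.
    let mut exact: Vec<(f32, usize)> = shortlist
        .into_iter()
        .map(|(_, id)| (squared_l2(query, &vectors[id]), id))
        .collect();
    exact.sort_by(|a, b| a.0.total_cmp(&b.0));
    exact.into_iter().take(k).map(|(_, id)| id).collect()
}
```

A full sort of the Hamming distances keeps the sketch short; a real index would more likely use a bounded heap or partial selection for stage 1.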
Functions
- binary_size - Binary vector size in bytes for a given dimensionality.
- encode - Encode an FP32 vector as binary: one bit per dimension.
- encode_batch - Batch encode: encode all vectors into contiguous binary representation.
- hamming_distance - Hamming distance between two binary-encoded vectors.
- hamming_distance_fast - Hamming distance operating on u64 chunks for better throughput.
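
To illustrate the idea of processing u64 chunks, here is one possible sketch. It is not the crate's `hamming_distance_fast`, whose signature is not shown on this page. The name `hamming_u64_chunks` and the byte-slice input are assumptions.

```rust
use std::convert::TryInto;

/// Hamming distance computed eight bytes at a time (hypothetical sketch).
fn hamming_u64_chunks(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "codes must have equal length");
    let a_chunks = a.chunks_exact(8);
    let b_chunks = b.chunks_exact(8);
    // Bytes left over when the length is not a multiple of 8.
    let (a_tail, b_tail) = (a_chunks.remainder(), b_chunks.remainder());

    let mut dist = 0u32;
    for (ca, cb) in a_chunks.zip(b_chunks) {
        // Byte order does not affect the popcount, so little-endian is arbitrary.
        let wa = u64::from_le_bytes(ca.try_into().unwrap());
        let wb = u64::from_le_bytes(cb.try_into().unwrap());
        dist += (wa ^ wb).count_ones();
    }
    // Handle the tail byte by byte.
    for (x, y) in a_tail.iter().zip(b_tail) {
        dist += (x ^ y).count_ones();
    }
    dist
}
```

One XOR and one population count per 64 bits replaces eight per-byte operations, which is typically where the throughput gain comes from.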