Module binary


Binary Quantization (BQ): sign-bit encoding + Hamming distance.

Each dimension is encoded as a single bit: 1 if positive, 0 if negative. This gives 32x compression over FP32 (D/8 bytes instead of 4D bytes for a D-dimensional vector). BQ works best as a coarse pre-filter: compute Hamming distances to quickly eliminate far candidates, then compute exact distances only on the survivors.
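The encoding can be illustrated with a minimal sketch. The names `sketch_binary_size` and `sketch_encode` are hypothetical and are not this module's `binary_size` or `encode`. The bit order (least significant bit first within each byte) and the handling of exact zeros (mapped to 0) are assumptions, since the description above does not specify them.

```rust
/// Bytes needed to hold one bit per dimension, rounded up to a whole byte.
/// Hypothetical helper, not the module's `binary_size`.
fn sketch_binary_size(dim: usize) -> usize {
    (dim + 7) / 8
}

/// Sign-bit encoding: bit i is 1 when v[i] > 0.0, otherwise 0.
/// Assumptions: zero and NaN map to 0, and bits are packed
/// least-significant-bit first within each byte.
fn sketch_encode(v: &[f32]) -> Vec<u8> {
    let mut out = vec![0u8; sketch_binary_size(v.len())];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}
```

Under these assumptions, the 9-dimensional vector `[0.5, -1.0, 2.0, -0.1, 0.0, 3.0, -2.0, 1.0, 4.0]` packs into the two bytes `[165, 1]`, against 36 bytes as FP32.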

Recall loss is 5-10% when BQ is used as a standalone index. When it is combined with a step that reranks the top K×10 candidates by exact distance, effective recall loss drops to 1-3%.
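The two-stage pattern (Hamming pre-filter, then exact rerank) might look like the sketch below. Everything here is illustrative: `sketch_search`, `sketch_hamming`, and `squared_l2` are hypothetical names, squared L2 is assumed as the exact metric, and the `oversample` parameter stands in for the ×10 factor mentioned above.

```rust
/// Hamming distance over packed bytes: XOR, then count set bits.
/// Hypothetical helper, not the module's `hamming_distance`.
fn sketch_hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Exact squared L2 distance (assumed metric for the rerank stage).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the ids of the k nearest vectors after a two-stage search.
/// `codes[i]` is assumed to be the binary encoding of `vectors[i]`.
fn sketch_search(
    query: &[f32],
    query_code: &[u8],
    vectors: &[Vec<f32>],
    codes: &[Vec<u8>],
    k: usize,
    oversample: usize,
) -> Vec<usize> {
    // Stage 1: rank all ids by Hamming distance, keep the closest k * oversample.
    let mut ids: Vec<usize> = (0..codes.len()).collect();
    ids.sort_by_key(|&i| sketch_hamming(query_code, &codes[i]));
    ids.truncate(k.saturating_mul(oversample));

    // Stage 2: rerank the survivors by exact distance and keep the best k.
    ids.sort_by(|&a, &b| {
        squared_l2(query, &vectors[a]).total_cmp(&squared_l2(query, &vectors[b]))
    });
    ids.truncate(k);
    ids
}
```

Calling `sketch_search(query, query_code, vectors, codes, k, 10)` corresponds to the top-K×10 setting. A full sort is used in stage 1 for brevity; a partial selection would likely be cheaper on large collections.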

Functions

binary_size
Binary vector size in bytes for a given dimensionality.
encode
Encode an FP32 vector as binary: one bit per dimension.
encode_batch
Batch encode: encode all vectors into contiguous binary representation.
hamming_distance
Hamming distance between two binary-encoded vectors.
hamming_distance_fast
Hamming distance operating on u64 chunks for better throughput.
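To illustrate why working on u64 chunks can help throughput, here is a sketch of both approaches. The names `sketch_hamming_bytes` and `sketch_hamming_u64` are hypothetical, and this is not the module's actual implementation of `hamming_distance` or `hamming_distance_fast`; it only shows the general technique of XOR followed by a population count on wider words.

```rust
/// Byte-at-a-time Hamming distance: XOR each byte pair, then popcount.
fn sketch_hamming_bytes(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Loads an 8-byte slice as a little-endian u64.
fn load_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

/// Same result, but XOR and popcount run on 8 bytes at a time.
fn sketch_hamming_u64(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len());
    let mut chunks_a = a.chunks_exact(8);
    let mut chunks_b = b.chunks_exact(8);
    let mut dist = 0u32;
    for (x, y) in chunks_a.by_ref().zip(chunks_b.by_ref()) {
        dist += (load_u64(x) ^ load_u64(y)).count_ones();
    }
    // Handle the trailing bytes when the length is not a multiple of 8.
    let tail: u32 = chunks_a
        .remainder()
        .iter()
        .zip(chunks_b.remainder())
        .map(|(x, y)| (x ^ y).count_ones())
        .sum();
    dist + tail
}
```

Both functions return the same value for equal-length inputs; the u64 version issues one population count per 8 bytes instead of one per byte.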