Module binary_quantize

Binary Quantization for Vector Embeddings

Compresses fp32 vectors to a binary representation (1 bit per dimension) for fast approximate nearest-neighbor search using Hamming distance.

§Compression Ratio

  • fp32: 4 bytes per dimension
  • binary: 1 bit per dimension = 32x compression

Example: 1024-dim vector

  • fp32: 4096 bytes
  • binary: 128 bytes
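
The arithmetic above can be checked with a small sketch. The function names `fp32_bytes`, `binary_bytes`, and `packed_words` are hypothetical helpers written for illustration and are not part of this module. The last one assumes the packed `u64` word layout mentioned for `BinaryVector`, where storage rounds up to a whole number of 64-bit words.

```rust
/// Bytes needed to store `dim` dimensions as fp32 (4 bytes each).
fn fp32_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Bytes needed to store `dim` dimensions at 1 bit each,
/// rounded up to whole bytes.
fn binary_bytes(dim: usize) -> usize {
    (dim + 7) / 8
}

/// Number of u64 words needed when bits are packed 64 per word
/// (assumed layout; rounds up for dimensions not divisible by 64).
fn packed_words(dim: usize) -> usize {
    (dim + 63) / 64
}
```

For a 1024-dim vector these give 4096 bytes, 128 bytes, and 16 words respectively. The 32x ratio is exact only when the dimension is a multiple of the packing unit; otherwise the padding bits lower it slightly.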

§Algorithm

Simple sign-based quantization:

  • positive values → 1
  • negative/zero values → 0

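For normalized embeddings (e.g., from sentence transformers), this typically preserves roughly 95-97% of retrieval quality. The exact figure depends on the embedding model, the dataset, and whether candidates are rescored with the original vectors.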
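
A minimal sketch of the sign rule described above, assuming bits are packed into `u64` words with dimension `i` stored at bit `i % 64` of word `i / 64`. The function name `quantize_sign` and the bit order are assumptions made for illustration; the actual `BinaryVector::from_f32` may lay bits out differently.

```rust
/// Hypothetical helper: sign-quantize `values` into packed u64 words.
/// Strictly positive values set the bit; negative values, zero, and NaN
/// leave it cleared.
fn quantize_sign(values: &[f32]) -> Vec<u64> {
    // One word per 64 dimensions, rounded up; unused high bits stay 0.
    let mut words = vec![0u64; (values.len() + 63) / 64];
    for (i, &v) in values.iter().enumerate() {
        if v > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}
```

Because padding bits are always zero, two vectors of the same dimension agree on them, so they do not contribute to the Hamming distance.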

§Usage

// Quantize a vector
let binary = BinaryVector::from_f32(&embedding);

// Compute Hamming distance (number of differing bits)
let distance = binary.hamming_distance(&other);

// For retrieval: lower Hamming distance = more similar
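
To make the distance computation concrete, here is a scalar sketch that operates on raw packed words rather than on `BinaryVector`. The free functions `hamming_distance` and `nearest` are hypothetical and written for illustration only; the module's own methods and its SIMD path may differ.

```rust
/// Hamming distance between two packed bit vectors:
/// XOR each word pair and count the set bits.
fn hamming_distance(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same word count");
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Linear scan returning the index and distance of the closest candidate,
/// or `None` if there are no candidates. Ties resolve to the earliest index.
fn nearest(query: &[u64], candidates: &[Vec<u64>]) -> Option<(usize, u32)> {
    candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, hamming_distance(query, c)))
        .min_by_key(|&(_, d)| d)
}
```

A brute-force scan like this is often fast enough in practice because each comparison is only a handful of XOR and popcount operations per vector.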

§References

  • “Embedding Quantization” - HuggingFace Blog
  • Binary embedding with Matryoshka representation learning

Structs§

BinaryIndex
Index of binary vectors for fast batch search
BinarySearchResult
Result from binary search with rescoring capability
BinaryVector
Binary quantized vector stored as packed u64 words

Functions§

hamming_distance_simd
Compute Hamming distance between two packed binary vectors using SIMD