Skip to main content

Module rabitq

Module rabitq 

Source
Expand description

RaBitQ — 1-bit quantization with O(1/√D) MSE error bound (SIGMOD 2024).

§Algorithm outline

  1. Calibration: compute centroid c over training vectors.
  2. Rotation: apply a randomised Walsh-Hadamard transform (WHT) with a seed-derived signed-diagonal matrix D to the residual v - c.
    • WHT requires a power-of-2 length; dimensions that are not pow2 are zero-padded before the transform then truncated after.
    • D is a vector of ±1 scalars derived deterministically from rotation_seed using an xorshift64 generator.
  3. Encoding: code = sign(R·(v-c)), packed 1-bit-per-dimension.
  4. Distance estimation (L2): ‖q-v‖² ≈ ‖v-c‖² + ‖q-c‖² − 2‖v-c‖‖q-c‖·(1 − 2·hamming/D) with O(1/√D) MSE — see SIGMOD 2024 Theorem 4.

§IP / raw inner-product bias

The angular estimator above is MSE-optimal for L2 and cosine. For raw IP it carries a systematic bias. When bias_correct = true the codec subtracts dot_quantized (stored in the QuantHeader) from the asymmetric distance to partially compensate, following the TurboQuant-style QJL residual correction. This is off by default; consumers that use raw IP (RAG, attention scores) should enable it.

[rand_xoshiro] is not a dependency of nodedb-codec. The signed- diagonal flip vector is derived from rotation_seed via an inline xorshift64 generator — no external crate required.

Structs§

RaBitQCodec
RaBitQ codec: 1-bit quantization with O(1/√D) MSE error bound.
RaBitQQuantized
Packed 1-bit quantized vector produced by RaBitQCodec::encode.
RaBitQQuery
Prepared query for RaBitQCodec distance computations.