Expand description
Bit-packing, XNOR-popcount, and the two RaBitQ distance estimators:
-
Symmetric Charikar-style angular estimator — both query and database are 1-bit. Derived from hyperplane-LSH collision probability: E[B/D] = 1 − θ/π This is what the shipped crate had at commit
f2dbb6efb. -
Asymmetric RaBitQ-2024 inner-product estimator — query stays in f32, database is 1-bit
b_i ∈ {−1/√D, +1/√D}. Inner product is reconstructed by summing the rotated query’s components with signs, then rescaled by a precomputed factor derived from the stored unit-sphere inner-product bias. Unbiased for Haar-uniform rotations with O(1/√D) variance.
The asymmetric path closes the gap between this crate’s estimator and the SIGMOD 2024 paper (Gao & Long) — the symmetric path remains for apples-to-apples comparison against naive 1-bit codes.
§Bit packing
Each dimension is one bit: 1 if the rotated value ≥ 0, else 0. Bits are
packed MSB-first into u64 words. When D % 64 != 0 the last word carries
64·n_words − D padding bits that are zero in every code; XNOR-popcount
must mask those bits off before counting, otherwise padding bits always
agree and the estimator is biased. masked_xnor_popcount handles this
correctly; xnor_popcount is retained for the aligned case.
Structs§
- Binary
Code - A packed binary code representing one vector (D bits).
Functions§
- pack_
bits - Pack bits from a boolean slice into u64 words (for testing/utilities).
- unpack_
bits - Unpack u64 words back into a bool slice of length
dim.