pub struct BinaryQuantizer;
Binary quantizer: f32 -> 1 bit (sign only).
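Provides extreme compression (32x relative to f32 storage, since each 32-bit float becomes 1 bit) at the cost of accuracy (~80% recall). Uses Hamming distance for fast comparison. Best used with rescoring, where the top candidates are re-ranked against the original f32 vectors.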
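The rescoring step mentioned above can be sketched as a two-stage search: shortlist candidates by Hamming distance on the 1-bit codes, then re-rank the shortlist with an exact distance on the original f32 vectors. This is a minimal illustration, not the crate's implementation. The helpers `sign_bits`, `hamming`, `squared_l2`, and `search_with_rescoring` are hypothetical names, and the bit layout (least significant bit first) is an assumption.

```rust
/// Hypothetical helper: 1 bit per dimension (1 if >= 0), 64 dimensions per word.
/// Assumption: dimension i is stored in bit (i % 64) of word (i / 64).
fn sign_bits(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Number of differing bits between two packed codes.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Exact squared Euclidean distance on the original vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns up to `k` indices into `data`, nearest first.
/// For brevity the codes are computed on the fly; a real index would store them.
fn search_with_rescoring(
    query: &[f32],
    data: &[Vec<f32>],
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    let q_bits = sign_bits(query);
    // Stage 1: cheap candidate selection on the binary codes.
    let mut candidates: Vec<(u32, usize)> = data
        .iter()
        .enumerate()
        .map(|(i, v)| (hamming(&q_bits, &sign_bits(v)), i))
        .collect();
    candidates.sort_unstable();
    candidates.truncate(shortlist);
    // Stage 2: exact rescoring of the shortlist.
    let mut rescored: Vec<(f32, usize)> = candidates
        .into_iter()
        .map(|(_, i)| (squared_l2(query, &data[i]), i))
        .collect();
    rescored.sort_by(|a, b| a.0.total_cmp(&b.0));
    rescored.into_iter().take(k).map(|(_, i)| i).collect()
}
```

A larger `shortlist` recovers more of the recall lost to quantization, at the cost of more exact distance computations.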
§Example
use grafeo_core::index::vector::quantization::BinaryQuantizer;
let v1 = vec![0.5f32, -0.3, 0.0, 0.8, -0.1, 0.2, -0.4, 0.9];
let v2 = vec![0.4f32, -0.2, 0.1, 0.7, -0.2, 0.3, -0.3, 0.8];
let bits1 = BinaryQuantizer::quantize(&v1);
let bits2 = BinaryQuantizer::quantize(&v2);
let dist = BinaryQuantizer::hamming_distance(&bits1, &bits2);
// Vectors are similar, so hamming distance should be low
assert!(dist < 4);
Implementations§
impl BinaryQuantizer
pub fn quantize(vector: &[f32]) -> Vec<u64>
Quantizes an f32 vector to binary (sign bits packed into u64 words).
Each f32 becomes 1 bit: 1 if >= 0, 0 if < 0. Bits are packed into u64 words (64 dimensions per word).
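The packing rule above can be sketched as follows. The function name `pack_sign_bits` is hypothetical, and the bit order within each word (dimension `i` in bit `i % 64`, least significant bit first) is an assumption; the crate may lay bits out differently.

```rust
/// Sketch of sign-bit packing: 1 if the value is >= 0, 0 otherwise,
/// with 64 dimensions per u64 word.
/// Assumption: dimension i goes to bit (i % 64) of word (i / 64).
fn pack_sign_bits(vector: &[f32]) -> Vec<u64> {
    // Round up so a partially filled last word is still allocated.
    let mut words = vec![0u64; (vector.len() + 63) / 64];
    for (i, &x) in vector.iter().enumerate() {
        if x >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}
```

Under this layout, unused high bits in the last word stay 0 for every vector, so they never contribute to a Hamming distance.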
pub fn quantize_batch(vectors: &[&[f32]]) -> Vec<Vec<u64>>
Quantizes multiple vectors in batch.
pub fn hamming_distance(a: &[u64], b: &[u64]) -> u32
Computes the Hamming distance between two binary vectors.
Counts the number of differing bits. Lower = more similar.
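Counting differing bits over packed words is typically done with XOR followed by a population count. The sketch below shows that technique under a hypothetical name, `hamming_sketch`; how the crate handles slices of different lengths is not stated here, so this version simply stops at the shorter one.

```rust
/// Sketch: XOR marks the differing bits, count_ones tallies them per word.
/// Assumption: if the slices differ in length, extra words are ignored.
fn hamming_sketch(a: &[u64], b: &[u64]) -> u32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x ^ y).count_ones())
        .sum()
}
```

`count_ones` usually compiles to a single popcount instruction on targets that support it, which is what makes this comparison fast.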
pub fn hamming_distance_normalized(a: &[u64], b: &[u64], dimensions: usize) -> f32
Computes the normalized Hamming distance (0.0 to 1.0).
Returns the fraction of bits that differ.
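As a rough sketch, the fraction is the raw bit count divided by `dimensions`. The name `normalized_hamming_sketch` is hypothetical, and returning 0.0 for zero dimensions is an assumption made here to avoid dividing by zero.

```rust
/// Sketch: differing bits divided by the number of dimensions.
/// Assumption: zero dimensions yields 0.0 rather than NaN.
fn normalized_hamming_sketch(a: &[u64], b: &[u64], dimensions: usize) -> f32 {
    if dimensions == 0 {
        return 0.0;
    }
    let differing: u32 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x ^ y).count_ones())
        .sum();
    differing as f32 / dimensions as f32
}
```

The `dimensions` argument matters because the packed words alone do not record how many of their bits are real dimensions and how many are padding.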
pub fn approximate_euclidean(a: &[u64], b: &[u64], dimensions: usize) -> f32
Estimates the Euclidean distance from the Hamming distance.
Uses an empirical approximation: d_euclidean ≈ sqrt(2 * hamming / dim). This is a rough estimate suitable for initial filtering.
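The approximation above can be written out directly. This sketch reads `hamming` as the raw count of differing bits and uses a hypothetical name, `approx_euclidean_sketch`; the zero-dimension guard is an assumption.

```rust
/// Sketch of d_euclidean ≈ sqrt(2 * hamming / dim), where `hamming` is
/// taken to be the raw number of differing bits.
/// Assumption: zero dimensions yields 0.0.
fn approx_euclidean_sketch(a: &[u64], b: &[u64], dimensions: usize) -> f32 {
    if dimensions == 0 {
        return 0.0;
    }
    let hamming: u32 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x ^ y).count_ones())
        .sum();
    (2.0 * hamming as f32 / dimensions as f32).sqrt()
}
```

Under this reading the estimate ranges from 0 (identical codes) to sqrt(2) (every bit differs), so it is useful for ordering candidates rather than as an absolute distance.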
pub const fn words_needed(dimensions: usize) -> usize
Returns the number of u64 words needed for the given dimensions.
pub const fn bytes_needed(dimensions: usize) -> usize
Returns the memory footprint in bytes for quantized storage.
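Both sizing helpers follow from the 64-dimensions-per-word packing. The sketch below uses hypothetical names and assumes the byte footprint is just the word count times 8, with no header or padding beyond the last word.

```rust
/// Sketch: number of u64 words, rounding up so a partial word is counted.
const fn words_needed_sketch(dimensions: usize) -> usize {
    (dimensions + 63) / 64
}

/// Sketch: storage in bytes, assuming 8 bytes per u64 word and nothing else.
const fn bytes_needed_sketch(dimensions: usize) -> usize {
    words_needed_sketch(dimensions) * 8
}
```

For example, a 128-dimensional f32 vector occupies 128 * 4 = 512 bytes, while its binary code needs 2 words = 16 bytes, which is the 32x compression quoted for this quantizer.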
Auto Trait Implementations§
impl Freeze for BinaryQuantizer
impl RefUnwindSafe for BinaryQuantizer
impl Send for BinaryQuantizer
impl Sync for BinaryQuantizer
impl Unpin for BinaryQuantizer
impl UnwindSafe for BinaryQuantizer
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.