Struct ScalarQuantizer 

Source
pub struct ScalarQuantizer { /* private fields */ }

A central parameter collection for a scalar quantization scheme.

§Example

A self-contained end-to-end example covering training, compression, and distance computations is shown below.

use diskann_quantization::{
    AsFunctor, CompressInto,
    distances,
    num::Positive, bits::MutBitSlice,
    scalar::{
        self,
        ScalarQuantizer,
        train::ScalarQuantizationParameters,
        CompensatedVector, MutCompensatedVectorRef,
        CompensatedIP, CompensatedSquaredL2,
    }
};
use diskann_utils::{views::Matrix, Reborrow, ReborrowMut};
use diskann_vector::DistanceFunction;

// A small training set consisting of two 5-dimensional vectors.
let mut data = Matrix::<f32>::new(0.0, 2, 5);
data.row_mut(0).copy_from_slice(&[-1.0, -1.0, -1.0, -1.0, -1.0]);
data.row_mut(1).copy_from_slice(&[1.0, 1.0, 1.0, 1.0, 1.0]);

let trainer = ScalarQuantizationParameters::new(Positive::new(1.0).unwrap());
let quantizer: ScalarQuantizer = trainer.train(data.as_view());

// The dimension of the quantizer is based on the dimension of the training data.
assert_eq!(quantizer.dim(), data.ncols());

// Compress the two input vectors.
// For one vector, we will use the "boxed" API. The other we will construct "manually".

// Boxed API
let mut c0 = CompensatedVector::<8>::new_boxed(data.ncols());

// Manual construction.
let mut buffer: Vec<u8> = vec![0; c0.vector().bytes()];
let mut compensation = scalar::Compensation(0.0);
let mut c1 = MutCompensatedVectorRef::new(
    MutBitSlice::new(buffer.as_mut_slice(), data.ncols()).unwrap(),
    &mut compensation
);

quantizer.compress_into(data.row(0), c0.reborrow_mut()).unwrap();
quantizer.compress_into(data.row(1), c1.reborrow_mut()).unwrap();

// Compute inner product.
let ip: CompensatedIP = quantizer.as_functor();
let distance: distances::Result<f32> = ip.evaluate_similarity(c0.reborrow(), c1.reborrow());

// The `f32` result follows the SimilarityScore convention, so it is the negative of
// the mathematical inner product (-5.0 for these two vectors).
assert!((distance.unwrap() - 5.0).abs() < 0.00001);

// Compute squared Euclidean distance.
let l2: CompensatedSquaredL2 = quantizer.as_functor();
let distance: distances::Result<f32> = l2.evaluate_similarity(c0.reborrow(), c1.reborrow());
assert!((distance.unwrap() - 20.0).abs() < 0.00001);
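
For reference, the expected values 5.0 and 20.0 asserted above can be reproduced on the uncompressed vectors with plain Rust. This is a minimal sketch that uses only the standard library; the helper names `inner_product` and `squared_l2` are illustrative and are not part of `diskann_quantization`.

```rust
/// Mathematical inner product of two equal-length slices.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Squared Euclidean distance between two equal-length slices.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

For the two training vectors, `inner_product` gives -5.0, whose negation is the similarity score 5.0, and `squared_l2` gives 5 × 2² = 20.0.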

Implementations§

Source§

impl ScalarQuantizer

Source

pub fn new(scale: f32, shift: Vec<f32>, mean_norm: Option<f32>) -> Self

Construct a new scalar quantizer.

Source

pub fn dim(&self) -> usize

Return the number of dimensions this ScalarQuantizer has been trained for.

Source

pub fn scale(&self) -> f32

Return the scaling coefficient.

Source

pub fn shift_square_norm(&self) -> f32

Return the squared norm of the dataset shift.

Source

pub fn shift(&self) -> &[f32]

Return the per-dimension shift vector.

This vector is meant to accomplish two goals:

  1. Center the data around the training dataset mean.
  2. Offset each dimension into a range that can be encoded in unsigned values.
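
The two goals above can be illustrated with a generic shift-and-scale scalar quantizer. This is only a sketch under stated assumptions: the functions `encode` and `decode`, the rounding rule, and the clamping are hypothetical and are not taken from the crate's implementation.

```rust
/// Hypothetical encoder: subtract the per-dimension shift, divide by the
/// scale, then round and clamp to an unsigned code in `0..=2^nbits - 1`.
fn encode(x: &[f32], shift: &[f32], scale: f32, nbits: u32) -> Vec<u8> {
    assert_eq!(x.len(), shift.len());
    assert!((1..=8).contains(&nbits));
    let max_code = ((1u32 << nbits) - 1) as f32;
    x.iter()
        .zip(shift)
        .map(|(v, s)| ((v - s) / scale).round().clamp(0.0, max_code) as u8)
        .collect()
}

/// Hypothetical decoder: approximate reconstruction `scale * code + shift`.
fn decode(codes: &[u8], shift: &[f32], scale: f32) -> Vec<f32> {
    assert_eq!(codes.len(), shift.len());
    codes
        .iter()
        .zip(shift)
        .map(|(c, s)| scale * f32::from(*c) + s)
        .collect()
}
```

With a shift of -1.0 in every dimension, a scale of 0.5, and 2 bits, the value -1.0 maps to code 0 and 0.5 maps to code 3; values beyond the encodable range are clamped to the largest code.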
Source

pub fn mean_norm(&self) -> Option<f32>

Return the average norm of vectors in the training set.

Source

pub fn rescale(&self, x: &mut [f32]) -> Result<(), MeanNormMissing>

Rescale the argument so it has the average norm of the training set.

This can help when compressing queries that come from a different distribution than the training set, provided the norm of the query can be safely discarded for the purposes of distance computation.

This operation can fail if the mean norm was not computed during training.
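
The idea behind rescaling can be sketched in a few lines of standalone Rust. The names `rescale_sketch` and `MeanNormMissingSketch` are hypothetical stand-ins, and leaving an all-zero vector unchanged is an assumption of this sketch rather than documented behavior of `rescale`.

```rust
/// Hypothetical stand-in for the error returned when no mean norm is available.
#[derive(Debug, PartialEq)]
struct MeanNormMissingSketch;

/// Scale `x` in place so that its Euclidean norm equals `mean_norm`.
fn rescale_sketch(
    x: &mut [f32],
    mean_norm: Option<f32>,
) -> Result<(), MeanNormMissingSketch> {
    // Fail early if the mean norm was never computed.
    let target = mean_norm.ok_or(MeanNormMissingSketch)?;
    let norm = x.iter().map(|v| v * v).sum::<f32>().sqrt();
    // Assumption: a zero vector has no direction to preserve, so leave it as is.
    if norm > 0.0 {
        let factor = target / norm;
        x.iter_mut().for_each(|v| *v *= factor);
    }
    Ok(())
}
```

For example, rescaling `[3.0, 4.0]` (norm 5) to a mean norm of 10 yields `[6.0, 8.0]`, while passing `None` yields the error.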

Source

pub fn compare(&self, other: &Self) -> Result<(), SQComparisonError>

Compare two ScalarQuantizer instances field by field. On success, returns Ok(()). On failure, returns Err(SQComparisonError) explaining which field differs.
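
A field-by-field comparison of this kind might look like the following sketch. `QuantizerSketch` and `ComparisonErrorSketch` are hypothetical types whose fields mirror the arguments of `new`; the real `SQComparisonError` variants and the equality rule used by `compare` (exact or tolerant) are not shown on this page, so exact equality is assumed here.

```rust
/// Hypothetical error naming the first field that differs.
#[derive(Debug, PartialEq)]
enum ComparisonErrorSketch {
    Scale,
    ShiftLength,
    Shift { index: usize },
    MeanNorm,
}

/// Hypothetical stand-in with the same fields as the arguments of `new`.
struct QuantizerSketch {
    scale: f32,
    shift: Vec<f32>,
    mean_norm: Option<f32>,
}

impl QuantizerSketch {
    /// Compare fields in order and report the first mismatch.
    fn compare(&self, other: &Self) -> Result<(), ComparisonErrorSketch> {
        if self.scale != other.scale {
            return Err(ComparisonErrorSketch::Scale);
        }
        if self.shift.len() != other.shift.len() {
            return Err(ComparisonErrorSketch::ShiftLength);
        }
        let mismatch = self.shift.iter().zip(&other.shift).position(|(a, b)| a != b);
        if let Some(index) = mismatch {
            return Err(ComparisonErrorSketch::Shift { index });
        }
        if self.mean_norm != other.mean_norm {
            return Err(ComparisonErrorSketch::MeanNorm);
        }
        Ok(())
    }
}
```

Returning a descriptive error instead of a `bool` lets a caller, such as a serialization round-trip test, report exactly which field diverged.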

Trait Implementations§

Source§

impl AsFunctor<CompensatedCosineNormalized> for ScalarQuantizer

Source§

impl AsFunctor<CompensatedIP> for ScalarQuantizer

Source§

impl AsFunctor<CompensatedSquaredL2> for ScalarQuantizer

Source§

impl Clone for ScalarQuantizer

Source§

fn clone(&self) -> ScalarQuantizer

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl<const NBITS: usize, T, Perm> CompressInto<&[T], BitSliceBase<NBITS, Unsigned, MutSlicePtr<'_, u8>, Perm>> for ScalarQuantizer
where T: Copy + Into<f32>, Unsigned: Representation<NBITS>, Perm: PermutationStrategy<NBITS>,

Source§

fn compress_into( &self, from: &[T], into: MutBitSlice<'_, NBITS, Unsigned, Perm>, ) -> Result<(), Self::Error>

Compress the input vector `from` into the bitslice `into`.

This method does not compute the compensation coefficients required for fast inner product computations. If only L2 distances are desired, this method can be slightly faster.

§Error

Returns an error if the input contains NaN.

§Panics

Panics if:

  • from.len() != self.dim(): Vector to be compressed must have the same dimensionality as the quantizer.
  • into.len() != self.dim(): Compressed vector must have the same dimensionality as the quantizer.
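
The split between a recoverable error (NaN input) and a panic (dimension mismatch) described above can be sketched as follows. `compress_sketch` and `InputContainsNaNSketch` are hypothetical, and the rounding step is a placeholder rather than the crate's encoding.

```rust
/// Hypothetical stand-in for the NaN error type.
#[derive(Debug, PartialEq)]
struct InputContainsNaNSketch;

/// Panics on a dimension mismatch and returns an error on NaN input.
fn compress_sketch(
    dim: usize,
    from: &[f32],
    into: &mut [u8],
) -> Result<(), InputContainsNaNSketch> {
    // Dimension mismatches are programming errors, so they panic.
    assert_eq!(from.len(), dim, "input vector has the wrong dimensionality");
    assert_eq!(into.len(), dim, "destination has the wrong dimensionality");
    // NaN is a data problem, so it is reported as a recoverable error.
    if from.iter().any(|v| v.is_nan()) {
        return Err(InputContainsNaNSketch);
    }
    // Placeholder encoding: round and clamp each value to the `u8` range.
    for (dst, src) in into.iter_mut().zip(from) {
        *dst = src.round().clamp(0.0, 255.0) as u8;
    }
    Ok(())
}
```

Callers that cannot guarantee clean input should handle the `Err` case, while the length preconditions are best checked once, when the buffers are allocated.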
Source§

type Error = InputContainsNaN

Errors that may occur during compression.
Source§

type Output = ()

An output type resulting from compression.
Source§

impl<const NBITS: usize, T, Perm> CompressInto<&[T], VectorBase<NBITS, Unsigned, MutSlicePtr<'_, u8>, Mut<'_, Compensation>, Perm>> for ScalarQuantizer
where T: Copy + Into<f32>, Unsigned: Representation<NBITS>, Perm: PermutationStrategy<NBITS>,

Source§

fn compress_into( &self, from: &[T], into: MutCompensatedVectorRef<'_, NBITS, Perm>, ) -> Result<(), Self::Error>

Compress the input vector `from` into the compensated vector `into`.

This method computes and stores the compensation coefficient required for fast inner product computations.
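
To see why a per-vector coefficient can speed up inner products, consider a generic scheme in which a vector is reconstructed as `scale * code + shift`. Expanding the inner product of two reconstructed vectors gives `scale² · ⟨a, b⟩ + scale · ⟨a, shift⟩ + scale · ⟨b, shift⟩ + ‖shift‖²`, where the two middle terms each depend on a single vector and can be precomputed at compression time. The sketch below assumes this formulation; the crate's actual definition of `Compensation` may differ, and all function names are illustrative.

```rust
/// Dot product of two code vectors, accumulated in `f32`.
fn code_dot(a: &[u8], b: &[u8]) -> f32 {
    a.iter().zip(b).map(|(x, y)| f32::from(*x) * f32::from(*y)).sum()
}

/// Hypothetical per-vector compensation term: `scale * <code, shift>`.
fn compensation(code: &[u8], shift: &[f32], scale: f32) -> f32 {
    scale * code.iter().zip(shift).map(|(c, s)| f32::from(*c) * s).sum::<f32>()
}

/// Inner product of the reconstructed vectors, computed from the codes,
/// the two stored compensation terms, and the squared norm of the shift.
fn compensated_ip(
    a: &[u8],
    comp_a: f32,
    b: &[u8],
    comp_b: f32,
    scale: f32,
    shift_square_norm: f32,
) -> f32 {
    scale * scale * code_dot(a, b) + comp_a + comp_b + shift_square_norm
}
```

Under these assumptions, only the integer-code dot product has to be evaluated per pair at query time; the remaining terms are scalars that are stored with each vector or with the quantizer.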

§Error

Returns an error if the input contains NaN.

§Panics

Panics if:

  • from.len() != self.dim(): Vector to be compressed must have the same dimensionality as the quantizer.
  • into.len() != self.dim(): Compressed vector must have the same dimensionality as the quantizer.
Source§

type Error = InputContainsNaN

Errors that may occur during compression.
Source§

type Output = ()

An output type resulting from compression.
Source§

impl Debug for ScalarQuantizer

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> ByRef<T> for T

Source§

fn by_ref(&self) -> &T

Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Generator<T> for T
where T: Clone,

Source§

fn generate(&mut self) -> T

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes an object with the given initializer.
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> AsyncFriendly for T
where T: Send + Sync + 'static,