Expand description
A collection of utilities and algorithms for training quantizers, compressing data, and performing distance computations on compressed data.
§Quantizers
The currently implemented quantizers are listed below. Refer to the documentation within each module for more information.
Note that the capabilities of each quantizer are varied and some aspects are still in progress. If you have any questions or need something implemented, please reach out!
- Scalar: Compressing dimensions to 8 bits and lower.
- Binary: Lightweight 1-bit compression.
- Product: Compressing multiple dimensions into a single code. NOTE: Support for product quantization is currently limited as the implementation is spread between this crate and the original implementation in DiskANN.
- Spherical: A compression technique similar to scalar quantization, but with the characteristic that vectors are normalized, transformed, and then projected onto a unit sphere. This class of algorithms is based on the RaBitQ paper and associated work.
- MinMax: A compression similar to scalar quantization, but where the coefficients to quantize are computed on a per-vector basis, allowing this to be used in a streaming setting (i.e. without any need for training).
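To make the per-vector idea concrete, here is a minimal sketch of min-max quantization in plain Rust. The function names and the `(codes, min, max)` return shape are invented for illustration and are not this crate's API; the point is that the quantization coefficients come from each vector alone, so no training pass is required.

```rust
// Illustrative sketch only (names are not the crate's API): per-vector
// min-max quantization to u8, suitable for streaming since no training
// state is needed.
fn minmax_quantize(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    // Guard against constant vectors, where max == min.
    let scale = if max > min { 255.0 / (max - min) } else { 0.0 };
    let codes = v.iter().map(|&x| ((x - min) * scale).round() as u8).collect();
    (codes, min, max)
}

fn minmax_dequantize(codes: &[u8], min: f32, max: f32) -> Vec<f32> {
    let step = if max > min { (max - min) / 255.0 } else { 0.0 };
    codes.iter().map(|&c| min + c as f32 * step).collect()
}
```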
§BitSlice
A primary abstraction introduced is a BitSlice. Think of it as a Rust slice, but for packing together bits. It currently supports unsigned integers of widths 1 to 8. The borrowed version of the slice consists of a pointer (with a lifetime) and a length and is therefore just 16 bytes in size and amenable to the niche optimization. It also comes in a boxed variant for convenience.
Some trickery is involved in the constructor to ensure that the pointer can only ever be obtained in a safe way.
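The core idea behind a BitSlice can be sketched with a minimal fixed-width packing scheme. The functions below are invented for illustration (the real BitSlice supports widths 1 through 8 and carries a lifetime-checked pointer); they show 4-bit values packed two per byte.

```rust
// Illustrative only: pack 4-bit unsigned values two per byte, low nibble
// first. BitSlice generalizes this to widths 1..=8 with a safe pointer API.
fn pack4(values: &[u8]) -> Vec<u8> {
    values
        .chunks(2)
        .map(|c| {
            let lo = c[0] & 0x0F;
            let hi = if c.len() > 1 { (c[1] & 0x0F) << 4 } else { 0 };
            lo | hi
        })
        .collect()
}

fn unpack4(packed: &[u8], len: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(len);
    for &b in packed {
        out.push(b & 0x0F);
        if out.len() < len {
            out.push(b >> 4);
        }
    }
    out.truncate(len); // drop padding from a trailing half-filled byte
    out
}
```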
§Distances over BitSlices
Distances over bit-slices are a key component for building efficient distances for binary and scalar quantized vectors.
Performance characteristics of the implemented functions are listed here:
- crate::distances::SquaredL2 / crate::distances::InnerProduct: Fast SIMD implementations are available for 4- and 8-bit vectors. Fallback implementations are used for 2-, 3-, 5-, 6-, and 7-bit vectors. A scalar pop-count is used for 1-bit distances.
- crate::distances::Hamming: Scalar pop-count used.
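The scalar pop-count approach for 1-bit vectors amounts to XOR-ing the packed bytes and counting the differing bits. A minimal sketch (the function name is illustrative, not the crate's API):

```rust
// Hamming distance over packed 1-bit vectors: XOR each byte pair, then
// count set bits with the hardware popcount (u8::count_ones).
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "packed vectors must have equal length");
    a.iter().zip(b).map(|(&x, &y)| (x ^ y).count_ones()).sum()
}
```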
§Adding New Quantizers
Adding new quantizers is fairly straightforward.
- Implement training in a similar style to the existing quantizers:
  A. Define a quantizer struct that will hold any central quantization state.
  B. Implement a training module with training parameters and an associated train method (available either as a trait or inherent method, whichever makes more sense).
- Determine the representation for compressed vectors. This can be done using combinations of crate::bits::BitSlices, utilities in the ownership module, and more. Ensure that this representation implements diskann_utils::Reborrow and diskann_utils::ReborrowMut so that working with generalized reference types is sane.
- Implement distance functors (e.g. crate::scalar::CompensatedIP) to enable distance computations on compressed vectors and between compressed and full-precision vectors. Additionally, implement crate::AsFunctor for the central quantizer for each distance functor.
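The steps above can be sketched as a toy end-to-end example. All names here are invented (this is not the crate's API): a trained quantizer holding central state, a compression method producing the compressed representation, and a light-weight distance functor operating on it.

```rust
// Hypothetical skeleton of a new quantizer, following the steps above.
// (A) Central quantization state produced by training.
struct ToyScalarQuantizer {
    scale: f32,
}

// (B) Training parameters (empty for this toy example).
struct ToyTrainParams;

impl ToyScalarQuantizer {
    // Train by finding the largest magnitude and mapping it to i8 range.
    fn train(_params: ToyTrainParams, data: &[Vec<f32>]) -> Self {
        let max = data.iter().flatten().fold(0.0f32, |m, &x| m.max(x.abs()));
        Self { scale: if max > 0.0 { 127.0 / max } else { 0.0 } }
    }

    // Produce the compressed representation (a simple i8 code vector here).
    fn compress(&self, v: &[f32]) -> Vec<i8> {
        v.iter().map(|&x| (x * self.scale).round() as i8).collect()
    }
}

// Distance functor: a light-weight type carrying only what the distance
// computation needs, evaluating full-precision vs. compressed vectors.
struct ToySquaredL2 {
    inv_scale: f32,
}

impl ToySquaredL2 {
    fn evaluate(&self, query: &[f32], codes: &[i8]) -> f32 {
        query
            .iter()
            .zip(codes)
            .map(|(&x, &c)| {
                let d = x - c as f32 * self.inv_scale;
                d * d
            })
            .sum()
    }
}
```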
§Design Considerations
In this section, we will cover some of the design philosophy regarding quantizers, particularly how this interacts with vector search and data providers.
§No Inherent Methods for Distances in Quantizers
Inherent methods for distances on quantizer structs, such as

fn distance_l2(&self, full_precision: &[f32], quantized: &[u8]) -> f32
fn distance_qq_l2(&self, quantized_0: &[u8], quantized_1: &[u8]) -> f32
do not work well for a number of reasons:
- Inherent methods are unavailable to call via trait restrictions on generic arguments.
- Fixed argument/return types limit flexibility. One example is marking the difference between diskann_vector::MathematicalValue and diskann_vector::SimilarityScore. The solution to accepting multiple argument/return types is exactly the approach taken in this crate, i.e. using auxiliary diskann_vector::DistanceFunction implementations on light-weight types (e.g. crate::scalar::CompensatedIP).
- This approach does not work when hefty query preprocessing can make distance computations faster, such as with table-based PQ implementations. Instead, a diskann_vector::PreprocessedDistanceFunction can be used.
- Inherent methods do not extend well to performing batches of distance computations when those can be computed more efficiently.
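The first point can be made concrete with a small sketch. The trait and type names below are invented stand-ins (not diskann_vector's actual definitions): with a functor implementing a shared distance trait, generic code can accept any distance over any representation, which inherent methods cannot offer.

```rust
// Invented stand-in for a DistanceFunction-style trait: generic over the
// query and vector representations, so one bound covers full-precision,
// compressed, and mixed-type distances alike.
trait DistanceFn<Q: ?Sized, V: ?Sized> {
    fn evaluate(&self, query: &Q, vector: &V) -> f32;
}

// A zero-sized functor for squared L2 over full-precision vectors.
struct L2;

impl DistanceFn<[f32], [f32]> for L2 {
    fn evaluate(&self, q: &[f32], v: &[f32]) -> f32 {
        q.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum()
    }
}

// Generic code only needs the trait bound; an inherent method on a concrete
// quantizer struct could not be called here.
fn nearest<D: DistanceFn<[f32], [f32]>>(d: &D, q: &[f32], pts: &[Vec<f32>]) -> usize {
    (0..pts.len())
        .min_by(|&i, &j| d.evaluate(q, &pts[i]).total_cmp(&d.evaluate(q, &pts[j])))
        .unwrap()
}
```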
§No Common Trait for Quantizer-based Distances
Creating a trait for quantizer-based distances would solve point number 1 above about usability in generic contexts, but the other points still remain. I argue that a functor-based interface is generally more flexible and useful.
Additionally, not all quantizers implement the same distance functions. For example,
binary quantization only uses crate::distances::Hamming.
§No Common Trait for Training
The requirements for training between scalar quantization and product quantization are sufficiently different that using a common training interface feels overly restrictive.
§Thoughts on Canonical Layouts for Compressed Vector Representations
As much as possible, we try to avoid using raw &[u8] as the representation for
compressed vectors in favor of richer types such as crate::bits::BitSlice and
crate::scalar::CompensatedVectorRef (though product-quantization still uses this
for historical reasons). This is because compressed representations may need to contain
more than just a raw encoding (e.g. the compensation coefficient for scalar quantization)
and using a richer type makes this less error prone.
But this opens a can of worms regarding storage within the data provider. Let's take, for example, crate::scalar::CompensatedVector, which consists of a slice of scalar quantization codes and a compensation coefficient. The provider can choose to store these two components in separate data structures or fuse them together into a single block of memory (note that crate::scalar::MutCompensatedVectorRef is defined so that both the codes and compensation coefficient can be updated in-place). The provider may also wish to perform cache-line padding.
Because of the myriad of ways data can be stored and retrieved, the utilities in this crate instead focus on providing tools to express the compressed state that allow for this design space exploration on the data provider level.
Eventually, however, we may wish to impose canonical layouts for these representations
within the quantization crate if that turns out to be necessary.
Modules§
- algorithms
- alloc
- binary
- Binary quantization compression and distance comparisons.
- bits
- Utilities for working with packed 1-bit to 8-bit integers (inclusive).
- cancel
- Utilities to support cancelation of long-running operations.
- distances
- Zero-sized types representing distance functions.
- error
- Formatting utilities for error chains.
- meta
- minmax
- MinMax Quantization
- multi_vector
- Multi-vector matrix types and distance functions.
- num
- Number types with limited dynamic range.
- product
- Product quantization training and compression.
- random
- scalar
- Scalar quantization training, compression data, and distance comparisons.
- spherical
- views
Macros§
- check_lengths
- Check that x.len() == y.len(), returning Err(diskann_vector::distances::UnequalLengths) if the lengths are different.
- poly
Enums§
- Parallelism
- Selector for the parallelization strategy used by some algorithms.
Traits§
- AsFunctor
- Create a distance function (i.e. diskann_vector::DistanceFunction) from a quantizer.
- CompressInto
- A trait implemented by a quantizer, indicating that it is capable of quantizing the contents of From into To.
- CompressIntoWith
- A trait implemented by a quantizer, indicating that it is capable of quantizing the contents of From into To with additional help from an argument of type A.