// diskann_quantization/lib.rs
/*
 * Copyright (c) Microsoft Corporation.
 * Licensed under the MIT license.
 */

#![cfg_attr(docsrs, feature(doc_cfg))]

//! A collection of utilities and algorithms for training quantizers, compressing data,
//! and performing distance computations on compressed data.
//!
//! # Quantizers
//!
//! The currently implemented quantizers are listed below. Refer to the documentation
//! within each module for more information.
//!
//! Note that the capabilities of each quantizer are varied and some aspects are still in
//! progress. If you have any questions or need something implemented, please reach out!
//!
//! * [Scalar](crate::scalar): Compressing dimensions to 8 bits and lower.
//! * [Binary](crate::binary): Lightweight 1-bit compression.
//! * [Product](crate::product): Compressing multiple dimensions into a single code.
//! **NOTE**: Support for product quantization is currently limited as the implementation
//! is spread between this crate and the original implementation in DiskANN.
//! * [Spherical](crate::spherical): A compression technique similar to scalar quantization,
//! but with the characteristic that vectors are normalized, transformed, and then
//! projected onto a unit sphere.
//!
//! This class of algorithms is based on the [RabitQ](https://arxiv.org/abs/2405.12497)
//! paper and associated work.
//! * [MinMax](crate::minmax): A compression similar to scalar quantization, but where
//! the coefficients to quantize are computed on a per-vector basis, allowing it to be
//! used in a streaming setting (i.e., without any need for training).
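//!
//! As an illustration of the per-vector approach, here is a minimal sketch of min/max
//! quantization to 8 bits. The code is illustrative only and is not this crate's API:
//!
//! ```rust
//! /// Quantize a single vector using coefficients derived from the vector
//! /// itself, so no training pass is needed (hypothetical helper).
//! fn minmax_quantize(v: &[f32]) -> (Vec<u8>, f32, f32) {
//!     let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
//!     let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
//!     let scale = if max > min { 255.0 / (max - min) } else { 0.0 };
//!     let codes = v.iter().map(|&x| ((x - min) * scale).round() as u8).collect();
//!     (codes, min, max)
//! }
//!
//! let (codes, min, max) = minmax_quantize(&[0.0, 0.5, 1.0]);
//! assert_eq!(codes, vec![0, 128, 255]);
//! assert_eq!((min, max), (0.0, 1.0));
//! ```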
//!
//! # [BitSlice](crate::bits::BitSliceBase)
//!
//! A primary abstraction introduced is a BitSlice. Think of it as a Rust slice, but for
//! packing together bits. It currently supports unsigned integers of widths 1 to 8.
//! The borrowed version of the slice consists of a pointer (with a lifetime) and a length
//! and is therefore just 16 bytes in size and amenable to the niche optimization.
//! It also comes in a boxed variant for convenience.
//!
//! Some trickery is involved in the constructor to ensure that the pointer can only ever be
//! obtained in a safe way.
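//!
//! A hypothetical sketch of the borrowed layout described above (not the crate's actual
//! definition): a non-null pointer plus a bit-length occupies 16 bytes on a 64-bit
//! target, and the non-null niche lets `Option` of it stay the same size.
//!
//! ```rust
//! use core::marker::PhantomData;
//! use core::ptr::NonNull;
//!
//! /// Illustrative stand-in for a borrowed bit-slice.
//! struct BitSliceSketch<'a> {
//!     ptr: NonNull<u8>,
//!     len_bits: usize,
//!     _lifetime: PhantomData<&'a [u8]>,
//! }
//!
//! // Pointer + length: 16 bytes; the NonNull niche makes `Option` free.
//! assert_eq!(core::mem::size_of::<BitSliceSketch<'static>>(), 16);
//! assert_eq!(core::mem::size_of::<Option<BitSliceSketch<'static>>>(), 16);
//! ```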
//!
//! ## Distances over BitSlices
//!
//! Distances over bit-slices are a key component for building efficient distances for
//! [binary](crate::binary::BinaryQuantizer) and [scalar](crate::scalar::ScalarQuantizer)
//! quantized vectors.
//!
//! Performance characteristics of the implemented functions are listed here:
//!
//! * [`crate::distances::SquaredL2`]/[`crate::distances::InnerProduct`]: Fast SIMD
//! implementations available for 4 and 8-bit vectors. Fallback implementations are used
//! for 2, 3, 5, 6, and 7-bit vectors. Scalar pop-count used for 1-bit distances.
//!
//! * [`crate::distances::Hamming`]: Scalar pop-count used.
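//!
//! For reference, the scalar pop-count approach for Hamming distances amounts to the
//! following simplified, free-standing sketch:
//!
//! ```rust
//! /// Hamming distance over packed 1-bit codes: XOR the bytes, then count set bits.
//! fn hamming(a: &[u8], b: &[u8]) -> u32 {
//!     a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
//! }
//!
//! assert_eq!(hamming(&[0b1010_1010], &[0b0101_0101]), 8);
//! assert_eq!(hamming(&[0xFF, 0x00], &[0xFF, 0x0F]), 4);
//! ```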
//!
//! # Adding New Quantizers
//!
//! Adding a new quantizer is fairly straightforward.
//!
//! 1. Implement training using a similar style to existing quantizers by:
//!
//! A. Defining a quantizer that will hold any central quantization state.
//!
//! B. Implementing a training module with training parameters and an associated `train`
//! method (available either as a trait or inherent method, whichever makes more
//! sense).
//!
//! 2. Determine the representation for compressed vectors. This can be done using
//! combinations of [`crate::bits::BitSlice`]s, utilities in the `ownership` module, and
//! more.
//!
//! Ensure that this representation implements [`diskann_utils::Reborrow`] and
//! [`diskann_utils::ReborrowMut`] so that working with generalized reference types is sane.
//!
//! 3. Implement distance functors (e.g. [`crate::scalar::CompensatedIP`]) to enable distance
//! computations on compressed vectors and between compressed and full-precision vectors.
//!
//! Additionally, implement [`crate::AsFunctor`] for the central quantizer for each
//! distance functor.
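//!
//! The functor pattern in step 3 can be sketched as follows. All names here are
//! illustrative stand-ins, not this crate's actual traits or types:
//!
//! ```rust
//! /// Illustrative analogue of a `DistanceFunction`-style trait.
//! trait Distance<A: ?Sized, B: ?Sized> {
//!     type Output;
//!     fn evaluate(&self, a: &A, b: &B) -> Self::Output;
//! }
//!
//! /// A lightweight functor carrying per-quantizer state (here, a scale),
//! /// loosely in the spirit of `CompensatedIP`.
//! struct ScaledIP {
//!     scale: f32,
//! }
//!
//! // Distance between a full-precision query and a quantized vector.
//! impl Distance<[f32], [u8]> for ScaledIP {
//!     type Output = f32;
//!     fn evaluate(&self, full: &[f32], quantized: &[u8]) -> f32 {
//!         full.iter().zip(quantized).map(|(x, &q)| x * (q as f32) * self.scale).sum()
//!     }
//! }
//!
//! let f = ScaledIP { scale: 0.5 };
//! assert_eq!(f.evaluate(&[1.0, 2.0], &[2, 4]), 5.0);
//! ```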
//!
//! # Design Considerations
//!
//! In this section, we will cover some of the design philosophy regarding quantizers,
//! particularly how quantizers interact with vector search and data providers.
//!
//! ## No Inherent Methods for Distances in Quantizers
//!
//! Inherent methods for distances on quantizer structs, such as
//!
//! * `fn distance_l2(&self, full_precision: &[f32], quantized: &[u8]) -> f32`
//! * `fn distance_qq_l2(&self, quantized_0: &[u8], quantized_1: &[u8]) -> f32`
//!
//! do not work well for a number of reasons:
//!
//! 1. Inherent methods cannot be called through trait bounds on generic arguments.
//!
//! 2. Fixed argument/return types limit flexibility. One example is marking the difference
//! between [`diskann_vector::MathematicalValue`] and [`diskann_vector::SimilarityScore`].
//!
//! The solution to accepting multiple argument/return types is exactly the approach
//! taken in this crate, i.e. using auxiliary [`diskann_vector::DistanceFunction`]
//! implementations on lightweight types (e.g. [`crate::scalar::CompensatedIP`]).
//!
//! 3. This approach does not work when hefty query preprocessing can make distance
//! computations faster, such as with table-based PQ implementations.
//!
//! Instead, a [`diskann_vector::PreprocessedDistanceFunction`] can be used.
//!
//! 4. It does not extend well to performing batches of distance computations when those
//! can be computed more efficiently.
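//!
//! To make point 3 concrete, a preprocessed distance amounts to something like the
//! sketch below (a single PQ subspace for brevity; all names are illustrative, not
//! this crate's API):
//!
//! ```rust
//! /// Build a per-query lookup table: one partial squared-L2 distance per
//! /// centroid (hypothetical helper).
//! fn build_table(query: &[f32], centroids: &[Vec<f32>]) -> Vec<f32> {
//!     centroids
//!         .iter()
//!         .map(|c| query.iter().zip(c).map(|(q, x)| (q - x) * (q - x)).sum::<f32>())
//!         .collect()
//! }
//!
//! /// After preprocessing, each compressed vector's distance is just table
//! /// lookups and additions.
//! fn distance(table: &[f32], codes: &[u8]) -> f32 {
//!     codes.iter().map(|&c| table[c as usize]).sum()
//! }
//!
//! let centroids = vec![vec![0.0, 0.0], vec![1.0, 1.0]];
//! let table = build_table(&[1.0, 1.0], &centroids);
//! assert_eq!(distance(&table, &[0]), 2.0);
//! assert_eq!(distance(&table, &[1]), 0.0);
//! ```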
//!
//! ## No Common Trait for Quantizer-based Distances
//!
//! Creating a trait for quantizer-based distances would solve point 1 above about
//! usability in generic contexts, but the other points still remain. I argue that a
//! functor-based interface is generally more flexible and useful.
//!
//! Additionally, not all quantizers implement the same distance functions. For example,
//! binary quantization only uses [`crate::distances::Hamming`].
//!
//! ## No Common Trait for Training
//!
//! The requirements for training between [scalar quantization](crate::scalar) and
//! [product quantization](crate::product) are sufficiently different that using a common
//! training interface feels overly restrictive.
//!
//! ## Thoughts on Canonical Layouts for Compressed Vector Representations
//!
//! As much as possible, we try to avoid using raw `&[u8]` as the representation for
//! compressed vectors in favor of richer types such as [`crate::bits::BitSlice`] and
//! [`crate::scalar::CompensatedVectorRef`] (though product quantization still uses this
//! for historical reasons). This is because compressed representations may need to contain
//! more than just a raw encoding (e.g. the compensation coefficient for scalar quantization)
//! and using a richer type makes this less error prone.
//!
//! But this opens a can of worms regarding storage within the data provider. Let's take, for
//! example, [`crate::scalar::CompensatedVector`], which consists of a slice for scalar
//! quantization codes and a compensation coefficient. The provider can choose to store
//! these two components in separate data structures or fuse them together into a single
//! block of memory (note that [`crate::scalar::MutCompensatedVectorRef`] is defined so that
//! both the codes and compensation coefficient can be updated in place). The provider may
//! **also** wish to perform cache-line padding.
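//!
//! For illustration, a fused layout might append the coefficient's bytes after the
//! codes within a fixed stride, while the split alternative is simply two parallel
//! containers. This is a hypothetical sketch, not a layout this crate imposes:
//!
//! ```rust
//! /// Codes and coefficient fused into one contiguous allocation per vector.
//! struct FusedStorage {
//!     /// Bytes per vector: the codes followed by a 4-byte coefficient.
//!     stride: usize,
//!     data: Vec<u8>,
//! }
//!
//! impl FusedStorage {
//!     fn coefficient(&self, i: usize) -> f32 {
//!         let end = (i + 1) * self.stride;
//!         f32::from_le_bytes(self.data[end - 4..end].try_into().unwrap())
//!     }
//! }
//!
//! // Two vectors of four 1-byte codes each, coefficient appended (stride = 8).
//! let mut data = vec![1u8, 2, 3, 4];
//! data.extend_from_slice(&0.5f32.to_le_bytes());
//! data.extend_from_slice(&[5, 6, 7, 8]);
//! data.extend_from_slice(&1.5f32.to_le_bytes());
//! let fused = FusedStorage { stride: 8, data };
//! assert_eq!(fused.coefficient(0), 0.5);
//! assert_eq!(fused.coefficient(1), 1.5);
//! ```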
//!
//! Because of the myriad ways data can be stored and retrieved, the utilities in this
//! crate instead focus on providing tools that express the compressed state and allow
//! this design space to be explored at the data provider level.
//!
//! Eventually, however, we may wish to impose canonical layouts for these representations
//! within the `quantization` crate if that turns out to be necessary.

mod ownership;
mod utils;

// misc
pub mod alloc;
pub mod bits;
pub mod cancel;
pub mod distances;
pub mod error;
pub mod meta;
pub mod num;
pub mod random;
mod traits;
pub mod views;

pub use traits::{AsFunctor, CompressInto, CompressIntoWith};

// serialization
#[cfg(feature = "flatbuffers")]
#[allow(mismatched_lifetime_syntaxes)] // The generated code isn't clippy-clean.
pub(crate) mod flatbuffers;

// common algorithms
pub mod algorithms;

// quantization
pub mod binary;
pub mod minmax;
pub mod multi_vector;
pub mod product;
pub mod scalar;
pub mod spherical;

/// Selector for the parallelization strategy used by some algorithms.
#[derive(Debug, Copy, Clone, Default, PartialEq, Eq)]
#[non_exhaustive]
pub enum Parallelism {
    /// Use single-threaded execution.
    #[default]
    Sequential,

    /// Use Rayon-based parallelism in the dynamically scoped Rayon thread pool.
    #[cfg(feature = "rayon")]
    #[cfg_attr(docsrs, doc(cfg(feature = "rayon")))]
    Rayon,
}

// The code-generation module needs to be public to prevent symbols from getting culled.
#[doc(hidden)]
#[cfg(feature = "codegen")]
pub mod __codegen;

// Miri can't handle `trybuild`.
#[cfg(all(test, not(miri)))]
mod tests {
    #[test]
    fn compile_tests() {
        let t = trybuild::TestCases::new();
        // Begin with a `pass` test to force full compilation of all the test binaries.
        //
        // This ensures that post-monomorphization errors are tested.
        t.pass("tests/compile-fail/bootstrap/bootstrap.rs");
        t.compile_fail("tests/compile-fail/*.rs");
        t.compile_fail("tests/compile-fail/error/*.rs");
        t.compile_fail("tests/compile-fail/multi/*.rs");
    }
}

#[cfg(test)]
mod test_util;