§Low-level functions
The methods here are meant to be primitives used by the distance functions for the various scalar-quantized-like quantizers.
As such, they typically return integer distance results since they largely operate over raw bit-slices.
§Micro-architecture Mapping
There are two interfaces for interacting with the distance primitives:
- diskann_wide::arch::Target2: A micro-architecture-aware interface where the target micro-architecture is provided as an explicit argument. This can be used in conjunction with diskann_wide::Architecture::run2 to apply the necessary target features, opting into code generation for a newer architecture even when the whole binary is compiled for an older one. This interface also composes with micro-architecture dispatching done higher in the call stack, and so should be preferred when incorporating these primitives into quantizer distance computations.
- diskann_vector::PureDistanceFunction: If micro-architecture awareness is not needed, this provides a simple interface targeting diskann_wide::ARCH (the current compilation architecture). This interface always yields a binary compatible with the compilation architecture target, but does not enable faster code paths when compiling for older architectures.
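The idea of enabling newer target features for one kernel while the rest of the binary targets an older baseline can be sketched with the standard library alone. The snippet below is only an illustration of the general technique: it does not use the diskann_wide API, and the function names (`dot_1bit_portable`, `dot_1bit_avx2`, `dot_1bit`) are hypothetical.

```rust
/// Portable kernel: counts positions where both packed bit-vectors hold a 1.
/// Contains no target-dependent code; the compiler chooses the instructions.
#[inline(always)]
fn dot_1bit_portable(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum()
}

/// Same body, but compiled with AVX2 and POPCNT enabled for this function
/// only, so the rest of the binary can still target a baseline x86-64 CPU.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2,popcnt")]
unsafe fn dot_1bit_avx2(a: &[u64], b: &[u64]) -> u32 {
    dot_1bit_portable(a, b)
}

/// Runtime dispatch (hypothetical helper): pick the best available kernel.
pub fn dot_1bit(a: &[u64], b: &[u64]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("popcnt") {
            // SAFETY: the required CPU features were detected just above.
            return unsafe { dot_1bit_avx2(a, b) };
        }
    }
    dot_1bit_portable(a, b)
}
```

Performing the feature check once, higher in the call stack, and passing the result down as an explicit argument avoids repeating the detection inside every kernel call; this is the composability benefit described for the first interface.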
The following tables summarize the implementation status of kernels. All kernels have diskann_wide::arch::Scalar implementation fallbacks.
Implementation Kind:
- “Fallback”: A fallback implementation using scalar indexing.
- “Optimized”: A better implementation than “Fallback” that does not contain target-dependent code, instead relying on compiler optimizations. Micro-architecture dispatch is still relevant as it allows the compiler to generate better code for newer machines.
- “Yes”: An architecture-specific SIMD implementation exists.
- “No”: An architecture-specific implementation does not exist; the next most-specific implementation is used. For example, if an x86-64-v3 implementation does not exist, then the “scalar” implementation will be used instead.
Type Aliases
- USlice<N>: BitSlice<N, Unsigned, Dense>
- TSlice<N>: BitSlice<N, Unsigned, BitTranspose>
- BSlice: BitSlice<1, Binary, Dense>
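To make the notion of a densely packed N-bit slice concrete, here is a minimal sketch of packing and reading N-bit unsigned values in a byte buffer. The layout is an assumption made for illustration (value `i` occupies bits `i*NBITS .. (i+1)*NBITS`, least-significant bit first) and may differ from the actual BitSlice layout; `pack_dense` and `get_dense` are hypothetical helpers, not part of the crate.

```rust
/// Pack `NBITS`-wide unsigned values contiguously into bytes.
/// Assumed layout: value `i` starts at bit `i * NBITS`, LSB first.
fn pack_dense<const NBITS: usize>(values: &[u8]) -> Vec<u8> {
    assert!((1..=8).contains(&NBITS));
    let mut out = vec![0u8; (values.len() * NBITS + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        assert!((v as u16) < (1u16 << NBITS), "value does not fit in NBITS");
        let bit = i * NBITS;
        // Shift into a 16-bit window so a value may straddle two bytes.
        let word = (v as u16) << (bit % 8);
        out[bit / 8] |= word as u8;
        if let Some(next) = out.get_mut(bit / 8 + 1) {
            *next |= (word >> 8) as u8;
        }
    }
    out
}

/// Read the `i`-th `NBITS`-wide value back out of the packed buffer.
fn get_dense<const NBITS: usize>(bytes: &[u8], i: usize) -> u8 {
    assert!((1..=8).contains(&NBITS));
    let bit = i * NBITS;
    let lo = bytes[bit / 8] as u16;
    // The second byte is only needed when the value crosses a byte boundary.
    let hi = bytes.get(bit / 8 + 1).copied().unwrap_or(0) as u16;
    let word = lo | (hi << 8);
    ((word >> (bit % 8)) & ((1u16 << NBITS) - 1)) as u8
}
```

Under this assumed layout, six 3-bit values occupy 18 bits and therefore three bytes, which is why kernels over such slices work on raw bits rather than on one byte per element.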
§Inner Product
| LHS | RHS | Result | Scalar | x86-64-v3 | x86-64-v4 | Neon |
|---|---|---|---|---|---|---|
| USlice<1> | USlice<1> | MV<u32> | Optimized | Optimized | Uses V3 | Optimized |
| USlice<2> | USlice<2> | MV<u32> | Fallback | Yes | Yes | Fallback |
| USlice<3> | USlice<3> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<4> | USlice<4> | MV<u32> | Fallback | Yes | Uses V3 | Fallback |
| USlice<5> | USlice<5> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<6> | USlice<6> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<7> | USlice<7> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<8> | USlice<8> | MV<u32> | Yes | Yes | Yes | Fallback |
| | | | | | | |
| TSlice<4> | USlice<1> | MV<u32> | Optimized | Optimized | Optimized | Optimized |
| | | | | | | |
| &[f32] | USlice<1> | MV<f32> | Fallback | Yes | Uses V3 | Fallback |
| &[f32] | USlice<2> | MV<f32> | Fallback | Yes | Uses V3 | Fallback |
| &[f32] | USlice<3> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<4> | MV<f32> | Fallback | Yes | Uses V3 | Fallback |
| &[f32] | USlice<5> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<6> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<7> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<8> | MV<f32> | Fallback | No | Uses V3 | Fallback |
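As a rough illustration of what the three groups of inner-product kernels compute, the sketch below gives plain-Rust reference versions over `u64` words. These are not the crate's implementations; the function names are hypothetical, and the bit-plane layout used for the transposed case (plane `j` holds bit `j` of every value) is an assumption about what a bit-transposed encoding looks like.

```rust
/// 1-bit x 1-bit: the inner product is the number of positions where both
/// vectors hold a 1, i.e. the popcount of the bitwise AND.
fn ip_1bit(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum()
}

/// 4-bit (stored as four bit-planes) x 1-bit.
/// Assumed layout: `planes[j]` holds bit `j` of each 4-bit value, so the
/// result is the sum over planes of popcount(plane & b) weighted by 2^j.
fn ip_4bit_planes_1bit(planes: &[Vec<u64>; 4], b: &[u64]) -> u32 {
    planes
        .iter()
        .enumerate()
        .map(|(j, plane)| {
            assert_eq!(plane.len(), b.len());
            ip_1bit(plane, b) << j
        })
        .sum()
}

/// f32 x 1-bit: sum the query entries whose corresponding bit is set.
fn ip_f32_1bit(q: &[f32], bits: &[u64]) -> f32 {
    assert!(q.len() <= bits.len() * 64);
    let mut sum = 0.0f32;
    for (i, &v) in q.iter().enumerate() {
        if (bits[i / 64] >> (i % 64)) & 1 == 1 {
            sum += v;
        }
    }
    sum
}
```

For example, the 4-bit values `[3, 10, 15, 0]` correspond to the planes `[0b0101, 0b0111, 0b0100, 0b0110]`; against the 1-bit vector `0b1011` (dimensions 0, 1 and 3 set), the plane-wise computation gives 3 + 10 + 0 = 13. The integer variants return `u32` because they operate purely on raw bits, while the `&[f32]` variant must return a float.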
§Squared L2
| LHS | RHS | Result | Scalar | x86-64-v3 | x86-64-v4 | Neon |
|---|---|---|---|---|---|---|
| USlice<1> | USlice<1> | MV<u32> | Optimized | Optimized | Uses V3 | Optimized |
| USlice<2> | USlice<2> | MV<u32> | Fallback | Yes | Uses V3 | Fallback |
| USlice<3> | USlice<3> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<4> | USlice<4> | MV<u32> | Fallback | Yes | Uses V3 | Fallback |
| USlice<5> | USlice<5> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<6> | USlice<6> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<7> | USlice<7> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<8> | USlice<8> | MV<u32> | Yes | Yes | Yes | Fallback |
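A scalar reference for squared L2 over 4-bit values might look like the following sketch. It assumes two values per byte with the first value in the low nibble, which may not match the crate's packing, and `l2_sq_4bit` is a hypothetical name rather than a function from this module.

```rust
/// Squared L2 distance between two vectors of 4-bit unsigned values.
/// Assumed packing: two values per byte, low nibble first.
fn l2_sq_4bit(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            // Differences can be negative, so widen to a signed type first.
            let lo = (x & 0x0F) as i32 - (y & 0x0F) as i32;
            let hi = (x >> 4) as i32 - (y >> 4) as i32;
            (lo * lo + hi * hi) as u32
        })
        .sum()
}
```

Each squared difference is at most 15² = 225, so a `u32` accumulator is ample for realistic vector lengths, consistent with the integer result type listed in the table.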
§Hamming
| LHS | RHS | Result | Scalar | x86-64-v3 | x86-64-v4 | Neon |
|---|---|---|---|---|---|---|
| BSlice | BSlice | MV<u32> | Optimized | Optimized | Uses V3 | Optimized |
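An “Optimized”-style Hamming kernel, with no target-dependent code, can be sketched as XOR followed by popcount, processing eight bytes at a time and leaving vectorization to the compiler. This is an illustrative stand-in over plain byte slices, not the crate's kernel; `hamming` is a hypothetical name.

```rust
/// Hamming distance between two equal-length packed bit-vectors:
/// the number of bit positions in which they differ.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "bit-vectors must have the same length");
    // Load an 8-byte chunk as one 64-bit word.
    let word = |c: &[u8]| {
        let mut w = [0u8; 8];
        w.copy_from_slice(c);
        u64::from_le_bytes(w)
    };
    let (ca, cb) = (a.chunks_exact(8), b.chunks_exact(8));
    // Trailing bytes that do not fill a whole word are handled one by one.
    let tail: u32 = ca
        .remainder()
        .iter()
        .zip(cb.remainder())
        .map(|(x, y)| (x ^ y).count_ones())
        .sum();
    let body: u32 = ca
        .zip(cb)
        .map(|(x, y)| (word(x) ^ word(y)).count_ones())
        .sum();
    body + tail
}
```

Because the loop body is a simple reduction over whole words, compiling it under a newer micro-architecture target lets the compiler emit wider instructions without any change to the source, which matches the description of the “Optimized” implementation kind.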