Module distances


§Low-level functions

The methods here are meant to be primitives used by the distance functions for the various scalar-quantized-like quantizers.

As such, they typically return integer distance results since they largely operate over raw bit-slices.
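To illustrate why integer results fall out naturally from bit-slice inputs, here is a minimal sketch of a 1-bit inner product over packed bytes. The function name and the packing (8 dimensions per byte) are assumptions made for illustration; they are not the crate's API or its actual storage layout.

```rust
/// Inner product of two 1-bit vectors packed 8 dimensions per byte.
/// Hypothetical helper: the name and layout are illustrative only.
///
/// For 1-bit values, `a * b` is 1 only when both bits are set, so the
/// inner product is the population count of the bitwise AND.
fn inner_product_1bit(lhs: &[u8], rhs: &[u8]) -> Option<u32> {
    // Mismatched lengths are reported rather than silently truncated.
    if lhs.len() != rhs.len() {
        return None;
    }
    Some(
        lhs.iter()
            .zip(rhs)
            .map(|(a, b)| (a & b).count_ones())
            .sum(),
    )
}
```

The result is an exact `u32` count, with no floating-point rounding involved.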

§Micro-architecture Mapping

There are two interfaces for interacting with the distance primitives:

  • diskann_wide::arch::Target2: A micro-architecture aware interface where the target micro-architecture is provided as an explicit argument.

    This can be used in conjunction with diskann_wide::Architecture::run2 to apply the necessary target features, opting into code generation for a newer architecture even when the whole binary is compiled for an older one.

    This interface also composes with micro-architecture dispatch done higher in the call stack, and so should be preferred when incorporating these primitives into quantizer distance computations.

  • diskann_vector::PureDistanceFunction: If micro-architecture awareness is not needed, this provides a simple interface targeting diskann_wide::ARCH (the current compilation architecture).

    This interface will always yield a binary compatible with the compilation architecture target, but will not enable faster code-paths when compiling for older architectures.
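The difference between the two styles can be sketched with standard-library facilities only. Everything below (`Arch`, `COMPILED_ARCH`, `detect`, `dot_u8_with`, `dot_u8`) is a hypothetical stand-in written for this illustration; it does not reproduce the diskann_wide or diskann_vector traits, whose signatures are not shown here.

```rust
/// Hypothetical micro-architecture token (not the diskann_wide type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Arch {
    Scalar,
    V3,
}

/// Stand-in for "the current compilation architecture".
const COMPILED_ARCH: Arch =
    if cfg!(all(target_arch = "x86_64", target_feature = "avx2")) {
        Arch::V3
    } else {
        Arch::Scalar
    };

/// Portable kernel: 8-bit inner product accumulated in `u32`.
/// `zip` stops at the shorter slice; callers should pass equal lengths.
#[inline]
fn dot_u8_generic(x: &[u8], y: &[u8]) -> u32 {
    x.iter().zip(y).map(|(&a, &b)| u32::from(a) * u32::from(b)).sum()
}

/// Same source, but compiled with AVX2 enabled so the compiler may
/// vectorize more aggressively than the binary-wide baseline allows.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn dot_u8_v3(x: &[u8], y: &[u8]) -> u32 {
    dot_u8_generic(x, y)
}

/// Runtime detection, typically done once, high in the call stack.
fn detect() -> Arch {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return Arch::V3;
        }
    }
    Arch::Scalar
}

/// Style 1: the target micro-architecture is an explicit argument.
fn dot_u8_with(arch: Arch, x: &[u8], y: &[u8]) -> u32 {
    match arch {
        // The guard keeps this sketch safe even if `V3` is passed on a
        // CPU without AVX2; the detection result is cached by std.
        #[cfg(target_arch = "x86_64")]
        Arch::V3 if is_x86_feature_detected!("avx2") => unsafe { dot_u8_v3(x, y) },
        _ => dot_u8_generic(x, y),
    }
}

/// Style 2: no architecture argument; always the compilation target.
fn dot_u8(x: &[u8], y: &[u8]) -> u32 {
    dot_u8_with(COMPILED_ARCH, x, y)
}
```

A caller that has already detected the architecture can pass it down through `dot_u8_with`, which is the property that makes the explicit-argument style composable; `dot_u8` is simpler to call but can never select a code path newer than the compilation target.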

The following table summarizes the implementation status of kernels. All kernels have diskann_wide::arch::Scalar implementation fallbacks.

Implementation Kind:

  • “Fallback”: A fallback implementation using scalar indexing.

  • “Optimized”: A better implementation than “fallback” that does not contain target-dependent code, instead relying on compiler optimizations.

    Micro-architecture dispatch is still relevant as it allows the compiler to generate better code for newer machines.

  • “Yes”: Architecture specific SIMD implementation exists.

  • “No”: Architecture specific implementation does not exist; the next most-specific implementation is used. For example, if an x86-64-v3 implementation does not exist, then the “scalar” implementation will be used instead.
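The "next most-specific implementation" rule can be read as picking the highest implemented level that does not exceed the requested one. The sketch below models only that selection logic for the x86-64 lineage; `Level` and `resolve` are hypothetical names introduced here, and the crate's real dispatch mechanism may be structured differently.

```rust
/// Hypothetical ordering of x86-64 targets, least to most specific.
/// Deriving `Ord` on an enum orders variants by declaration order.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Scalar,
    V3,
    V4,
}

/// Returns the level whose kernel actually runs when `requested` is asked
/// for, given the levels that have a dedicated implementation.
/// Scalar is always available, so it is the final fallback.
fn resolve(requested: Level, implemented: &[Level]) -> Level {
    implemented
        .iter()
        .copied()
        .filter(|&level| level <= requested)
        .max()
        .unwrap_or(Level::Scalar)
}
```

With this model, a kernel that has only scalar and x86-64-v3 implementations resolves an x86-64-v4 request to `Level::V3` (the "Uses V3" entries), while a kernel with no x86-64-v3 implementation resolves a v3 request to `Level::Scalar`.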

§Inner Product

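| LHS | RHS | Result | Scalar | x86-64-v3 | x86-64-v4 | Neon |
|-----|-----|--------|--------|-----------|-----------|------|
| USlice<1> | USlice<1> | MV<u32> | Optimized | Optimized | Uses V3 | Optimized |
| USlice<2> | USlice<2> | MV<u32> | Fallback | Yes | Yes | Fallback |
| USlice<3> | USlice<3> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<4> | USlice<4> | MV<u32> | Fallback | Yes | Uses V3 | Fallback |
| USlice<5> | USlice<5> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<6> | USlice<6> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<7> | USlice<7> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<8> | USlice<8> | MV<u32> | Yes | Yes | Yes | Fallback |
| TSlice<4> | USlice<1> | MV<u32> | Optimized | Optimized | Optimized | Optimized |
| &[f32] | USlice<1> | MV<f32> | Fallback | Yes | Uses V3 | Fallback |
| &[f32] | USlice<2> | MV<f32> | Fallback | Yes | Uses V3 | Fallback |
| &[f32] | USlice<3> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<4> | MV<f32> | Fallback | Yes | Uses V3 | Fallback |
| &[f32] | USlice<5> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<6> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<7> | MV<f32> | Fallback | No | Uses V3 | Fallback |
| &[f32] | USlice<8> | MV<f32> | Fallback | No | Uses V3 | Fallback |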
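As a rough picture of what a scalar-indexing "Fallback" computes for two of these row shapes, the sketch below shows a 4-bit by 4-bit inner product with a `u32` result and an `f32` query against a 1-bit slice with an `f32` result. The function names and the packing order (low bits hold the earlier dimension) are assumptions for illustration only; the real slice types may store their data differently.

```rust
/// 4-bit x 4-bit inner product; two values per byte, low nibble first.
/// Hypothetical layout and name, for illustration only.
fn inner_product_4bit(lhs: &[u8], rhs: &[u8]) -> Option<u32> {
    if lhs.len() != rhs.len() {
        return None;
    }
    let mut acc = 0u32;
    for (&a, &b) in lhs.iter().zip(rhs) {
        // Low nibbles, then high nibbles, widened before multiplying.
        acc += u32::from(a & 0x0F) * u32::from(b & 0x0F);
        acc += u32::from(a >> 4) * u32::from(b >> 4);
    }
    Some(acc)
}

/// f32 query x 1-bit slice; bit `i % 8` of byte `i / 8` is dimension `i`.
/// Each bit is 0 or 1, so the product is the sum of the query entries
/// whose bit is set.
fn inner_product_f32_1bit(query: &[f32], bits: &[u8]) -> Option<f32> {
    if query.len() > bits.len() * 8 {
        return None;
    }
    Some(
        query
            .iter()
            .enumerate()
            .filter(|&(i, _)| ((bits[i / 8] >> (i % 8)) & 1) == 1)
            .map(|(_, &q)| q)
            .sum::<f32>(),
    )
}
```

The integer variant stays exact, while the mixed variant inherits `f32` rounding from the query, which matches the split between MV<u32> and MV<f32> result types in the table.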

§Squared L2

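| LHS | RHS | Result | Scalar | x86-64-v3 | x86-64-v4 | Neon |
|-----|-----|--------|--------|-----------|-----------|------|
| USlice<1> | USlice<1> | MV<u32> | Optimized | Optimized | Uses V3 | Optimized |
| USlice<2> | USlice<2> | MV<u32> | Fallback | Yes | Uses V3 | Fallback |
| USlice<3> | USlice<3> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<4> | USlice<4> | MV<u32> | Fallback | Yes | Uses V3 | Fallback |
| USlice<5> | USlice<5> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<6> | USlice<6> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<7> | USlice<7> | MV<u32> | Fallback | No | Uses V3 | Fallback |
| USlice<8> | USlice<8> | MV<u32> | Yes | Yes | Yes | Fallback |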
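For reference, a scalar-indexing version of squared L2 over 2-bit values might look like the following. The name `squared_l2_2bit` and the packing (four values per byte, lowest bits first) are assumptions made for this sketch rather than the crate's documented layout.

```rust
/// Squared L2 distance between two vectors of 2-bit unsigned values,
/// packed four per byte with the lowest two bits holding the first value.
/// Hypothetical layout and name, for illustration only.
fn squared_l2_2bit(lhs: &[u8], rhs: &[u8]) -> Option<u32> {
    if lhs.len() != rhs.len() {
        return None;
    }
    let mut acc = 0u32;
    for (&a, &b) in lhs.iter().zip(rhs) {
        for shift in [0u32, 2, 4, 6] {
            // Widen to a signed type so the difference cannot wrap.
            let x = i32::from((a >> shift) & 0b11);
            let y = i32::from((b >> shift) & 0b11);
            let d = x - y;
            // d * d is non-negative and at most 9, so the cast is lossless.
            acc += (d * d) as u32;
        }
    }
    Some(acc)
}
```

Because every term is a small non-negative integer, the accumulated distance is exact, which is why a `u32` result suffices for these kernels.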

§Hamming

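| LHS | RHS | Result | Scalar | x86-64-v3 | x86-64-v4 | Neon |
|-----|-----|--------|--------|-----------|-----------|------|
| BSlice | BSlice | MV<u32> | Optimized | Optimized | Uses V3 | Optimized |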
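An "Optimized" kernel in the sense used above contains no target-dependent code but is shaped so the compiler can do well with it. One plausible shape for Hamming distance, shown here purely as an assumption about the general technique and not as the crate's implementation, is to process 64-bit words and handle the tail bytes separately. The function name `hamming_words` is hypothetical.

```rust
use std::convert::TryInto;

/// Hamming distance between two packed bit vectors of equal byte length.
/// Hypothetical helper: word-at-a-time XOR plus population count.
fn hamming_words(x: &[u8], y: &[u8]) -> Option<u32> {
    if x.len() != y.len() {
        return None;
    }
    let mut x_chunks = x.chunks_exact(8);
    let mut y_chunks = y.chunks_exact(8);
    let mut acc = 0u32;

    // Main loop: 8 bytes at a time as u64 words.
    for (a, b) in x_chunks.by_ref().zip(y_chunks.by_ref()) {
        // `chunks_exact(8)` guarantees 8-byte chunks, so this cannot fail.
        let a = u64::from_le_bytes(a.try_into().unwrap());
        let b = u64::from_le_bytes(b.try_into().unwrap());
        acc += (a ^ b).count_ones();
    }

    // Tail: the up-to-7 bytes that did not fill a whole word.
    for (a, b) in x_chunks.remainder().iter().zip(y_chunks.remainder()) {
        acc += (a ^ b).count_ones();
    }
    Some(acc)
}
```

Code like this benefits from micro-architecture dispatch without any intrinsics: when compiled with newer target features enabled, `count_ones` can lower to a hardware population-count instruction and the loop may be vectorized.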