SIMD-Accelerated Distance Functions
This module provides SIMD-optimized implementations of vector distance calculations using CPU intrinsics (AVX2 on x86_64). Functions automatically dispatch to SIMD or scalar implementations based on runtime CPU feature detection.
§Architecture
- Scalar fallback: Pure Rust implementation, always available
- AVX2 path: x86_64 intrinsics with 256-bit registers (8 floats per iteration)
- Runtime dispatch: One-time CPU feature detection with cached result
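The dispatch scheme above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the names `DotFn`, `dot_scalar`, `dot_avx2`, `select_kernel`, and `dot_dispatch` are hypothetical, and the use of `std::sync::OnceLock` for caching is an assumption about one reasonable way to detect CPU features only once.

```rust
use std::sync::OnceLock;

/// Hypothetical kernel signature: two equal-length slices in, one f32 out.
type DotFn = fn(&[f32], &[f32]) -> f32;

/// Scalar fallback: pure Rust, available on every platform.
fn dot_scalar(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Stand-in for the AVX2 kernel. A real version would use 256-bit
/// intrinsics; here it simply forwards to the scalar code so the
/// sketch stays short and focuses on dispatch.
#[cfg(target_arch = "x86_64")]
fn dot_avx2(a: &[f32], b: &[f32]) -> f32 {
    dot_scalar(a, b)
}

/// Picks a kernel based on what the running CPU supports.
fn select_kernel() -> DotFn {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return dot_avx2;
        }
    }
    dot_scalar
}

/// Cached result of the one-time feature detection (assumed mechanism).
static KERNEL: OnceLock<DotFn> = OnceLock::new();

/// Runtime-dispatched entry point: detection runs on the first call only,
/// later calls reuse the cached function pointer.
pub fn dot_dispatch(a: &[f32], b: &[f32]) -> f32 {
    let kernel = *KERNEL.get_or_init(select_kernel);
    kernel(a, b)
}
```

Caching a function pointer keeps the per-call overhead to one atomic load and an indirect call, which matters when the distance function runs in a tight search loop.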
§Safety Guarantees
All unsafe blocks are contained within this module and only use:
- Unaligned loads (_mm256_loadu_ps) - no alignment requirements
- Standard SIMD intrinsics - well-defined behavior for any f32 input
- Proper remainder handling - scalar loop processes trailing elements
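To make the three points above concrete, here is a hedged sketch of what an AVX2 dot product kernel with unaligned loads and a scalar tail might look like. The function names (`dot_scalar_ref`, `dot_avx2_sketch`, `dot_checked`) are hypothetical, the sketch uses separate multiply and add instructions rather than FMA, and the real module may structure its loop and horizontal reduction differently.

```rust
/// Scalar reference used when AVX2 is unavailable.
fn dot_scalar_ref(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// AVX2 kernel sketch: 8 floats per iteration, scalar loop for the tail.
///
/// # Safety
/// The caller must ensure the CPU supports AVX2 and that
/// `a.len() == b.len()`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn dot_avx2_sketch(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;

    let n = a.len();
    let chunks = n / 8;
    let mut lanes = [0.0f32; 8];

    unsafe {
        let mut acc = _mm256_setzero_ps();
        for i in 0..chunks {
            // In bounds because i * 8 + 8 <= n; `loadu` accepts any alignment.
            let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
            let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
            acc = _mm256_add_ps(acc, _mm256_mul_ps(va, vb));
        }
        // Spill the 8 partial sums to memory for a simple horizontal add.
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    }

    let mut sum: f32 = lanes.iter().sum();
    // Remainder handling: trailing elements that do not fill a register.
    for i in chunks * 8..n {
        sum += a[i] * b[i];
    }
    sum
}

/// Safe wrapper: validates lengths, then picks a path at runtime.
pub fn dot_checked(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // Sound because AVX2 support was verified just above
            // and the lengths were checked.
            return unsafe { dot_avx2_sketch(a, b) };
        }
    }
    dot_scalar_ref(a, b)
}
```

Keeping the `unsafe` kernel private and exposing only the length-checked wrapper is one way to confine the safety obligations to a single module.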
§Performance Characteristics
§AVX2 (256-bit)
- Throughput: 8 floats per iteration
- Speedup: ~4-6x for large vectors vs scalar (depends on FMA availability)
- Latency: Similar to scalar for small vectors (< 8 elements)
§Scalar Fallback
- Throughput: 1 float per iteration
- Availability: All platforms, all CPUs
- Performance: Baseline, optimized Rust code
§Correctness
SIMD and scalar implementations produce bit-identical results for the same inputs. All operations follow IEEE 754 floating-point semantics.
§Examples
use sqlitegraph::hnsw::simd::dot_product;
let a = vec![1.0, 2.0, 3.0];
let b = vec![4.0, 5.0, 6.0];
let product = dot_product(&a, &b);
assert_eq!(product, 32.0);
§Functions
- compute_norm_squared - Runtime-dispatched squared norm computation with AVX2 acceleration
- compute_norm_squared_scalar - Scalar fallback implementation of squared norm computation
- cosine_similarity - Runtime-dispatched cosine similarity with AVX2 acceleration
- cosine_similarity_scalar - Scalar fallback implementation of cosine similarity
- dot_product - Runtime-dispatched dot product with AVX2 acceleration
- dot_product_scalar - Scalar fallback implementation of dot product
- euclidean_distance - Runtime-dispatched Euclidean (L2) distance computation
- euclidean_distance_scalar - Scalar fallback implementation of Euclidean (L2) distance
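
For orientation, the four quantities listed above can be written as plain scalar Rust along the following lines. These are reference definitions only: the `_ref` names and signatures are hypothetical and do not reproduce this module's API, and returning 0.0 from cosine similarity when either vector has zero norm is an assumption, since the source does not say how that case is handled.

```rust
/// Squared L2 norm: sum of x_i^2.
pub fn norm_squared_ref(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum()
}

/// Dot product: sum of a_i * b_i.
pub fn dot_ref(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance: sqrt of the sum of (a_i - b_i)^2.
pub fn euclidean_distance_ref(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = x - y;
            d * d
        })
        .sum::<f32>()
        .sqrt()
}

/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Assumption: returns 0.0 if either vector has zero norm.
pub fn cosine_similarity_ref(a: &[f32], b: &[f32]) -> f32 {
    let denom = norm_squared_ref(a).sqrt() * norm_squared_ref(b).sqrt();
    if denom == 0.0 {
        return 0.0;
    }
    dot_ref(a, b) / denom
}
```

Definitions like these are also handy as a test oracle when checking a SIMD path against its scalar counterpart.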