Shared SIMD-accelerated functions for posting list compression
This module provides platform-optimized implementations for common operations:
- Unpacking: Convert packed 8/16/32-bit values to u32 arrays
- Delta decoding: Prefix sum for converting deltas to absolute values
- Add one: Increment all values in an array (for TF decoding)
Supports:
- NEON on aarch64 (Apple Silicon, ARM servers)
- SSE/SSE4.1 on x86_64 (Intel/AMD)
- Scalar fallback for other architectures
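To make the three operations concrete, here is a minimal scalar sketch of their semantics (the kind of logic the scalar fallback would perform). The function names, signatures, and the explicit `base` parameter for delta decoding are illustrative assumptions, not this module's actual API.

```rust
// Illustrative scalar equivalents of the module's core operations.
// Names and signatures are assumptions for demonstration only.

/// Widen packed bytes to u32 (scalar analogue of 8-bit unpacking).
fn unpack_8bit_scalar(packed: &[u8], out: &mut [u32]) {
    for (dst, &src) in out.iter_mut().zip(packed) {
        *dst = src as u32;
    }
}

/// Prefix sum turning deltas into absolute values (scalar delta decode).
/// Whether the real API takes an initial `base` is an assumption.
fn delta_decode_scalar(values: &mut [u32], base: u32) {
    let mut acc = base;
    for v in values.iter_mut() {
        acc = acc.wrapping_add(*v);
        *v = acc;
    }
}

/// Increment every element (scalar "add one", as used for TF decoding).
fn add_one_scalar(values: &mut [u32]) {
    for v in values.iter_mut() {
        *v = v.wrapping_add(1);
    }
}

fn main() {
    let packed = [3u8, 1, 4, 1];
    let mut out = [0u32; 4];
    unpack_8bit_scalar(&packed, &mut out);
    delta_decode_scalar(&mut out, 10); // deltas relative to base 10
    assert_eq!(out, [13, 14, 18, 19]);
    add_one_scalar(&mut out);
    assert_eq!(out, [14, 15, 19, 20]);
    println!("{:?}", out);
}
```

The SIMD paths (NEON, SSE/SSE4.1) compute the same results, just over multiple lanes per instruction.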
Enums

- RoundedBitWidth - Rounded bit width type for SIMD-friendly encoding

Functions

- add_one - Add 1 to all values with SIMD acceleration
- batch_cosine_scores - Batch cosine similarity: query vs N contiguous vectors.
- batch_cosine_scores_f16 - Batch cosine similarity: f32 query vs N contiguous f16 vectors.
- batch_cosine_scores_u8 - Batch cosine similarity: f32 query vs N contiguous u8 vectors.
- batch_f32_to_f16 - Batch convert f32 slice to f16 (stored as u16)
- batch_f32_to_u8 - Batch convert f32 slice to u8 with [-1,1] → [0,255] mapping
- batch_squared_euclidean_distances - Batch compute squared Euclidean distances from one query to multiple vectors
- bits_needed - Compute the number of bits needed to represent a value
- cosine_similarity - Compute cosine similarity between two f32 vectors with SIMD acceleration
- delta_decode - Delta decode with SIMD acceleration
- dequantize_uint8 - Dequantize UInt8 weights to f32 with SIMD acceleration
- dot_product_f32 - Compute dot product of two f32 arrays with SIMD acceleration
- f16_to_f32 - Convert f16 (stored as u16) to f32
- f32_to_f16 - Convert f32 to f16 (IEEE 754 half-precision), stored as u16
- f32_to_u8_saturating - Quantize f32 in [-1, 1] to u8 [0, 255]
- max_f32 - Find maximum value in f32 array with SIMD acceleration
- pack_rounded - Pack values using rounded bit width (SIMD-friendly)
- round_bit_width - Round a bit width to the nearest SIMD-friendly width (0, 8, 16, or 32)
- squared_euclidean_distance - Compute squared Euclidean distance between two f32 vectors with SIMD acceleration
- u8_to_f32 - Dequantize u8 [0, 255] to f32 in [-1, 1]
- unpack_8bit - Unpack 8-bit packed values to u32 with SIMD acceleration
- unpack_8bit_delta_decode - Fused unpack 8-bit + delta decode in a single pass
- unpack_16bit - Unpack 16-bit packed values to u32 with SIMD acceleration
- unpack_16bit_delta_decode - Fused unpack 16-bit + delta decode in a single pass
- unpack_32bit - Unpack 32-bit packed values to u32 with SIMD acceleration
- unpack_delta_decode - Fused unpack + delta decode for arbitrary bit widths
- unpack_rounded - Unpack values using rounded bit width with SIMD acceleration
- unpack_rounded_delta_decode - Fused unpack + delta decode using rounded bit width
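The rounded-bit-width scheme described by bits_needed and round_bit_width can be sketched in scalar form as follows. This is a hypothetical reference implementation assuming rounding *up* to the next SIMD-friendly width; the crate's actual edge-case behavior (e.g. for all-zero blocks) may differ.

```rust
// Sketch of the rounded-bit-width scheme: find the minimum bits for a
// value, then widen to a lane size (0, 8, 16, or 32) that SIMD loads
// can handle without bit-level shuffling. Names are illustrative.

/// Bits required to represent `v` (0 for v == 0).
fn bits_needed_scalar(v: u32) -> u32 {
    32 - v.leading_zeros()
}

/// Round a bit width up to a SIMD-friendly width: 0, 8, 16, or 32.
fn round_bit_width_scalar(bits: u32) -> u32 {
    match bits {
        0 => 0,
        1..=8 => 8,
        9..=16 => 16,
        _ => 32,
    }
}

fn main() {
    assert_eq!(bits_needed_scalar(0), 0);
    assert_eq!(bits_needed_scalar(255), 8);   // fits in one byte
    assert_eq!(bits_needed_scalar(256), 9);   // needs a second byte
    assert_eq!(round_bit_width_scalar(9), 16);
    assert_eq!(round_bit_width_scalar(17), 32);
    println!("ok");
}
```

Trading a few wasted bits per value for byte-aligned lanes is what lets the fused unpack + delta-decode paths run as straight SIMD widening loads instead of arbitrary-width bit extraction.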