Shared SIMD-accelerated functions for posting list compression
This module provides platform-optimized implementations for common operations:
- Unpacking: Convert packed 8/16/32-bit values to u32 arrays
- Delta decoding: Prefix sum for converting deltas to absolute values
- Add one: Increment all values in an array (for TF decoding)
Supports:
- NEON on aarch64 (Apple Silicon, ARM servers)
- SSE/SSE4.1 on x86_64 (Intel/AMD)
- Scalar fallback for other architectures
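For orientation, here is a minimal scalar sketch of the two decoding passes named above. The function names and signatures are illustrative, not this module's API; the NEON/SSE paths are expected to produce the same results.

```rust
/// Scalar reference for delta decoding: a running prefix sum turns deltas
/// into absolute values in place. E.g. with `prev = 0`, [3, 2, 5] becomes
/// [3, 5, 10]. (Illustrative only; not this module's actual signature.)
fn delta_decode_scalar(values: &mut [u32], mut prev: u32) {
    for v in values.iter_mut() {
        prev = prev.wrapping_add(*v);
        *v = prev;
    }
}

/// Scalar reference for the add-one pass used when term frequencies are
/// stored as `tf - 1`. (Illustrative only.)
fn add_one_scalar(values: &mut [u32]) {
    for v in values.iter_mut() {
        *v = v.wrapping_add(1);
    }
}
```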
Enums
- RoundedBitWidth - Rounded bit width type for SIMD-friendly encoding
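As an informal illustration only (the actual definition of RoundedBitWidth is not shown on this page and may differ), the idea is that an arbitrary bit width is rounded up to 0, 8, 16, or 32 bits so every packed value stays byte-aligned and maps cleanly onto SIMD lanes:

```rust
/// Hypothetical sketch of a rounded bit-width type; the real `RoundedBitWidth`
/// may have different variants and methods.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RoundedBitWidthSketch {
    Zero,   // all values are zero; nothing needs to be stored
    Bits8,  // each value fits in one byte
    Bits16, // each value fits in two bytes
    Bits32, // each value needs four bytes
}

/// Hypothetical rounding helper mirroring the documented behavior of
/// `round_bit_width` (round up to the nearest of 0, 8, 16, or 32).
fn round_up_sketch(bits: u32) -> RoundedBitWidthSketch {
    match bits {
        0 => RoundedBitWidthSketch::Zero,
        1..=8 => RoundedBitWidthSketch::Bits8,
        9..=16 => RoundedBitWidthSketch::Bits16,
        _ => RoundedBitWidthSketch::Bits32,
    }
}
```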
Functions
- add_one - Add 1 to all values with SIMD acceleration
- bits_needed - Compute the number of bits needed to represent a value
- delta_decode - Delta decode with SIMD acceleration
- dequantize_uint8 - Dequantize UInt8 weights to f32 with SIMD acceleration
- dot_product_f32 - Compute dot product of two f32 arrays with SIMD acceleration (scalar sketch after this list)
- max_f32 - Find maximum value in f32 array with SIMD acceleration
- pack_rounded - Pack values using rounded bit width (SIMD-friendly)
- round_bit_width - Round a bit width to the nearest SIMD-friendly width (0, 8, 16, or 32)
- unpack_8bit - Unpack 8-bit packed values to u32 with SIMD acceleration
- unpack_8bit_delta_decode - Fused unpack 8-bit + delta decode in a single pass (scalar sketch after this list)
- unpack_16bit - Unpack 16-bit packed values to u32 with SIMD acceleration
- unpack_16bit_delta_decode - Fused unpack 16-bit + delta decode in a single pass
- unpack_32bit - Unpack 32-bit packed values to u32 with SIMD acceleration
- unpack_delta_decode - Fused unpack + delta decode for arbitrary bit widths
- unpack_rounded - Unpack values using rounded bit width with SIMD acceleration
- unpack_rounded_delta_decode - Fused unpack + delta decode using rounded bit width
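The fused unpack_*_delta_decode variants combine widening and prefix summing so the intermediate unpacked deltas never need to be written back to memory. A scalar sketch of the 8-bit case, with an assumed signature (the crate's real parameters, such as how the starting value is supplied, may differ):

```rust
/// Scalar sketch of a fused 8-bit unpack + delta decode: each packed byte is
/// widened to u32 and folded into a running prefix sum in a single pass.
/// (Assumed signature for illustration; not this module's actual API.)
fn unpack_8bit_delta_decode_sketch(packed: &[u8], base: u32, out: &mut Vec<u32>) {
    let mut prev = base;
    for &b in packed {
        prev = prev.wrapping_add(u32::from(b));
        out.push(prev);
    }
}
```

The 16- and 32-bit variants presumably follow the same pattern with two- and four-byte loads, and unpack_rounded_delta_decode would dispatch to one of them based on the rounded bit width.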
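For the f32 helpers, the scalar semantics are straightforward; the SIMD versions are assumed to return the same results up to the usual floating-point reassociation. Signatures and empty-slice behavior here are assumptions for illustration:

```rust
/// Scalar reference for `dot_product_f32` semantics. (Illustrative signature.)
fn dot_product_f32_sketch(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Scalar reference for `max_f32` semantics; returns negative infinity for an
/// empty slice in this sketch (the real function's behavior may differ).
fn max_f32_sketch(values: &[f32]) -> f32 {
    values.iter().copied().fold(f32::NEG_INFINITY, f32::max)
}
```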