SIMD vectorization module for high-performance batch operations.
Provides optimized implementations of common mathematical operations for processing 8 values in parallel, matching Tier 2 batch size.
§Strategy
Uses SIMD types from the wide crate for f64x2-based vectorization.
8 f64 values = 4 f64x2 vectors processed in parallel.
Hardware: SSE2 minimum; optimized for AVX on modern x86_64.
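The 8-to-4 split described above can be sketched in plain Rust. This is only an illustration of the data layout, not the module's implementation: `Lane2` is a hypothetical stand-in for the wide crate's f64x2 type, and `add_8_sketch` is a made-up name rather than the signature of `simd_add_8`.

```rust
/// Hypothetical stand-in for a two-lane f64 vector.
/// (Assumption: the real module uses the `wide` crate's f64x2 here.)
type Lane2 = [f64; 2];

/// Split 8 values into 4 two-lane chunks.
fn to_lanes(v: [f64; 8]) -> [Lane2; 4] {
    [[v[0], v[1]], [v[2], v[3]], [v[4], v[5]], [v[6], v[7]]]
}

/// Flatten 4 two-lane chunks back into 8 values.
fn from_lanes(l: [Lane2; 4]) -> [f64; 8] {
    [
        l[0][0], l[0][1], l[1][0], l[1][1],
        l[2][0], l[2][1], l[3][0], l[3][1],
    ]
}

/// Element-wise addition of 8 values, written as 4 independent
/// two-lane additions. The 4 iterations have no data dependencies
/// on each other, which is what lets the CPU overlap them.
fn add_8_sketch(a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    let (la, lb) = (to_lanes(a), to_lanes(b));
    let mut out = [[0.0; 2]; 4];
    for i in 0..4 {
        out[i] = [la[i][0] + lb[i][0], la[i][1] + lb[i][1]];
    }
    from_lanes(out)
}
```

Calling `add_8_sketch([1.0; 8], [0.5; 8])` yields eight copies of `1.5`. With a real SIMD type, each inner two-element addition would presumably be a single vector instruction.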
§Performance
- True SIMD: actual vector hardware instructions via the wide crate
- f64x2 provides good vectorization across architectures
- Expected 5-15% speedup from instruction-level parallelism (ILP) and vectorization
- Trigonometric operations: previously the primary bottleneck, now vectorized
- GPU code already vectorizes color mapping
Functions§
- simd_abs_8 - Vectorized absolute value using vector SIMD
- simd_acos_8 - Vectorized inverse cosine for 8 f64 values
- simd_add_8 - Vectorized element-wise addition using vector SIMD
- simd_asin_8 - Vectorized inverse sine for 8 f64 values
- simd_asinh_scale_8 - Vectorized asinh scale transformation (handles positive and negative data)
- simd_atan2_8 - Vectorized inverse tangent (atan2) for 8 point pairs
- simd_batch_scale_8 - Batch process 8 raw values through scaling operation with caching
- simd_batch_scale_16 - Batch scale 16 values with validity masking (for 16-pixel rendering)
- simd_clamp_8 - Vectorized clamp operation using vector SIMD
- simd_colormap_sample_8 - Vectorized colormap LUT lookup (fast palette sampling)
- simd_cos_8 - Vectorized cosine for 8 f64 values
- simd_cross_8 - Vectorized 3D cross product
- simd_dot3_8 - Vectorized 3D dot product
- simd_gamma_correct_8 - Vectorized gamma correction for 8 values
- simd_linear_scale_8 - Vectorized linear scaling (normalization to [0, 1] range)
- simd_ln_8 - Vectorized natural logarithm
- simd_log_scale_8 - Vectorized log scale transformation (for positive data)
- simd_madd_8 - Vectorized fused multiply-add: result = a * b + c using vector SIMD
- simd_matvec3_8 - Vectorized 3x3 matrix-vector multiplication (8 vectors)
- simd_mul_8 - Vectorized element-wise multiplication using vector SIMD
- simd_normalize_vec3_8 - Vectorized 3D vector normalization
- simd_plancklog_scale_8 - Vectorized PlanckLog scale transformation
- simd_pow_8 - Vectorized power function (y = x^exp)
- simd_recip_8 - Vectorized reciprocal (1/x) using vector SIMD
- simd_sin_8 - Vectorized sine for 8 f64 values
- simd_sin_cos_8 - Vectorized sine and cosine simultaneously (more efficient than separate calls)
- simd_sin_cos_16 - Vectorized sin_cos for 16 f64 values
- simd_sph_to_vec_8 - Vectorized spherical to Cartesian conversion (8 theta-phi pairs)
- simd_sqrt_8 - Vectorized square root using vector SIMD
- simd_symlog_scale_8 - Vectorized symlog scale transformation (supports negative values)
- simd_to_pixel_values - Convert SIMD linear scale results to PixelValue enum array
- simd_to_pixel_values_16 - Convert 16 SIMD scaling results to PixelValue array
- simd_vec_to_sph_8 - Vectorized Cartesian to spherical conversion (8 vectors)
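
As a scalar reference for what the two spherical conversion functions compute, the sketch below converts 8 theta-phi pairs to unit vectors and back. The signatures and the `_sketch` names are hypothetical, since this page does not show the real ones. It also assumes theta is the colatitude (0 at +z) and phi the longitude, both in radians; the module may use a different convention.

```rust
/// Scalar sketch of a spherical-to-Cartesian conversion for 8 points.
/// Assumption: theta = colatitude, phi = longitude, both in radians.
fn sph_to_vec_8_sketch(theta: [f64; 8], phi: [f64; 8]) -> [[f64; 3]; 8] {
    let mut out = [[0.0; 3]; 8];
    for i in 0..8 {
        // sin_cos computes both values in one call, the same idea
        // the simd_sin_cos_8 entry describes for the vector case.
        let (st, ct) = theta[i].sin_cos();
        let (sp, cp) = phi[i].sin_cos();
        out[i] = [st * cp, st * sp, ct];
    }
    out
}

/// Scalar sketch of the inverse conversion for 8 vectors.
/// Input vectors need not be normalized; a zero-length vector yields NaN.
fn vec_to_sph_8_sketch(v: [[f64; 3]; 8]) -> ([f64; 8], [f64; 8]) {
    let mut theta = [0.0; 8];
    let mut phi = [0.0; 8];
    for i in 0..8 {
        let [x, y, z] = v[i];
        let r = (x * x + y * y + z * z).sqrt();
        // Clamp guards against rounding pushing z / r slightly outside [-1, 1].
        theta[i] = (z / r).clamp(-1.0, 1.0).acos();
        phi[i] = y.atan2(x);
    }
    (theta, phi)
}
```

Under these assumptions, a round trip through both functions recovers the inputs to within floating-point error for theta in (0, π) and phi in (-π, π).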