SIMD-accelerated vector operations for CPU fallback paths
Provides vectorized element-wise add, multiply, scale, dot product, and
reduction operations. Architecture-specific implementations are selected
at compile time via cfg(target_arch), with a scalar fallback for
unsupported platforms.
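The compile-time selection described above could look roughly like the following. This is a minimal sketch, not the module's actual code: the names `add_f32_sketch` and `add_impl`, the slice-based signature, and the choice of SSE on x86_64 are all assumptions made for illustration. Exactly one `add_impl` is compiled for any given target, so there is no runtime dispatch cost.

```rust
// Hypothetical sketch: function names and signatures are assumptions,
// not the module's real API.

/// Public entry point: validates lengths, then calls whichever
/// `add_impl` was compiled in for the current target architecture.
pub fn add_f32_sketch(a: &[f32], b: &[f32], c: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == c.len(), "length mismatch");
    add_impl(a, b, c);
}

/// x86_64 path: processes four lanes at a time with SSE, which is part
/// of the x86_64 baseline and therefore always available on this target.
#[cfg(target_arch = "x86_64")]
fn add_impl(a: &[f32], b: &[f32], c: &mut [f32]) {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};

    let n = c.len();
    let full = n - n % 4; // largest multiple of 4 that fits
    for i in (0..full).step_by(4) {
        // SAFETY: the caller checked that all slices have length n, and
        // i + 4 <= full <= n, so every unaligned load/store is in bounds.
        unsafe {
            let va = _mm_loadu_ps(a.as_ptr().add(i));
            let vb = _mm_loadu_ps(b.as_ptr().add(i));
            _mm_storeu_ps(c.as_mut_ptr().add(i), _mm_add_ps(va, vb));
        }
    }
    // Scalar tail for the remaining 0..=3 elements.
    for i in full..n {
        c[i] = a[i] + b[i];
    }
}

/// Scalar fallback: compiled on every architecture without a
/// dedicated path in this sketch.
#[cfg(not(target_arch = "x86_64"))]
fn add_impl(a: &[f32], b: &[f32], c: &mut [f32]) {
    for i in 0..c.len() {
        c[i] = a[i] + b[i];
    }
}
```

Further architectures would follow the same pattern: one more `#[cfg(target_arch = "...")]` variant of `add_impl`, excluded from the fallback's `not(...)` condition.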
Functions
- vector_add_f32 - Element-wise addition: c[i] = a[i] + b[i]
- vector_dot_f32 - Dot product: sum(a[i] * b[i])
- vector_mul_f32 - Element-wise multiplication: c[i] = a[i] * b[i]
- vector_reduce_sum_f32 - Sum reduction: sum(a[i])
- vector_scale_f32 - Scale every element: c[i] = a[i] * scalar
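
To make the five formulas concrete, here is a set of scalar reference versions. They are a sketch only: the `_ref` names and the slice-based signatures are hypothetical and are not taken from this module, whose real signatures are not shown on this page. Functions like these are a common way to check a SIMD path against a known-good result.

```rust
// Hypothetical scalar references for the formulas listed above.
// Names and signatures are assumptions, not the module's API.

/// c[i] = a[i] + b[i]
fn add_ref(a: &[f32], b: &[f32], c: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == c.len(), "length mismatch");
    for ((ci, ai), bi) in c.iter_mut().zip(a).zip(b) {
        *ci = ai + bi;
    }
}

/// c[i] = a[i] * b[i]
fn mul_ref(a: &[f32], b: &[f32], c: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == c.len(), "length mismatch");
    for ((ci, ai), bi) in c.iter_mut().zip(a).zip(b) {
        *ci = ai * bi;
    }
}

/// c[i] = a[i] * scalar
fn scale_ref(a: &[f32], scalar: f32, c: &mut [f32]) {
    assert!(a.len() == c.len(), "length mismatch");
    for (ci, ai) in c.iter_mut().zip(a) {
        *ci = ai * scalar;
    }
}

/// sum(a[i] * b[i])
fn dot_ref(a: &[f32], b: &[f32]) -> f32 {
    assert!(a.len() == b.len(), "length mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// sum(a[i])
fn reduce_sum_ref(a: &[f32]) -> f32 {
    a.iter().sum()
}
```

When comparing a vectorized dot product or sum against references like these, an approximate comparison is usually safer than exact equality: a SIMD reduction typically accumulates in a different order than a left-to-right scalar loop, and floating-point addition is not associative.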