Function simd_axpy 

pub fn simd_axpy(c: &mut [f64], b: &[f64], scalar: f64, len: usize)

SIMD-accelerated AXPY: c[0..len] += scalar * b[0..len].

Used in the inner loop of tiled matrix multiplication, where scalar = A[i,p] and b is a row segment of B. With AVX2, each iteration processes 4 f64 elements (one 256-bit register).
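To make the call pattern concrete, here is a minimal sketch of how an AXPY with this signature could sit inside a row-major matrix multiply. It is an illustration under stated assumptions, not the crate's code: `axpy_ref` is a hypothetical scalar stand-in for `simd_axpy`, `matmul_rows` is a hypothetical driver, and tiling is omitted for brevity.

```rust
/// Hypothetical scalar stand-in with the same contract as the documented
/// function: c[0..len] += scalar * b[0..len].
fn axpy_ref(c: &mut [f64], b: &[f64], scalar: f64, len: usize) {
    for (cj, bj) in c[..len].iter_mut().zip(&b[..len]) {
        // Separate multiply, then add.
        *cj += scalar * *bj;
    }
}

/// Hypothetical driver: row-major C (m x n) += A (m x k) * B (k x n).
/// A real tiled kernel would restrict i, p and the column range to a tile.
fn matmul_rows(c: &mut [f64], a: &[f64], b: &[f64], m: usize, k: usize, n: usize) {
    for i in 0..m {
        for p in 0..k {
            // scalar = A[i,p]; b_row is row p of B; c_row is row i of C.
            let scalar = a[i * k + p];
            let b_row = &b[p * n..(p + 1) * n];
            let c_row = &mut c[i * n..(i + 1) * n];
            axpy_ref(c_row, b_row, scalar, n);
        }
    }
}
```

Ordering the loops as i, p, j means every innermost pass walks contiguous rows of B and C, which is the access pattern a vectorized AXPY needs.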

The result is deterministic: each c[j] receives the same scalar * b[j] contribution, computed with a separate multiply and add (no FMA), so the rounding matches the scalar loop.
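The following is a hedged sketch of what an AVX2 path with a separate multiply and add might look like. It is not the crate's implementation. The names `axpy_scalar`, `axpy_avx2` and `axpy_sketch` are hypothetical, and the runtime dispatch and tail handling are assumptions made for illustration.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar loop: one rounded multiply, then one rounded add per element.
fn axpy_scalar(c: &mut [f64], b: &[f64], scalar: f64) {
    for (cj, bj) in c.iter_mut().zip(b) {
        *cj += scalar * *bj;
    }
}

/// Hypothetical AVX2 path: 4 f64 lanes per iteration, mul and add kept separate.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn axpy_avx2(c: &mut [f64], b: &[f64], scalar: f64) {
    let n = c.len().min(b.len());
    let chunks = n / 4;
    // SAFETY: every load and store touches indices i*4 .. i*4+4, which stay
    // below chunks*4 <= n, so all accesses are inside both slices.
    unsafe {
        let s = _mm256_set1_pd(scalar);
        for i in 0..chunks {
            let cp = c.as_mut_ptr().add(i * 4);
            let bp = b.as_ptr().add(i * 4);
            let prod = _mm256_mul_pd(s, _mm256_loadu_pd(bp)); // rounded product
            let sum = _mm256_add_pd(_mm256_loadu_pd(cp), prod); // rounded sum
            _mm256_storeu_pd(cp, sum);
        }
    }
    // The remaining 0 to 3 elements go through the scalar loop.
    axpy_scalar(&mut c[chunks * 4..n], &b[chunks * 4..n], scalar);
}

/// Hypothetical dispatcher mirroring the documented signature.
fn axpy_sketch(c: &mut [f64], b: &[f64], scalar: f64, len: usize) {
    let (c, b) = (&mut c[..len], &b[..len]);
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just checked at runtime.
            unsafe { axpy_avx2(c, b, scalar) };
            return;
        }
    }
    axpy_scalar(c, b, scalar);
}
```

Because each lane performs the same two correctly rounded operations as the scalar loop, both paths should produce bit-identical output. A fused variant such as `f64::mul_add` rounds only once and can differ in the last bit, which is why a kernel written this way avoids FMA.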