parallel_sgemv

Function parallel_sgemv 

pub fn parallel_sgemv(
    matrix: &Array2<f32>,
    vector: &ArrayView1<'_, f32>,
) -> Array1<f32>

Parallel matrix-vector multiply via row-sharded BLAS sgemv.

See the call site in search_semantic for the full rationale. In short, Accelerate’s level-2 BLAS is single-threaded on macOS, so this function shards the matrix into chunks of consecutive rows and calls sgemv once per worker; running the chunks concurrently lets the workers together saturate aggregate memory bandwidth.
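The row-sharding idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not this crate's implementation: the matrix is a flat row-major `&[f32]` rather than an `Array2<f32>`, a plain dot-product loop stands in for the per-chunk BLAS sgemv call, and the names `gemv_chunk`, `sharded_gemv`, and `n_workers` are hypothetical.

```rust
use std::thread;

/// Hypothetical stand-in for one per-chunk sgemv call: out = rows * x,
/// where `rows` holds whole rows of length `x.len()` in row-major order.
fn gemv_chunk(rows: &[f32], x: &[f32], out: &mut [f32]) {
    for (row, y) in rows.chunks_exact(x.len()).zip(out.iter_mut()) {
        *y = row.iter().zip(x).map(|(a, b)| a * b).sum();
    }
}

/// Row-sharded matrix-vector multiply over a flat row-major matrix.
/// Assumption: one scoped thread per chunk; the real function may
/// schedule its workers differently.
fn sharded_gemv(matrix: &[f32], n_cols: usize, x: &[f32], n_workers: usize) -> Vec<f32> {
    assert!(n_cols > 0 && x.len() == n_cols && matrix.len() % n_cols == 0);
    let n_rows = matrix.len() / n_cols;
    let mut y = vec![0.0f32; n_rows];
    if n_rows == 0 {
        return y;
    }
    // Ceiling division so that every row lands in exactly one chunk.
    let workers = n_workers.max(1);
    let rows_per_chunk = (n_rows + workers - 1) / workers;
    thread::scope(|s| {
        // Each chunk of whole rows is contiguous in memory, and each worker
        // writes to its own disjoint slice of the output vector.
        let shards = matrix
            .chunks(rows_per_chunk * n_cols)
            .zip(y.chunks_mut(rows_per_chunk));
        for (rows, out) in shards {
            s.spawn(move || gemv_chunk(rows, x, out));
        }
    });
    y
}
```

Because the output is split with `chunks_mut`, the workers need no locking: each one owns the slice of results that corresponds to its rows, and the scope joins all threads before the vector is returned.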

§Panics

Panics if ndarray returns a non-contiguous slice from Array2::slice(s![start..end, ..]). A range of whole rows taken from a row-major (standard-layout) matrix is always contiguous, so for such a matrix the panic is structurally unreachable. It guards against future layout changes that would otherwise silently break correctness.