Module ops

Core SIMD-accelerated operations: dot product and matrix-vector multiply.

These are the two hottest primitives across SSM, ESN, and attention forward passes. With AVX2, each 256-bit vector instruction operates on 4 f64 values at once, giving up to ~4x throughput over the scalar fallback on aligned inner loops.

§Architecture

Public API (safe)           Internal dispatch
─────────────────           ─────────────────
simd_dot(a, b)       ──►   avx2::dot_avx2     (x86_64 + AVX2 detected)
                     └──►  dot_scalar          (fallback)

simd_mat_vec(w,x,..) ──►   avx2::mat_vec_avx2 (x86_64 + AVX2 detected)
                     └──►  mat_vec_scalar      (fallback)
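
The dispatch pattern in the diagram can be sketched roughly as follows. This is a minimal illustration, not the module's actual source: the `*_sketch` function names, the `(&[f64], &[f64]) -> f64` signature, and the choice to truncate to the shorter slice are all assumptions. Only `is_x86_feature_detected!` and the `std::arch::x86_64` intrinsics are real standard-library items.

```rust
// Hypothetical sketch of safe-wrapper + runtime-dispatch for a dot product.
// Names and signatures are assumptions made for illustration only.

/// Scalar fallback: multiply-accumulate over the common length of `a` and `b`.
fn dot_scalar_sketch(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// AVX2 path: processes 4 f64 lanes per iteration, then a scalar tail.
/// Caller must ensure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn dot_avx2_sketch(a: &[f64], b: &[f64]) -> f64 {
    use std::arch::x86_64::*;

    let n = a.len().min(b.len());
    let chunks = n / 4;
    let mut acc = _mm256_setzero_pd();
    for i in 0..chunks {
        // Unaligned loads, so the slices need no special alignment.
        let va = _mm256_loadu_pd(a.as_ptr().add(i * 4));
        let vb = _mm256_loadu_pd(b.as_ptr().add(i * 4));
        acc = _mm256_add_pd(acc, _mm256_mul_pd(va, vb));
    }
    // Horizontal sum of the 4 accumulator lanes.
    let mut lanes = [0.0f64; 4];
    _mm256_storeu_pd(lanes.as_mut_ptr(), acc);
    let mut sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    // Scalar tail for the remaining 0..=3 elements.
    for i in (chunks * 4)..n {
        sum += a[i] * b[i];
    }
    sum
}

/// Safe public entry point: picks the AVX2 path when available at runtime.
pub fn simd_dot_sketch(a: &[f64], b: &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at runtime.
            return unsafe { dot_avx2_sketch(a, b) };
        }
    }
    dot_scalar_sketch(a, b)
}
```

Keeping the `unsafe` intrinsics behind a safe wrapper means callers never need to reason about CPU features themselves. Note that the vectorized path sums in a different order than the scalar one, so results may differ in the last few bits for general inputs.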

Functions§

simd_dot
SIMD-accelerated dot product with runtime feature detection.
simd_mat_vec
SIMD-accelerated matrix-vector multiply with runtime feature detection.
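
For orientation, a matrix-vector multiply reduces to one dot product per matrix row, which is why the two primitives share the same inner loop. The scalar sketch below assumes a row-major `w` with explicit `rows` and `cols` arguments and an output slice. The real `simd_mat_vec` parameter list is not shown on this page, so the signature and the name `mat_vec_sketch` are hypothetical.

```rust
/// Hypothetical scalar mat-vec: computes `out = W * x`.
/// Assumes `w` is row-major with `rows * cols` entries (an assumption,
/// not necessarily the layout the real function uses).
fn mat_vec_sketch(w: &[f64], x: &[f64], rows: usize, cols: usize, out: &mut [f64]) {
    assert_eq!(w.len(), rows * cols, "w must hold rows * cols entries");
    assert_eq!(x.len(), cols, "x must have one entry per column");
    assert_eq!(out.len(), rows, "out must have one entry per row");

    if cols == 0 {
        // An empty inner dimension yields all-zero output.
        out.fill(0.0);
        return;
    }

    // Each output element is the dot product of one row of W with x.
    for (row, o) in w.chunks_exact(cols).zip(out.iter_mut()) {
        *o = row.iter().zip(x).map(|(a, b)| a * b).sum();
    }
}
```

In a SIMD variant, the per-row closure would be the natural place to call a vectorized dot product instead of the scalar iterator chain.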