Core SIMD-accelerated operations: dot product, matrix-vector multiply, and activation functions.
These are the hottest primitives across SSM, ESN, and attention/neural
forward passes. AVX2's 256-bit registers hold four f64 lanes, so each
vector instruction operates on four values at once, giving up to ~4x
throughput over scalar code on aligned inner loops.
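As a minimal sketch of how such a four-lane kernel and its runtime dispatch fit together (names like `dot_avx2_sketch` are illustrative, not this crate's actual internals), assuming `&[f64]` inputs:

```rust
// Illustrative AVX2 dot-product kernel: each 256-bit register holds
// four f64 lanes, so the main loop consumes four elements per vector
// multiply-add; a scalar tail handles the remaining len % 4 elements.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn dot_avx2_sketch(a: &[f64], b: &[f64]) -> f64 {
    use core::arch::x86_64::*;
    let chunks = a.len() / 4;
    let mut acc = unsafe { _mm256_setzero_pd() };
    for i in 0..chunks {
        unsafe {
            let va = _mm256_loadu_pd(a.as_ptr().add(i * 4));
            let vb = _mm256_loadu_pd(b.as_ptr().add(i * 4));
            acc = _mm256_add_pd(acc, _mm256_mul_pd(va, vb));
        }
    }
    // Horizontal sum of the four accumulator lanes.
    let mut lanes = [0.0f64; 4];
    unsafe { _mm256_storeu_pd(lanes.as_mut_ptr(), acc) };
    let mut sum: f64 = lanes.iter().sum();
    // Scalar tail for lengths not divisible by 4.
    for i in chunks * 4..a.len() {
        sum += a[i] * b[i];
    }
    sum
}

// Safe entry point: runtime feature detection with a portable fallback,
// mirroring the dispatch pattern described in this module.
fn simd_dot_sketch(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return unsafe { dot_avx2_sketch(a, b) };
        }
    }
    // Portable scalar fallback.
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

The unsafe kernel is never exposed directly: callers only see the safe wrapper, which guarantees the AVX2 path runs solely on CPUs where the feature was detected at runtime.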
§Architecture
Public API (safe) Internal dispatch
───────────────── ─────────────────
simd_dot(a, b) ──► avx2::dot_avx2 (x86_64 + AVX2 detected)
└──► dot_scalar (fallback)
simd_mat_vec(w,x,..) ──► avx2::mat_vec_avx2 (x86_64 + AVX2 detected)
└──► mat_vec_scalar (fallback)
simd_tanh(in, out) ──► avx2::tanh_avx2 (x86_64 + AVX2, Padé [2,2])
└──► tanh_scalar (fallback)
simd_exp(in, out) ──► avx2::exp_avx2 (x86_64 + AVX2, range-reduced deg-5)
└──► exp_scalar (fallback)
simd_sigmoid(in, out) ──► avx2::sigmoid_avx2 (x86_64 + AVX2, via exp)
└──► sigmoid_scalar (fallback)
simd_silu(in, out) ──► avx2::silu_avx2 (x86_64 + AVX2, via sigmoid)
└──► silu_scalar (fallback)

§Functions

- simd_dot - SIMD-accelerated dot product with runtime feature detection.
- simd_exp - SIMD-accelerated element-wise exp with runtime feature detection.
- simd_mat_vec - SIMD-accelerated matrix-vector multiply with runtime feature detection.
- simd_sigmoid - SIMD-accelerated element-wise sigmoid with runtime feature detection.
- simd_silu - SIMD-accelerated element-wise SiLU (Sigmoid Linear Unit) with runtime feature detection.
- simd_tanh - SIMD-accelerated element-wise tanh with runtime feature detection.
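The "Padé [2,2]" noted in the architecture diagram refers to a rational approximation of tanh in powers of x². As an illustration only (the crate's actual kernel and coefficients may differ), one classic Padé approximant of this form is:

```rust
// Illustrative Padé-style tanh approximation (not this crate's actual
// coefficients): tanh(x) ≈ x(945 + 105x² + x⁴) / (945 + 420x² + 15x⁴),
// a rational function of degree [2,2] in x². Rational forms like this
// drift above 1 in magnitude for large |x|, so the result is clamped;
// a production kernel would typically use a threshold instead.
fn tanh_pade_sketch(x: f64) -> f64 {
    let x2 = x * x;
    let num = x * (945.0 + x2 * (105.0 + x2));
    let den = 945.0 + x2 * (420.0 + 15.0 * x2);
    (num / den).clamp(-1.0, 1.0)
}
```

Rational approximations are attractive in SIMD code because they need only multiplies, adds, and one divide per lane, with no branches, so the same formula vectorizes directly across the four f64 lanes of an AVX2 register.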