simd_vector 0.1.0

SIMD vector types (Vec4, Vec8) for x86-64 in pure stable Rust — SSE4.1, AVX2, FMA3

simd_vector

SIMD vector types for x86-64 in pure stable Rust.

  • Vec4 — 4×f32, SSE4.1 + FMA3
  • Vec8 — 8×f32, AVX2 + FMA3

Features

  • Arithmetic: +, -, *, / (vec×vec, vec×f32, f32×vec)
  • splat, abs, neg, sqrt, floor, dot, mul_add (FMA)
  • Sum, Index, From/Into array, Clone, Copy, Debug, PartialEq
  • Transcendentals: sin, cos, exp — ported from SLEEF
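The operator combinations listed above (vec×vec, vec×f32, f32×vec, plus Index and array conversions) can be illustrated with a scalar stand-in. This is only a sketch: `V4` is a hypothetical type written for this example, it uses plain array arithmetic rather than SIMD intrinsics, and it is not the crate's `Vec4` implementation.

```rust
use std::ops::{Add, Index, Mul};

/// Hypothetical scalar stand-in for a 4-lane vector (not the crate's Vec4).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct V4(pub [f32; 4]);

impl V4 {
    pub fn splat(v: f32) -> Self {
        V4([v; 4])
    }
    /// Lane-wise product followed by a horizontal sum.
    pub fn dot(self, rhs: Self) -> f32 {
        (self * rhs).0.iter().sum()
    }
}

impl Add for V4 {
    type Output = V4;
    fn add(self, rhs: V4) -> V4 {
        V4(std::array::from_fn(|i| self.0[i] + rhs.0[i]))
    }
}

impl Mul for V4 {
    type Output = V4;
    fn mul(self, rhs: V4) -> V4 {
        V4(std::array::from_fn(|i| self.0[i] * rhs.0[i]))
    }
}

// vec × f32: broadcast the scalar, then reuse the lane-wise product.
impl Mul<f32> for V4 {
    type Output = V4;
    fn mul(self, s: f32) -> V4 {
        self * V4::splat(s)
    }
}

// f32 × vec: a separate impl on f32 is needed for the scalar-on-the-left form.
impl Mul<V4> for f32 {
    type Output = V4;
    fn mul(self, v: V4) -> V4 {
        v * self
    }
}

impl Index<usize> for V4 {
    type Output = f32;
    fn index(&self, i: usize) -> &f32 {
        &self.0[i]
    }
}

impl From<[f32; 4]> for V4 {
    fn from(a: [f32; 4]) -> V4 {
        V4(a)
    }
}

impl From<V4> for [f32; 4] {
    fn from(v: V4) -> [f32; 4] {
        v.0
    }
}
```

The point of the sketch is the `impl Mul<V4> for f32` block: without it, `2.0 * v` would not compile even though `v * 2.0` does.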

Transcendental accuracy

sin and cos use SLEEF's xsinf_u1 / xcosf_u1 algorithms (the 1.0-ULP variants). exp uses SLEEF's xexpf.

Function  Algorithm                                            ULP error  Range
sin       Cody-Waite + Payne-Hanek + double-float polynomial   ≤ 1.0      all finite f32
cos       Cody-Waite + Payne-Hanek + double-float polynomial   ≤ 1.0      all finite f32
exp       ln(2) range reduction + degree-6 polynomial + ldexp  ≤ 1.0      all finite f32
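The Cody-Waite step named in the table can be sketched in scalar Rust. The idea is to split pi into several parts with few significant bits, so that multiplying each part by the integer quotient is exact and the subtraction loses little precision. The sketch below is an assumption-laden illustration: `split_pi` and `reduce_mod_pi` are hypothetical names, the three constants are derived at runtime from the f64 value of pi and are not SLEEF's, and the reduction is only meant for moderate arguments where the quotient stays below 64.

```rust
use std::f64::consts::PI;

/// Splits pi into three f32 parts. The first two keep only 12 significant
/// bits, so multiplying them by a quotient below 64 is exact in f32.
/// Illustrative constants only; these are not SLEEF's values.
pub fn split_pi() -> (f32, f32, f32) {
    // Round to f32, then clear the low 12 fraction bits.
    let chop = |v: f64| f32::from_bits((v as f32).to_bits() & 0xFFFF_F000);
    let hi = chop(PI);
    let mid = chop(PI - hi as f64);
    let lo = (PI - hi as f64 - mid as f64) as f32;
    (hi, mid, lo)
}

/// Returns (q, r) such that x is approximately q*pi + r, with r near [-pi/2, pi/2].
pub fn reduce_mod_pi(x: f32) -> (i32, f32) {
    let (hi, mid, lo) = split_pi();
    let q = (x * std::f32::consts::FRAC_1_PI).round();
    // Subtract q*pi one part at a time, largest part first.
    let r = q.mul_add(-hi, x);
    let r = q.mul_add(-mid, r);
    let r = q.mul_add(-lo, r);
    (q as i32, r)
}
```

For example, `reduce_mod_pi(100.0)` yields a quotient of 32 and a remainder close to 100 - 32*pi. For much larger arguments the quotient no longer fits in the spare bits, which is why a separate large-argument path is needed.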

Key implementation details:

  • Range reduction for sin/cos: Cody-Waite (3-constant) for |x| < 125, Payne-Hanek table-based (rempif) for larger arguments
  • Double-float arithmetic: FMA-based error-free transformations for high precision in the polynomial evaluation
  • Edge cases: NaN/Inf propagation, sin(-0) = -0, exp(-inf) = 0, exp(inf) = inf
  • Polynomial coefficients, constants (PI_A2f, PI_B2f, PI_C2f, L2Uf, L2Lf, etc.), and the 416-entry Sleef_rempitabsp table are taken directly from SLEEF
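The FMA-based error-free transformation mentioned above can be sketched with the standard library's `f32::mul_add`. The function names `two_prod` and `two_sum` follow common usage in the literature and are assumptions here, not identifiers from this crate; `two_sum` is the usual non-FMA companion, included only to show how a double-float pair is built for addition.

```rust
/// Product a*b as an unevaluated pair (hi, lo) with hi + lo == a*b,
/// provided no overflow or underflow occurs.
pub fn two_prod(a: f32, b: f32) -> (f32, f32) {
    let hi = a * b;
    // The fused multiply-add computes a*b - hi with a single rounding,
    // which recovers the rounding error of the plain product.
    let lo = a.mul_add(b, -hi);
    (hi, lo)
}

/// Sum a+b as an unevaluated pair (hi, lo) with hi + lo == a + b
/// (Knuth's branch-free version, no FMA required).
pub fn two_sum(a: f32, b: f32) -> (f32, f32) {
    let hi = a + b;
    let b_virtual = hi - a;
    let lo = (a - (hi - b_virtual)) + (b - b_virtual);
    (hi, lo)
}
```

With `a = b = 1.0 + 2^-12`, the plain f32 product drops a `2^-24` term; `two_prod` returns that term as `lo`, so the pair carries roughly twice the precision of a single f32.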

Required CPU features

The crate requires SSE4.1, AVX2, and FMA3. They are enabled globally via .cargo/config.toml:

[build]
rustflags = ["-C", "target-feature=+sse4.1,+avx2,+fma"]
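Because these features are enabled at compile time, a binary built this way will fault on a CPU that lacks them. A caller that wants to check the host first could use the standard library's runtime detection, sketched below. `missing_cpu_features` and `REQUIRED_FEATURES` are hypothetical names for this example, not part of the crate. Note that `is_x86_feature_detected!` reports true for any feature already enabled at compile time, so the check is only meaningful in code built without the flags above (for example a separate launcher).

```rust
/// CPU features this README lists as required.
pub const REQUIRED_FEATURES: [&str; 3] = ["sse4.1", "avx2", "fma"];

/// Returns the required features that the running CPU does not report.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn missing_cpu_features() -> Vec<&'static str> {
    let mut missing = Vec::new();
    // The macro only accepts string literals, so each feature is tested explicitly.
    if !std::arch::is_x86_feature_detected!("sse4.1") {
        missing.push("sse4.1");
    }
    if !std::arch::is_x86_feature_detected!("avx2") {
        missing.push("avx2");
    }
    if !std::arch::is_x86_feature_detected!("fma") {
        missing.push("fma");
    }
    missing
}

/// On other architectures none of the features exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn missing_cpu_features() -> Vec<&'static str> {
    REQUIRED_FEATURES.to_vec()
}
```

An empty result means the host reports all three features.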

Usage

use simd_vector::{Vec4, Vec8};

let a = Vec4([1.0, 2.0, 3.0, 4.0]);
let b = Vec4([5.0, 6.0, 7.0, 8.0]);
let c = a + b;              // [6.0, 8.0, 10.0, 12.0]
let d = a.dot(b);           // 70.0
let s = a.sin();            // per-lane sin
let e = Vec8::splat(1.0).exp(); // [e, e, e, e, e, e, e, e]

Tests

The suite has 189 tests covering all operations, edge cases (NaN, Inf, -0.0, subnormals), and ULP sweeps that verify accuracy over the full range.

cargo test
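As a rough illustration of what a ULP sweep and an edge-case check involve, the sketch below measures the distance in representable f32 values between a result and an f64 reference rounded to f32. All names here (`ulp_distance`, `max_ulp_over`, `matches_special`) are hypothetical and are not taken from the crate's test suite. Comparing against a rounded reference reports whole ULPs only, so it can understate the true error by up to half a ULP.

```rust
/// Maps an f32 onto a signed integer line where adjacent floats differ by 1
/// and +0.0 and -0.0 share the same position.
fn ordered_bits(x: f32) -> i64 {
    let bits = x.to_bits() as i32;
    // Negative floats have the sign bit set; mirror them below zero.
    let ordered = if bits < 0 { i32::MIN.wrapping_sub(bits) } else { bits };
    ordered as i64
}

/// Number of representable f32 steps between a and b, or None if either is NaN.
pub fn ulp_distance(a: f32, b: f32) -> Option<u64> {
    if a.is_nan() || b.is_nan() {
        return None;
    }
    Some((ordered_bits(a) - ordered_bits(b)).unsigned_abs())
}

/// Largest ULP distance between f(x) and the f64 reference rounded to f32,
/// over every `step`-th bit pattern in first_bits..=last_bits (step must be > 0).
pub fn max_ulp_over(
    f: impl Fn(f32) -> f32,
    reference: impl Fn(f64) -> f64,
    first_bits: u32,
    last_bits: u32,
    step: usize,
) -> u64 {
    (first_bits..=last_bits)
        .step_by(step)
        .map(f32::from_bits)
        .filter_map(|x| ulp_distance(f(x), reference(x as f64) as f32))
        .max()
        .unwrap_or(0)
}

/// Edge-case comparison: NaN matches NaN, and -0.0 is distinct from +0.0.
pub fn matches_special(got: f32, want: f32) -> bool {
    if want.is_nan() {
        got.is_nan()
    } else {
        got.to_bits() == want.to_bits()
    }
}
```

A sweep would call `max_ulp_over` with the function under test and its f64 counterpart, while `matches_special` covers cases such as sin(-0) = -0 that an ordinary `==` comparison cannot distinguish.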

Coverage (via cargo llvm-cov):

Filename   Regions  Missed  Cover   Functions  Missed  Cover   Lines  Missed  Cover
lib.rs     142      0       100.00% 6          0       100.00% 61     0       100.00%
vec4.rs    892      0       100.00% 48         0       100.00% 437    0       100.00%
vec8.rs    915      0       100.00% 48         0       100.00% 441    0       100.00%
TOTAL      1949     0       100.00% 102        0       100.00% 939    0       100.00%

License

MIT